Can't help feeling the putative world understanding won't transfer to reasoning.
Purely vibes-based opinion tho.
Even the 1M-context LLMs are amazing boilerplate generators, summarizers, and fact extractors, but they're still such poor reasoners.
With training on web-scale data, mimicry can look quite impressive?
If even a fraction of this stuff transfers to reasoning, that's gonna complicate the picture. And I can't see why some transfer should be far-fetched.