The paper behind Kosmos: an AI scientist that runs long, parallel research cycles to autonomously find and verify discoveries. One run can coordinate 200 agents, write 42,000 lines of code, and scan 1,500 papers. A shared world model stores facts, results, and plans so agents stay in sync. Given a goal and a dataset, it runs analyses and literature searches in parallel, updates that model, proposes the next tasks, and repeats until it writes a report with traceable claims. Experts judged 79.4% of its statements accurate and estimated that 20 cycles equal about 6 months of human work. Across 7 case studies, it reproduced unpublished results, added causal genetic evidence, proposed a disease-timing breakpoint method, and flagged a neuron-aging mechanism. It needs clean, well-labeled data, can overstate interpretations, and still requires human review. Net effect: it scales data-driven discovery with clear provenance and consistent context across fields. ---- Paper: arxiv.org/abs/2511.02824 Paper Title: "Kosmos: An AI Scientist for Autonomous Discovery"
📈 Edison Scientific launched Kosmos, an autonomous AI researcher that reads literature, writes and runs code, and tests ideas, compressing about 6 months of human research into roughly 1 day. Kosmos uses a structured world model as shared memory that links every agent's findings, keeping work aligned to a single objective across tens of millions of tokens. A run reads 1,500 papers, executes 42,000 lines of analysis code, and produces a fully auditable report in which every claim is traceable to code or literature. Evaluators found 79.4% of its conclusions accurate; it reproduced 3 prior human findings, including absolute humidity as the key factor in perovskite solar cell efficiency and cross-species neuronal connectivity rules, and it proposed 4 new leads, including evidence that SOD2 may lower cardiac fibrosis in humans. Access is through Edison's platform at $200/run, with limited free use for academics. Caveats: runs can chase statistically neat but irrelevant signals, longer runs raise this risk, and teams often launch multiple runs to explore different paths. Beta users estimated 6.14 months of equivalent effort for 20-cycle runs, and a simple model based on reading and analysis time predicts about 4.1 months, suggesting output scales with run depth rather than hitting a fixed ceiling.
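The cycle described above, parallel agents writing findings into a shared world model that then proposes the next round of tasks, can be sketched in a few lines. This is a minimal illustrative sketch, not Kosmos's actual implementation: all class names, the stub agents, and the `propose_tasks` policy are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    agent_id: int
    claim: str
    provenance: str  # pointer to the code or literature backing the claim

@dataclass
class WorldModel:
    goal: str
    findings: list = field(default_factory=list)
    plan: list = field(default_factory=list)

    def update(self, new_findings):
        # Merge results from all agents so every agent sees one shared state.
        self.findings.extend(new_findings)

    def propose_tasks(self):
        # Placeholder policy: derive one follow-up task per recent finding.
        return [f"verify: {f.claim}" for f in self.findings[-3:]]

def run_cycle(model, n_agents=4):
    # Each "agent" here is a stub standing in for a literature-search or
    # data-analysis worker; real agents would read papers or execute code.
    results = [
        Finding(agent_id=i, claim=f"observation {i}", provenance=f"analysis_{i}.py")
        for i in range(n_agents)
    ]
    model.update(results)
    model.plan = model.propose_tasks()

model = WorldModel(goal="explain dataset X")
for cycle in range(3):  # the post mentions runs of up to ~20 cycles
    run_cycle(model)

# Every claim carries provenance, which is what makes the report auditable.
assert all(f.provenance for f in model.findings)
```

The point of the sketch is the data-flow: agents never talk to each other directly; they coordinate only through the shared, evolving world model, which is what keeps context consistent across a long run.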

Nov 7, 2025 · 2:18 AM UTC

Replying to @rohanpaul_ai
Wow, Rohan, that's some serious AI power! Imagine the possibilities, no?
Replying to @rohanpaul_ai
It’s a concrete example of how world models aren’t just abstract AI concepts but practical tools for accelerating scientific insight, especially in domains where integrating literature, experiments, and causal reasoning is key.
Replying to @rohanpaul_ai
Really impressive. This shows the power of a shared world model in coordinating complex reasoning across multiple agents. By maintaining a centralized, evolving representation of facts, results, and plans, Kosmos can scale discovery while keeping context consistent and traceable.
Replying to @rohanpaul_ai
6 months to 1 day is not optimization, it's an exponential singularity. The real lesson: 200 agents are useless noise without a coherent, shared world model. That's the only path to AGI-scale discovery.
Replying to @rohanpaul_ai
Interesting there is no reference to Virtuous Machines: Towards Artificial General Science