I found my balance by courting insanity (8)

Rio de Janeiro, Brazil
Joined July 2009
"The dreams in which I'm dying are the best I've ever had. I find it hard to tell you, I find it hard to take. When people run in circles it's a very very mad world"
Some things in life really don’t make any sense…
Mayk Caldas retweeted
Our older agents, like Crow, Phoenix, and HasAnyone, are still available on our platform as Literature, Molecules, and Precedent, respectively, for 1-2 credits per run. We will be launching more powerful versions soon! (Falcon, our deep research agent, has merged with Crow.)
Mayk Caldas retweeted
Kosmos, our newest AI Scientist, is available to use today on our platform. Watch here as three of our scientists describe what Kosmos is, and how it can accelerate scientific research.
Mayk Caldas retweeted
Edison Scientific is launching today, with our launch of Kosmos, the most powerful AI Scientist released yet. We are a spinout from FutureHouse, focused on building and commercializing AI agents for science. Try our newest agents, including Kosmos, on our platform today.
Mayk Caldas retweeted
More on Kosmos from some of the team behind it here. And check out the technical report: edisonscientific.com/kosmos-…
Mayk Caldas retweeted
Today, we’re announcing Kosmos, our newest AI Scientist, available to use now. Users estimate Kosmos does 6 months of work in a single day. One run can read 1,500 papers and write 42,000 lines of code. At least 79% of its findings are reproducible. Kosmos has made 7 discoveries so far, which we are releasing today, in areas ranging from neuroscience to materials science and clinical genetics, in collaboration with our academic beta testers. Three of these discoveries reproduced unpublished findings; four are net new, validated contributions to the scientific literature. AI-accelerated science is here.

Our core innovation in Kosmos is the use of a structured, continuously updated world model. As described in our technical report, Kosmos’ world model allows it to process orders of magnitude more information than could fit into the context of even the longest-context language models, allowing it to synthesize more information and pursue coherent goals over longer time horizons than Robin or any of our other prior agents. In this respect, we believe Kosmos is the most compute-intensive language agent released so far in any field, and by far the most capable AI Scientist available today. The use of a persistent world model also enables single Kosmos trajectories to produce highly complex outputs that require multiple significant logical leaps.

As with all of our systems, Kosmos is designed with transparency and verifiability in mind: every conclusion in a Kosmos report can be traced through our platform to the specific lines of code or the specific passages in the scientific literature that inspired it, ensuring that Kosmos’ findings are fully auditable at all times.

We are also using this opportunity to announce the launch of Edison Scientific, a new commercial spinout of FutureHouse, which will be focused on commercializing our agents and applying them to automate scientific research in drug discovery and beyond. Edison will be taking over management of the FutureHouse platform, where you can access Kosmos alongside our Literature, Molecules, and Precedent agents (previously Crow, Phoenix, and Owl). Edison will continue to offer free-tier usage for casual users and academics, while also offering higher rate limits and additional features for users who need them. You can read more about this spinout on our blog, below.

A few important notes if you’re going to try Kosmos. First, Kosmos is different from many other AI tools you might have played with, including our other agents. It is more similar to a Deep Research tool than to a chatbot: it takes some time to figure out how to prompt it effectively, and we have tried to include guidelines on this to help (see below). It costs $200/run right now (200 credits per run, at $1/credit), with some free-tier usage for academics. This is heavily discounted; people who sign up for Founding Subscriptions now can lock in the $1/credit price indefinitely, but the price will probably end up higher. Again, this is less a chatbot and more a research tool, something you run on high-value targets as needed.

Some caveats are also warranted. First, we find that 80% of Kosmos findings are reproducible, which also means 20% are not: some things it says will be wrong. Also, Kosmos certainly does produce outputs that are the equivalent of several months of human labor, but it also often goes down rabbit holes or chases statistically significant yet scientifically irrelevant findings. We often run Kosmos multiple times on the same objective in order to sample the various research avenues it can take. There are still a number of rough edges in the UI and elsewhere, which we are working on. Finally, we are aware that the 6-month figure is much greater than estimates by other AI labs, like METR, of the length of tasks that AI agents can currently perform. You can read a discussion of this in our blog post.

Huge congratulations to our team that put this together, led by @ludomitch and @michaelathinks: Angela Yiu, @benjamin0chang, @sidn137, Edwin Melville-Green, Albert Bou, @arvissulovari, Oz Wassie, and @jonmlaurent. A particular shout-out to @m_skarlinski and his team, who rebuilt the platform for this launch, especially Andy Cai @notAndyCai, Richard Magness, Remo Storni, Tyler Nadolski @_tnadolski, Mayk Caldas @maykcaldas, Sam Cox @samcox822, and more. This work would not have been possible without significant contributions from academic collaborators @mathieubourdenx, @EricLandsness, @bdanubius, @physicistnevans, Tonio Buonassisi, @BGomes_1905, Shriya Reddy, @marthafoiani, and @RandallBateman3. We also want to thank our numerous supporters, especially @ericschmidt, who has been a tremendous ally. We will have more to say about our supporters soon!
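For intuition only, here is a minimal sketch of what a structured, continuously updated world model can look like in code. This is not the Kosmos implementation; every class and method name below is a hypothetical illustration of the idea that findings and their provenance persist outside the model's context, and that only a compact, relevant slice is fed back in at each step.

```python
# Hypothetical illustration only: not the Kosmos implementation. It sketches
# the idea of a persistent world model that stores findings with provenance
# outside the LLM context, and hands back only a small relevant slice per step.
from dataclasses import dataclass, field


@dataclass
class Finding:
    claim: str            # a conclusion the agent has drawn
    sources: list[str]    # provenance: code lines or literature passages


@dataclass
class WorldModel:
    findings: list[Finding] = field(default_factory=list)

    def update(self, claim: str, sources: list[str]) -> None:
        """Persist a new finding together with what supports it."""
        self.findings.append(Finding(claim, sources))

    def context_for(self, query: str, limit: int = 5) -> str:
        """Return a compact, query-relevant slice of the world model to put
        back into the agent's prompt, instead of its full history."""
        hits = [f for f in self.findings if query.lower() in f.claim.lower()]
        return "\n".join(f"{f.claim} [{'; '.join(f.sources)}]" for f in hits[:limit])


wm = WorldModel()
wm.update("Candidate gene is upregulated in the disease cohort",
          sources=["analysis.py:L120-L140", "paper_03, Results"])
print(wm.context_for("upregulated"))
```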
Mayk Caldas retweeted
Our platform is live again!
The FutureHouse platform has been down since Friday afternoon and will likely be down until Tuesday, due to an upstream service provider outage. We have an update banner on the platform page and will have more to share soon.
Mayk Caldas retweeted
Reach out if you'd like to join our amazing platform team!!
We are looking to hire an outstanding UI/UX designer with strong front-end engineering skills to reimagine how researchers can make discoveries in collaboration with AI. If you have these skills and want to help AI accelerate science, get in touch.
Mayk Caldas retweeted
Thanks @BiopharmaTrend for highlighting our chemistry scientific reasoning model ether0!
Mayk Caldas retweeted
Today we are releasing ether0, our first scientific reasoning model. We trained Mistral 24B with RL on several molecular design tasks in chemistry. Remarkably, we found that LLMs can learn some scientific tasks much more data-efficiently than specialized models trained from scratch on the same data, and can greatly outperform frontier models and humans on those tasks. For at least a subset of scientific classification, regression, and generation problems, post-training LLMs may provide a much more data-efficient approach than traditional machine learning. 1/n
Ship, ship, ship!
At FutureHouse, we’ve noticed that scientific agents are good at applying average intelligence across tasks. They always seem to make the obvious choices, which is good, but discovery sometimes requires more intuition and insight than average. Today we’ve taken the first step towards superhuman insight by training a reasoning model for a specific domain of science: designing drug-like molecules.

We’re releasing a 24B open-weights reasoning model called 𝚎𝚝𝚑𝚎𝚛𝟶. 𝚎𝚝𝚑𝚎𝚛𝟶 has been trained with reinforcement learning to exceed frontier models and human experts across a range of molecular design tasks. 𝚎𝚝𝚑𝚎𝚛𝟶 takes in natural language, reasons in English, and outputs a new molecule. 𝚎𝚝𝚑𝚎𝚛𝟶 is now a tool for our chemistry design agent, Phoenix, which can call upon it to design molecules.

Training a reasoning model for a scientific domain like chemistry, rather than math or programming, required a number of small technical advances. For example, we developed an iterative method of splitting off specialist models and aggregating their reasoning traces. Another example: we used LLMs to rewrite questions that were only partially solved.

A major finding from this work is that we can train with >10x efficiency per experimental measurement when using a reasoning model rather than fine-tuning. We also found that reasoning models can learn new tasks that were developed specifically for this paper and are not in pretraining corpora. We even saw a task sit at 0% performance until 100 steps into RL, at which point it was solved once at random. This, along with our change in modality from natural language to molecules, bodes well for applying reasoning models far from natural language.

Reasoning models in science are the future. Scientific tasks come with naturally verifiable rewards: the physical world is the ultimate arbiter of accuracy, rather than human contractors. The data-efficiency gain and the ability to exceed frontier models with relatively few parameters and little compute mean that we should expect more scientific reasoning models soon.

Congrats to the team: @SidN137, James, @Ryan__Rhys, Albert, @GWellawatte, @maykcaldas, @ludomitch, and @SGRodriques. Thanks to @VoltagePark, @nvidia, and @huggingface for supporting us, and huge thanks to @ericschmidt for funding @FutureHouseSF. The model weights, reward model, and new benchmark are open source. You can also read more about scientific reasoning models in our exclusive with Nature.
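To make the "verifiable rewards" point concrete, here is a minimal sketch of what a programmatically checkable reward for molecular design could look like. This is not the actual 𝚎𝚝𝚑𝚎𝚛𝟶 reward function; the property, threshold, and function names are assumptions for illustration, and it assumes RDKit is installed.

```python
# Illustrative sketch only: NOT the ether0 reward. It shows the idea of a
# "verifiable reward" for molecular design, where the model's answer can be
# checked programmatically instead of being graded by humans.
from rdkit import Chem
from rdkit.Chem import Descriptors


def reward(smiles: str, target_mw: float, tolerance: float = 10.0) -> float:
    """Return 1.0 if `smiles` parses to a valid molecule whose molecular
    weight is within `tolerance` of `target_mw`, else 0.0."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # invalid SMILES -> no reward
        return 0.0
    return 1.0 if abs(Descriptors.MolWt(mol) - target_mw) <= tolerance else 0.0


# Example: a completion that proposes aspirin (MW ~180.16) for a ~180 Da target
print(reward("CC(=O)Oc1ccccc1C(=O)O", target_mw=180.0))  # -> 1.0
```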
Mayk Caldas retweeted
The FutureHouse platform now has a public documentation repository for raising issues and sharing demos. Please open any issues you find with our API client here! (link in reply)
Mayk Caldas retweeted
A project we started a long time ago with @MichaelPieler, @pschwllr and many others is now on Arxiv: The ChemPile - a massive dataset for training chemical foundation models.
Training large language models for chemistry is bottlenecked by one critical problem: there is no unified dataset that connects all chemical domains.
Mayk Caldas retweeted
Today, we’re announcing the first major discovery made by our AI Scientist with the lab in the loop: a promising new treatment for dry AMD, a major cause of blindness. Our agents generated the hypotheses, designed the experiments, analyzed the data, iterated, and even made figures for the paper. The resulting manuscript is a first of its kind in the natural sciences, in which everything that needed to be done to write the paper was done by AI agents, apart from actually conducting the physical experiments in the lab and writing the final manuscript.

We are also introducing Robin, the first multi-agent system that fully automates the in-silico components of scientific discovery, and the system that made this discovery. As far as we are aware, this is the first time that hypothesis generation, experimentation, and data analysis have been joined up in a closed loop, and it is the beginning of a massive acceleration in the pace of scientific discovery driven by these agents. We will be open-sourcing the code and data next week.

Robin is a multi-agent system that uses Crow, Falcon, and Finch, the agents on our platform, to generate novel hypotheses, plan experiments, and analyze data. We asked Robin to find a new treatment for dry age-related macular degeneration. Robin considered the disease mechanisms associated with dry AMD, proposed a specific experimental assay that could be used to evaluate hypotheses in the wet lab, and proposed specific molecules we could test in that assay. We tested the molecules and gave it the resulting data, which it analyzed before proposing more experiments. In the end, it identified Ripasudil, a Rho kinase (ROCK) inhibitor approved in Japan for several other diseases, which looks very promising as a potential treatment for dry AMD. It also identified specific molecular mechanisms that might underlie the effects of Ripasudil in RPE cells, from an RNA sequencing experiment it proposed.

To be clear, no one has proposed using ROCK inhibitors to treat dry AMD in the literature before, as far as we can find, and I think it would have been very difficult for us to come up with this hypothesis without the agents. We have also run the proposed treatment by several experts in AMD, who confirm that it is interesting and novel. Moreover, this project was fast: with Robin in hand, the entire project took about 10 weeks, far shorter than it would have taken if we had been doing all of the in-silico components ourselves.

Important caveats: we are real biologists at FutureHouse, so I want to be clear that although the discovery here is exciting, we are not claiming that we have cured dry AMD. Fully validating this hypothesis as a treatment for dry AMD will require human trials, which will take much longer. Also, this discovery is cool, but it is not yet a "move 37"-style discovery. At the current rate of progress, I'm sure we will get to that level soon.

Congratulations to the team. Congratulations in particular to Robin, which generated the hypotheses, proposed the experiments, analyzed the data, and generated the figures. And major congratulations also to the human team, which built Robin: @MichaelaThinks, @agreeb66, @benjamin0chang, @ludomitch, Mo Razzak, Kiki Szostkiewicz, and Angela Yiu.
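For illustration only, the lab-in-the-loop pattern described above boils down to a simple cycle. The sketch below is not Robin's actual interface; every function is a hypothetical stand-in for hypothesis generation, the human-run wet-lab step, and the automated analysis that feeds the next round.

```python
# Hypothetical sketch of a lab-in-the-loop cycle; these stubs are illustrative
# stand-ins, not Robin's real agents or API.
import random


def generate_hypotheses(goal: str) -> list[str]:
    # Stand-in for literature-grounded hypothesis generation.
    return [f"{goal}: candidate mechanism {i}" for i in range(1, 4)]


def run_wet_lab(hypothesis: str) -> dict:
    # Stand-in for the one step humans still perform: the physical experiment.
    return {"hypothesis": hypothesis, "effect_size": random.uniform(-1.0, 1.0)}


def is_promising(result: dict) -> bool:
    # Stand-in for automated data analysis; a toy effect-size cutoff.
    return result["effect_size"] > 0.5


def lab_in_the_loop(goal: str, rounds: int = 3) -> list[str]:
    hypotheses = generate_hypotheses(goal)
    promising: list[str] = []
    for _ in range(rounds):
        results = [run_wet_lab(h) for h in hypotheses]
        promising = [r["hypothesis"] for r in results if is_promising(r)]
        # Refine: follow up on what worked, or start over if nothing did.
        hypotheses = [f"{h} (follow-up)" for h in promising] or generate_hypotheses(goal)
    return promising


print(lab_in_the_loop("dry AMD treatment"))
```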
Mayk Caldas retweeted
We're thrilled to announce the 2025 FutureHouse AI for Science Independent Postdoctoral Fellows—six extraordinary scientists using FutureHouse tools to solve challenging scientific problems. Fellows will have access to all released and non-released agents and tools we’ve built at FutureHouse, including our new platform, internal hypothesis generation agents, protein design agents, our Aviary reinforcement learning framework, and dedicated in-house engineering support and compute budget.
Typing quickly is a fun thing. I opened a new tab to go to HugginFace. Noticed I made a typo, so I quickly opened another new tab and went to HuggingFace. Now, a few minutes later, I found a Google search for "hugh jackman" and was very confused about why I had googled Wolverine.
We're launching an agent that can do bioinformatics analysis, including repeating analyses from research papers. It is multimodal and produces a complete Jupyter notebook (Python or R) that ends in a concrete conclusion. Starting with a closed beta now.
Introducing Finch, a new agent that fully automates data-driven discovery in biology. We are launching a closed beta for it today (sign up below). This is still early, but impressive, maybe similar to a good 1st yr grad student. In the video, see how it independently reproduces key findings from the Golub Lab's 2020 MetMap paper, including the fact that ADAM28 deletions are associated with breast cancer metastases to brain (fig 4b of the original paper). It also identifies several novel findings not already in the paper, like associations with EFNA5 and PTCH1 amplifications. Importantly, the prompt here is fully open-ended! We just ask the agent to explore the data. Similar to a first year grad student, it makes a bunch of silly mistakes, but also actually ends up finding some really cool stuff. And it works really fast by comparison... 1/3