Over the past week, @arcinstitute published three new discoveries that I’m very proud of.
• The world's first functional AI-generated genomes. Using Evo 2 (the largest biology ML model ever trained, which Arc released in partnership with @nvidia in February), Arc scientists took advantage of the fact that Evo 2 is a generative model to produce completely new sequences for complete phage genomes. That is, they used AI to produce wholly new, never-before-seen-by-nature genomes. They experimentally synthesized these genomes and showed that these AI-generated phages actually work, killing E. coli bacteria with high efficacy.
• Germinal, an AI system for creating new antibodies. Antibody design is one of the great problems of medical biology, given the obvious importance and usefulness of antibodies for creating therapeutics. (Antibodies are proteins that help the immune system identify pathogens and other harmful intruders. See also the recent Works in Progress article on this topic: [1].) Today, designing effective antibodies is very expensive and slow. Germinal is a cheap and fast way to produce drug candidates, with success rates of up to 22%. This means that one can go from having to screen thousands of candidates in the lab to screening perhaps a few dozen. It's early, but I suspect that better methods for designing antibodies will be a very big deal for disease treatment in the coming years.
• Today, we published a paper showing that “bridge editing”, which Arc scientists first introduced last year, can make precise edits in human cells that are up to 1 million base pairs long, and without relying on intrinsically unpredictable cellular repair machinery (which CRISPR requires, often leading to editing mistakes). They showed that it’s possible to use this editing to cut out the DNA repeats that cause Friedreich’s ataxia (a neurological disease), an approach which should also be relevant to Huntington’s and other similar disorders. One particularly cool thing about it is that it’s possible to specify every nucleotide within the extended editing window, meaning that recursive bridge edits could potentially be a powerful way to reprogram even biological traits that are caused by many genetic mutations. (Genetic therapies today target single mutations.)
Arc is pretty new. Its doors opened in mid-2022, and it's now 300 people. I’m excited about these discoveries because they show that a number of our hopes in starting Arc are beginning to pay off:
• AI/ML and computation are at the center of all three. That is obviously true for the first two, but the mobile genetic element behind bridge editing was also discovered as a result of a complex computational search. One of our premises in starting Arc was the belief that the intersection of software/AI and experimental wet lab biology should enable great things. (And besides requiring great computational work, all three of these also required strong wet lab work, tightly coordinated under a single physical roof.)
• We’ve been toying with the idea that a handful of technologies are enabling a new kind of “Turing loop” in biology: sequencing advances (including single-cell sequencing) give us new ways to read; transformers and AI give us new ways to think; and functional genomics (such as bridge editing) gives us new ways to write. This trio of discoveries spans each part of this loop, and we’re hopeful that there’ll be compounding returns in improving each part.
• Arc is a non-profit, which we hoped would make collaborating with others easier, since we can avoid worries about financial return. This is indeed proving important, and all three of these projects involved close partnership with others. Germinal was done in partnership with @SynBioGaoLab at Stanford; Evo 2 was trained in partnership with Nvidia. Bridge editing was jointly published with a structure from the @HNisimasu Lab at the University of Tokyo. Arc tries to make its discoveries useful (see the Evo 2 Designer[2]) for others, and the code behind the computational projects is open source, hopefully making it easy for others to spot new opportunities for collaboration and partnership in the future. Most of all, Arc itself is an ongoing collaboration with @UCSF, @UCBerkeley, and @Stanford.
• With Arc, we wanted to enable better bottom-up and top-down work. With the fully flexible, no-strings-attached funding that we provide to investigators, we want to enable completely unexpected discoveries and avenues of investigation. With our institute initiatives (around creating a virtual cell and curing Alzheimer’s), we want to bring to bear a scale and level of coordination that’s usually difficult in basic science. Germinal is a “surprise” discovery that didn’t involve top-down coordination, whereas Evo 2 is the result of ambitious high-level planning and funding.
• Humanity has never cured a complex disease (a category that includes most neurodegenerative diseases, most cancers, and most autoimmune diseases), and my hope is that Arc can help change this. It’s also clear that AI will revolutionize biology, and I hope that Arc can effectively aggregate the ingredients needed to fully capitalize on its promise. I’m biased, but I think some of the coolest biology in the world is currently being done at Arc. (They’re always hiring if you’re interested.)
While I’m a cofounder of Arc, I spend almost all my time on Stripe, where we build economic infrastructure for the internet. All credit for Arc’s progress should go to the remarkable scientists and staff who’ve made Arc their home or who’ve chosen to collaborate with us. (You can read more about these particular discoveries in these threads: [3], [4], [5].) I’m also very grateful to the amazing Stripe employees who’ve built the company that makes Arc’s ongoing work possible, and to the millions of customers who’ve chosen to partner with Stripe. John and I feel fortunate to be able to support Arc’s work to the extent that we do.
Maybe this is reading too much into it, but I sometimes feel that there’s a commonality between @arcinstitute and @stripe. Both biology and economic infrastructure involve reasoning about complex systems with many levels of emergent effects, and in both cases building the right tools can have almost unboundedly large benefits. Even though progress in both tends to take a long time, it also feels like the next five years in both will be some of the most interesting in living memory.
(If economic infrastructure is your jam, we have a whole slew of fantastic announcements coming up at Stripe Tour in New York next week. Tune in!)