THE BRIEFING
A common fear about AI in science is that it will narrow the frontier - that models trained on existing data will herd researchers toward the familiar. This issue's stories suggest the opposite is happening.
☑️ A new analysis of nearly 250,000 protein structures shows that AlphaFold reversed a long-running decline in scientific novelty, redirecting researchers toward proteins no one had studied before.
☑️ DISCO, from Frances Arnold's lab, designed working enzymes for chemistry that no living organism has ever performed - with active sites that have no match anywhere in the known protein universe.
☑️ The OpenAI Foundation is committing over $100 million to Alzheimer's, a disease that has defeated over 100 attempts in clinical trials. The plan is to use AI to map the combinatorial complexity that traditional approaches couldn't navigate.
☑️ And a new initiative is funding the datasets the field still lacks - because models have outpaced the data they need.
Beyond accelerating what scientists already do, AI is changing what they attempt.
Let's dive in.
NEWS
OpenAI Foundation commits over $100 million to crack Alzheimer's

Researchers at the Institute for Protein Design reviewing a computationally generated protein structure. Credit: Ian Haydon/OpenAI Foundation
More than $100 million in grants is going to six research institutions this month to build what may be the most ambitious AI-driven attack on Alzheimer's disease yet. The OpenAI Foundation - the nonprofit arm we covered in Issue 11 when it pledged $1 billion for life sciences - is funding a five-layer research stack that runs from brain tissue models through drug design to clinical biomarkers.
At the center is Arc Institute and its “lab in the loop.” The concept: build three-dimensional brain tissue models containing neurons, microglia, and astrocytes, then systematically hit them with genetic and immune challenges drawn from patient data. Measure what happens. Feed those results into causal AI models that learn which disease pathways converge. Then let the AI decide what to perturb next. Each cycle produces a sharper map of where the disease is vulnerable - and where a drug might actually work. At least that’s the idea.
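For readers who think in code, the measure-learn-choose cycle described above can be sketched as a toy loop. This is only an illustration under loud assumptions: the perturbation names, the effect sizes, and the `toy_measure` function are all made up, and the selection rule is a simple upper-confidence-bound heuristic swapped in as a stand-in for whatever causal models Arc actually uses.

```python
import math

def choose_next(observations, exploration=1.0):
    """Pick the perturbation with the highest optimistic score.

    Score = mean measured effect + a bonus that shrinks as a perturbation
    is tested more often (a basic upper-confidence-bound heuristic).
    """
    def score(name):
        values = observations[name]
        if not values:
            return math.inf  # untested perturbations are tried first
        mean = sum(values) / len(values)
        return mean + exploration / math.sqrt(len(values))
    return max(observations, key=score)

def run_loop(perturbations, measure, cycles, exploration=1.0):
    """Repeat: choose a perturbation, measure it, record the result."""
    observations = {name: [] for name in perturbations}
    history = []
    for _ in range(cycles):
        name = choose_next(observations, exploration)
        observations[name].append(measure(name))  # stand-in for the wet-lab step
        history.append(name)
    return observations, history

# Hypothetical effect sizes; a real lab measurement would be noisy and unknown in advance.
TOY_EFFECTS = {
    "gene_A_knockdown": 0.2,
    "immune_challenge_B": 0.9,
    "gene_C_knockdown": 0.5,
}

def toy_measure(name):
    """Pretend experiment: returns the made-up effect for a perturbation."""
    return TOY_EFFECTS[name]

observations, history = run_loop(list(TOY_EFFECTS), toy_measure, cycles=10)
most_tested = max(observations, key=lambda name: len(observations[name]))
print(history)
print("most tested:", most_tested)
```

In this toy version the loop tries everything once, then spends most of its remaining cycles on the perturbation with the strongest measured effect - a crude picture of how each round can steer the next.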
Nobel laureate David Baker's Institute for Protein Design at the University of Washington designs candidate molecules against targets critical to Alzheimer's progression. EvE Bio creates open datasets for predicting drug activity. Other layers target new biomarkers for earlier diagnosis and clinical trials with off-patent treatments - like lithium orotate - where pharma has no incentive to fund studies.
Arc co-founder Patrick Hsu frames the difficulty: “Alzheimer’s has resisted treatment in part because it is the quintessential complex disease. It’s the result of hundreds of genetic and environmental risk factors interacting across cell types over decades.”
Why it matters: Alzheimer's has been one of medicine's most stubborn failures. What's different now is the convergence: a Nobel laureate's protein design tools, causal AI that learns from each experimental cycle, and platforms that can run perturbations at a scale manual research never could. Whether $100 million and AI can succeed where $100 billion and traditional approaches largely haven't is the question this initiative will start to answer.
Did you know? Arc Institute is hiring across its Alzheimer's Disease Initiative.
NEWS
An AI designed enzymes for chemistry that nature never invented - and they worked

Enzymes are nature's chemical factories - proteins that drive reactions with extraordinary precision. But the chemistry evolution explored is only a tiny fraction of what's possible. There are countless useful reactions that enzymes could theoretically perform - biology just never needed them, so no enzyme exists to do the job.
DISCO (DIffusion for Sequence-structure CO-design), a multimodal generative model from a team spanning Caltech, Mila, and seven other institutions - led by Nobel laureate Frances Arnold, with Yoshua Bengio among the co-authors - changes that equation. Give it the target reaction chemistry - not a starting protein, not a blueprint for the active site, just the reaction itself - and it designs a working enzyme from scratch. Sequence and three-dimensional structure emerge together in a single generation step, which matters because a protein's function depends on both at once. Previous methods designed these separately, losing information at the handoff.
The team tested DISCO on carbene-transfer reactions - a class of chemistry that no organism evolved to perform, but one that is valuable for building pharmaceutical compounds. Of the 90 designs tested in the lab, multiple worked for every reaction the team tried.
The standout: on one of organic chemistry's hardest problems - selectively modifying one specific bond in a molecule full of near-identical ones - a single DISCO design matched what previously took 14 rounds of laboratory evolution. On a different reaction, it more than doubled the best result from three rounds of evolution.
What makes this particularly noteworthy is that DISCO didn't copy nature. The active sites it designed - the molecular pockets where the chemistry happens - have no close match among the 200 million-plus structures in the AlphaFold database. The model figured out how to make this chemistry work using protein architectures that evolution never associated with it.
Why it matters: Arnold won the 2018 Nobel Prize for directed evolution - the painstaking process of breeding better enzymes generation by generation. Her lab's AI can now design enzymes that rival or exceed what that process produces, for reactions where no natural starting point exists. What previously required a starting enzyme and 14 rounds of laboratory evolution, DISCO achieved with one round of computation and a single experiment. If this generalizes, drug molecules that currently require harsh industrial chemistry could be manufactured by custom enzymes - designed computationally rather than evolved over months in the lab.
Did you know? DISCO is open-source - code on GitHub and model weights on Hugging Face.
NEWS
A new initiative funds the datasets AI biology is missing

Let’s fill in the gaps. Credit: NanoBanana2
AI models for biology keep getting bigger, but the data they train on hasn't kept pace. Renaissance Philanthropy - a US-based science philanthropy - partnered with the UK government's Department for Science, Innovation and Technology to find and fund the dataset projects that could change that. They received 79 proposals, and a committee including representatives from Google DeepMind, UK Biobank, and the Wellcome Trust selected ten winners. Each addresses a specific data gap holding the field back.
Many are directly relevant to AI × biology: measuring protein binding affinity at the scale AI needs. Mapping therapeutic peptide properties so AI can predict not just whether a drug binds, but whether it can be manufactured. Building a 50TB ground-truth image library for auditing AI in bioimaging. Cataloguing cell-surface targets for precision drug delivery. Sequencing 20,000 species to train the next generation of DNA language models.
The winners are now preparing for larger follow-on funding. The datasets themselves don't exist yet - these are funded plans, not finished resources.
Why it matters: As Martin Borch Jensen of Gordian Biotechnology and Norn Group noted, AI accelerating biology will be a jagged frontier driven by data availability. Models have outpaced the datasets they need. These projects target exactly that gap.
Did you know? The initiative's framing draws on a precedent: the Protein Data Bank - an openly accessible repository - was the public dataset that enabled AlphaFold.
NEWS
AlphaFold reversed a long-running decline in scientific novelty

For years, structural biologists were studying increasingly familiar proteins. The rate at which researchers targeted novel proteins - ones no one had studied before - had been declining steadily. A new analysis of 245,396 experimental structures in the Protein Data Bank shows that AlphaFold2's release in 2021 halted that decline.
Before AlphaFold, the novelty rate was dropping by 1.2 percentage points per year. After its release, the decline stopped. By late 2024, the rate of novel protein structures was 4.4 percentage points higher than the pre-existing trend would have predicted. The shift wasn't just correlated with AlphaFold's release - the evidence suggests AlphaFold drove it. Papers that actually cited AlphaFold targeted 38% more novel protein sequences than those that didn't.
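A "gap versus the pre-existing trend" can be hard to picture, so here is a rough sketch of the arithmetic, assuming a simple straight-line extrapolation. Only the 1.2-points-per-year slope comes from the analysis; the starting rate and the elapsed time are made-up inputs, and the paper's actual estimation method may well differ.

```python
def counterfactual_rate(rate_at_release, trend_pp_per_year, years_since_release):
    """Extrapolate the pre-release linear trend forward (values in percentage points)."""
    return rate_at_release + trend_pp_per_year * years_since_release

def gap_vs_trend(observed_rate, rate_at_release, trend_pp_per_year, years_since_release):
    """Observed rate minus what the old trend would have predicted."""
    expected = counterfactual_rate(rate_at_release, trend_pp_per_year, years_since_release)
    return observed_rate - expected

# Slope reported in the article: a decline of 1.2 percentage points per year.
PRE_TREND_PP_PER_YEAR = -1.2

# Hypothetical inputs, chosen only to make the arithmetic concrete.
HYPOTHETICAL_RATE_AT_RELEASE = 20.0  # novelty rate (%) when the trend breaks
HYPOTHETICAL_YEARS_ELAPSED = 3.0     # time since the trend broke

# Scenario: the rate simply stays flat after release instead of continuing to fall.
flat_gap = gap_vs_trend(
    observed_rate=HYPOTHETICAL_RATE_AT_RELEASE,
    rate_at_release=HYPOTHETICAL_RATE_AT_RELEASE,
    trend_pp_per_year=PRE_TREND_PP_PER_YEAR,
    years_since_release=HYPOTHETICAL_YEARS_ELAPSED,
)
print(f"gap vs. trend: {flat_gap:.1f} percentage points")
```

With these made-up inputs, a rate that merely holds steady ends up 3.6 points above the extrapolated trend. That is why a decline that stops, rather than one that turns into a rise, can still show up as a sizeable positive gap.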
The effect rippled outward. Across 248,191 downstream papers that build on structural knowledge, engagement with genes that had no experimental structure rose 22% relative to the pre-AlphaFold baseline. Research on understudied human genes increased 29%.
The authors frame this as “frontier-expanding”: AlphaFold's biggest informational gains came precisely where prior knowledge was thinnest, making exploration less risky rather than reinforcing what was already known.
Why it matters: A common concern about AI in science is that models trained on existing data will narrow research to well-trodden ground. This paper provides field-level evidence of the opposite - at least for AlphaFold.
Did you know? AlphaFold's creators, Demis Hassabis and John Jumper, shared the 2024 Nobel Prize in Chemistry for the work. Isomorphic Labs - the drug discovery company that spun off from DeepMind - has since gone further with IsoDDE, a full drug design engine that predicts binding affinity and can identify novel druggable pockets on a protein from sequence alone. We covered it in Issue 1.
THE EDGE
The European Molecular Biology Laboratory published a field report from the EMBO/EMBL Symposium on AI and Biology, held in Heidelberg in March - and it's a good overview of where AI × biology stands right now. Seven themes emerge, among them the push to make AI models explain their reasoning rather than just output predictions, new tools searching the “dark matter” of uncharacterized proteins, virtual cells and digital organisms, and AI moving into the clinic. There is also a recurring lesson BAIO readers will recognize: the data you feed AI matters as much as the model itself. Worth a few minutes if you want a snapshot of the field's current frontier.
Until next time,
Peter at BAIO



