
THE BRIEFING

Greetings from Vitalist Bay in Berkeley, where I'm reporting this week. Vitalist Bay is a longevity conference that increasingly sees AI×bio as one of the most important accelerants in the mission to solve aging.

That's why two of this issue's items focus on AI for aging specifically: a writeup of the conference talk by Morgan Levine, VP of Computation at Altos Labs, and three things I learned from a recent essay by Gordian Bio’s Martin Borch Jensen.

The conference also means a slightly different format this week, with a few more in-depth items, and a later send than usual. I don't take delays lightly. Apologies.

Alright, let's dive in.

PGA and LPGA Winners Already Invested. 12 Angel Groups Too.

AI has created some of the biggest investment opportunities of the decade. Sparrow is bringing that shift into human performance - a $1T+ untapped market.

Sparrow turns your smartphone into a real-time AI coach, starting with golf and expanding into all forms of human motion.

85% revenue growth. 250K users. PGA and LPGA winners already invested.

Invest by 5/31 and receive 10% bonus shares.

Opportunities like this don’t stay open long: Invest in Sparrow now.

This is a paid advertisement for Sparrow's Regulation CF offering. Please read the offering circular at invest.sparrowup.com.

FIRSTHAND
Morgan Levine: stop building the wrong AI for aging

Did Morgan Levine just put BAIO front and center on one of her slides? Yes, she did. More on that below.

You may not be at the longevity conference Vitalist Bay in sunny Berkeley, California, this week. But I am, so I can bring some of it to you.

Specifically, I want to share with you Morgan Levine’s talk, titled “The road to longevity lies in the virtual worlds we need to build.”

Levine - the VP of Computation at Altos Labs - argued that almost everything currently called “AI for longevity” is the wrong kind of AI for the problem.

Aging, Levine said, is the dysregulation of the most complex system we know - the body. It has no single lever. So science's reductionist habit, which works for an infectious disease, “isn't going to make a dent” here.

What can deal with that complexity? Two things, she argued: evolution, and AI. Both are black-box optimizers that can navigate problems whose mechanisms we don't fully understand.

But not just any AI. Levine walked through three approaches she thinks won't get us there.

One: Aging clocks, always a hot topic in longevity science, are useful read-outs but don't help discover new interventions; they predict a proxy, not aging itself. It bears mentioning that Morgan Levine is a pioneer in the space, having developed the PhenoAge clock in 2018.

Two: Task-specific models - virtual cells, protein folding - learn one narrow domain of biology. “I could have the best protein structure model possible,” Levine said, but “it's not going to tell me that if I put this drug in a human, what that's going to do for their kidney.”

Three: Agentic systems, while powerful, are trained on filtered, and in this domain very limited, human knowledge. “They're going to massively scale up the way we do biology,” she said. “But they're not going to discover the things that are kind of beyond human reach.”

What she wants instead is representation learning - models that build a map of the latent biological space underneath what we can measure. States like “young” or “healthy” are real biological conditions but not directly measurable - the same way human intelligence is real but no test captures it.

The measurements we do have (blood biomarkers, DNA methylation, gene expression, etc.) are, according to Levine, “just shadows” cast by those underlying states (“young”, “healthy”). This is Plato's cave applied to biology: we cannot see the form directly, only its shadows on the wall.

The goal is to infer the form from enough shadows seen from enough angles. To get there, the field needs three things:

First, a map. “You want to map all the possible states using as much information and as many modalities as you possibly can. So representation learning is basically mapping the latent states of cells, of molecules, of tissues, of organisms all the way up these scales,” Levine explained.

Then, diverse benchmarks that can test whether the map is any good, and training data drawn from many modalities, species and cell types - not five billion samples of the same thing.

And here Levine pulled up a slide referencing BAIO's special report from a few weeks back. Her quote of our post? “Better models need more coverage: more cell types, more perturbations, more patient cohorts, more experimental modalities.”

Our favorite slide from Morgan Levine’s presentation.

Beyond multi-modality and multi-species sampling, Levine said, there is real value in collecting what she called “black box data” - data we can't yet interpret. “We might not know what it means,” she said, but if there is real biological signal in there, AI systems can use it.

Levine ended with a call to action aimed at the audience.

First plea: stop collecting just one modality from a sample. The same sample can yield protein, RNA, methylation, and imaging data - capture multiple views, not just one.

Second: stop running every experiment on the same handful of cancer cell lines. “I know everyone loves” them, she said, but the field needs diversity in cell types.

Third: stop relying on the Black-6 mouse strain that has been biomedical research's default model for decades. “They're not diverse enough to actually learn biology. We're learning biology in that one specific strain.”

Beyond mice, she said, lies the tree of life itself: “Evolution left so much data for us to learn from that we're not even looking at.”

And finally, stop hoarding data to build moats. “People are generating data to train their data models, and they're keeping it private, proprietary, and building this data moat. None of us are going to get to the place where we need to get by creating these little silos.”

Why it matters: Her talk, as mentioned at the beginning, was titled “The road to longevity lies in the virtual worlds we need to build.” Levine never stated explicitly what these virtual worlds are. Read through the cave metaphor that ran through the talk, though, one interpretation is that the virtual world is the latent biological reality - the thing casting the shadows - made visible through representation learning. That foundation - representation learning over diverse, multi-modal biological data - is also the direction she is now in a position to drive at Altos. But right now we are in the cave, watching shadows.

Did you know? Levine joined Altos Labs from Yale as a founding Principal Investigator. Her book True Age is a guide for readers wanting to measure - and lower - their own biological age, covering diet, sleep, exercise, intermittent fasting, and caloric restriction. Altos Labs is hiring.

NEWS
AI agents now design protein binders about as well as humans

Frontier AI agents now design protein binders that hold up in the wet lab at about the same rate as expert human teams. That is the headline result from a one-day competition hosted by a new platform called muni and run through the Adaptyv lab in Lausanne, the automated wet-lab whose API BAIO covered in Issue 13.

Nine human teams and six LLM agents - Claude Sonnet 4.6, GPT 5.2, Gemini 3.1 Pro, Grok 4.1 Fast, Qwen 3.5 Plus and GLM 5 - were each given the same brief: design a protein that binds TREM2, a receptor on the brain's immune cells (microglia) strongly linked to late-onset Alzheimer's risk. The agents worked alone, no human in the loop.

141 designs came in. Adaptyv synthesized and tested the top 100 for binding. 37 stuck. Humans hit on 25 of 65 attempts (38.5%); agents hit on 12 of 35 (34.3%). Essentially a tie, in other words.
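For readers who want to check the arithmetic behind “essentially a tie,” here is a minimal sketch. It recomputes the hit rates from the counts above and runs a two-sided Fisher exact test on them. The test is my own choice for illustration and is not part of the competition's reported analysis; the function names are hypothetical.

```python
from math import comb

# Counts as reported above: confirmed binders / designs tested.
human_hits, human_tested = 25, 65
agent_hits, agent_tested = 12, 35


def hit_rate(hits: int, tested: int) -> float:
    """Hit rate in percent."""
    return 100.0 * hits / tested


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table (with the same
    margins) that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-9)  # small tolerance for float ties
    )


human_rate = hit_rate(human_hits, human_tested)
agent_rate = hit_rate(agent_hits, agent_tested)
overall_rate = hit_rate(human_hits + agent_hits, human_tested + agent_tested)
p_value = fisher_exact_two_sided(
    human_hits, human_tested - human_hits,
    agent_hits, agent_tested - agent_hits,
)

print(f"Humans:  {human_rate:.1f}%")
print(f"Agents:  {agent_rate:.1f}%")
print(f"Overall: {overall_rate:.1f}%")
print(f"Fisher exact p-value (two-sided): {p_value:.2f}")
```

With samples this small, a gap of about four percentage points is well within what chance alone would produce, which is why calling it a tie seems fair.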

Humans still made the tightest binders though. The top human design, from team MRAZS using a tool called Mosaic, bound about three times more tightly than the best agent design (a GPT 5.2 design built on a tool called PXDesign). Curiously, all six agents independently reached for PXDesign as their main weapon, even though half a dozen other design tools sat on the same menu.

Note the curve here. Adaptyv has run a sequence of these competitions, and the hit rate has climbed steeply overall, despite one dip: 2.5% on EGFR in 2024, 13.2% on a rematch later that year, 9.6% on the Nipah virus in January, 37% on TREM2 this week.

Next up? muni and Adaptyv are wiring their systems together so an agent can submit designs, get binding data back, and redesign without anyone touching the design-build-test-learn (DBTL) loop.

Why it matters: AI agents and human experts now hit a real wet-lab target at similar rates. “We'd go as far as to claim binding is roughly solved,” Adaptyv Bio notes on X. The harder questions - whether these molecules can actually be made into drugs, whether they trigger immune reactions, whether they survive long enough in the body to matter - were not tested here.

Did you know? All 141 designs and binding data are open on Proteinbase, including the raw measurement curves. Adaptyv is hiring.

NEWS
A specialized AI just found what may be a trigger for two autoimmune diseases


Ankylosing spondylitis (a type of arthritis) and acute anterior uveitis (a painful eye inflammation) often appear together in the same patients, who suffer attacks by their own immune system on their own tissues. Researchers have long suspected the attacker is a specific kind of T-cell, but the body's own protein fragments that set those T-cells off have remained elusive.

A team led from Stanford and the University of Chicago thinks it has a strong candidate.

In a new Nature Biotechnology paper, the team trained AI models to predict what each rogue T-cell from a patient would attack. Then they pointed the models at a library of human-protein fragments and asked which ones the T-cells would lock onto. Two names kept coming up: fragments from proteins called PSG5 and PRPF3. PSG5 stood out for one reason - it sits, conveniently for the theory, in the exact part of the eye that gets inflamed in these patients.

To check, they drew blood from six patients and six healthy people and looked for T-cells primed to attack PSG5. The patients had a lot more of them than the controls. A flu-peptide control done at the same time showed no difference between groups, meaning the PSG5 elevation was specific to the disease, not a general immune overdrive.

The team's models were specialized protein language models, fine-tuned on real binding data from the lab. They beat both AlphaFold3 - currently the most famous AI model in biology - and tFold-TCR, a model purpose-built to predict TCR-peptide binding, at predicting which peptides actually activate T-cells. To be fair, AlphaFold3 was built to predict protein shapes, not T-cell triggers - but tFold-TCR was built specifically for the task and still came up short. The point is that smaller models with the right training data outperformed both a general giant and a specialist on a real task.

Why it matters: A Nature Biotechnology review (covered by us in Issue 10) warned that biological AI often fails to beat simple baselines, and most models lack wet-lab validation. This one didn't just do wet-lab validation - it went all the way to patient blood. The recipe - generate the right data, train a focused model, test the prediction in actual people - is rare, but here it produced a candidate autoantigen that could rewrite how these diseases are understood.

Did you know? Everything in the paper is open: data, code, and crystal structures.

3 THINGS I LEARNED
“No virtual cell model actually uses AlphaFold”


“Ask not what AI can do for longevity. Ask what the longevity field can do for AI.” That's the title of an essay Martin Borch Jensen - founder of Norn Group and CSO at Gordian - published the other day, and also the case he made at Vitalist Bay, speaking right after Morgan Levine (see our story above).

I actually interviewed Borch Jensen separately at the conference. BAIO subscribers can look forward to that conversation in a few weeks.

The three takeaways below are based on his essay.

1. AI works when three things line up: compute, task-shaped data, and verifiable outcomes.

Language models could become lawyers and solve novel math because the training data already encoded the relevant reasoning, and each task came with a built-in feedback signal - humans correcting the chat, formal verifiers checking the math. Biology has neither. All of PubMed doesn't tell you how to treat Alzheimer's, because the literature is an incomplete and biased projection of the underlying biology. And the closest thing to a verification loop - a clinical trial - runs for years. Without task-shaped data and fast, cheap verification, AI cannot iterate.

2. Lower biology layers cannot verify answers at higher layers.

Borch Jensen lays out four layers: molecular (like protein folding), cellular (CRISPR experiments), physiological (heart failure, dementia, aging), and organismal (lifespan). Higher layers emerge from lower ones, but the mapping is many-to-many - the same molecular state in a cell can produce very different physiological outcomes depending on tissue, signaling, and timing. As he puts it: “a cell cannot have blood pressure.” If you want answers about a given layer, you have to measure that layer directly.

3. No frontier virtual cell model uses AlphaFold (according to Borch Jensen).

AlphaFold revolutionized molecular structure prediction - but every virtual cell model in development still relies on cell-level measurements like imaging or RNA sequencing, rather than building up from molecular predictions. Borch Jensen cites X-Cell (covered in BAIO Issue 9), the Xaira/Bo Wang model, which uses protein interaction data extensively but “notably relies on past experimental observations rather than AlphaFold predictions.” His read: if molecular predictions could be stacked into cell predictions, virtual cell modelers would use AlphaFold. None of them do.

Why it matters: This pairs pretty well with the Levine talk above. Both argue the bottleneck for AI × longevity is data, not compute - but they cut at different angles. Levine emphasizes diversity: more modalities, more species, more cell types. Borch Jensen emphasizes layer-specificity: physiological-layer data for physiological-layer questions, since lower layers can't verify higher-layer answers. His concrete prescription: allocate at least 5-10% of AI × bio funding to enabling work that generates task-shaped data at the layers AI needs - longitudinal cohorts, paired animal-to-human datasets, and new measurements on stored biological samples.

Did you know? Borch Jensen co-founded Gordian Biotechnology in 2018; the company pioneered pooled in vivo screening - the same layer-bridging technique he highlights in the essay as a way to link cellular and physiological data. Norn Group, which he founded in 2021, is now funding a new grant round explicitly aimed at the human data infrastructure AI needs to accelerate aging research - with a requirement that funded projects release their data open-source and formatted for AI ingestion. Gordian is hiring.

NEWS
Isomorphic Labs raises $2.1 billion as Hassabis calls the approach “fundamentally sound”


The $2 billion Series B BAIO covered as a rumor in Issue 24 is now official - at $2.1 billion. Thrive Capital led. Alphabet and GV stayed in; new investors include MGX, Temasek, CapitalG, and the UK Sovereign AI Fund.

CEO Demis Hassabis said in the press release: “Now that we have shown our approach is fundamentally sound, our focus is on scaling our technology to its full potential.” President Max Jaderberg added: “Our drug design engine works, and it's giving us a repeatable way to design new medicines.”

The money will go to building out Isomorphic's drug design engine (IsoDDE) and pushing the pipeline toward the clinic.

Why it matters: “Fundamentally sound” is the strongest claim Isomorphic has made about its own model to date. Hassabis is not a hyperbolic communicator; this is a leader signaling that the technical risk is behind them and the company is now in scale-up mode.

Did you know? Isomorphic is hiring.

THE EDGE

scHilda is a new open-source framework for automated single-cell type annotation. It integrates a biological knowledge graph (23K genes, 3K pathways, 3K cell types) into LLM reasoning, outperforming CellTypeAgent and GPTCellType on most of 8 benchmark datasets. With the DeepSeek-V3.2 backbone, it costs about $0.001 per cell and runs in 1-3 minutes per dataset.
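To put the per-cell price in context, here is a back-of-the-envelope sketch. It assumes the “about $0.001 per cell” figure scales linearly with dataset size, which the source does not state; the dataset sizes and the function name are hypothetical.

```python
# Assumption: cost scales linearly with cell count at the quoted
# "about $0.001 per cell" (DeepSeek-V3.2 backbone). This is not confirmed.
COST_PER_CELL_USD = 0.001


def estimated_cost_usd(n_cells: int) -> float:
    """Rough annotation cost in USD for a dataset of n_cells cells."""
    return n_cells * COST_PER_CELL_USD


# Hypothetical dataset sizes, for illustration only.
for n_cells in (1_000, 10_000, 100_000):
    print(f"{n_cells:>7,} cells -> about ${estimated_cost_usd(n_cells):,.2f}")
```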