In partnership with

THE BRIEFING

Biological AI is crossing from computation into the practice of science. This issue captures that trend from two directions.

JURA Bio crosses into the physical practice. Their AI doesn't just design proteins - it encodes its designs directly into DNA synthesis chemistry and manufactures 10 quadrillion of them in a single reaction. Then it screens 20.9 billion antibody interactions in three days and learns from the results. The model's outputs aren't files. They're molecules.

BioReason-Pro crosses into the intellectual practice. It doesn't just label a protein's function - it reasons through the evidence step by step, the way a biologist would, well enough that human experts preferred its explanations over curated database entries 79% of the time.

A Nature Biotechnology review, however, asks how far this crossing has actually gone - and is honest that much of biological AI still falls short.

Let's dive in.

Want to get the most out of ChatGPT?

ChatGPT is a superpower if you know how to use it correctly.

Discover how HubSpot's guide to AI can elevate both your productivity and creativity to get more things done.

Learn to automate tasks, enhance decision-making, and foster innovation with the power of AI.

NEWS
An AI that explains why a protein does what it does

AlphaFold solved what a protein looks like. But knowing the shape doesn't automatically tell you what it does. Of over 250 million known sequences, fewer than 0.1% have experimentally confirmed functions. Bo Wang's group at the University of Toronto and Vector Institute, with Hani Goodarzi's lab at Arc Institute, released BioReason-Pro - a multimodal reasoning LLM based on Qwen3-4B-Thinking that predicts protein function and explains its reasoning step by step.

Paste in a sequence and the model walks through the evidence: which domains are present, what interactions they suggest, what function follows from the context. You can evaluate that logic and decide whether to follow up in the lab. It combines protein embeddings from ESM3 with a language model trained on 130,000 reasoning traces generated by GPT-5, then sharpened with reinforcement learning.
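If you want the intuition behind "combines protein embeddings with a language model" in code, here's a toy sketch. The dimensions and the linear-projection bridge are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

# Toy sketch of multimodal fusion: project protein embeddings into the
# language model's hidden dimension so they can sit alongside text tokens.
# All sizes here are made up for illustration.
rng = np.random.default_rng(0)

protein_dim, llm_dim = 1536, 2560
protein_embeddings = rng.normal(size=(8, protein_dim))       # from a protein model
projection = rng.normal(size=(protein_dim, llm_dim)) * 0.02  # learned during training

protein_tokens = protein_embeddings @ projection  # (8, llm_dim)
text_tokens = rng.normal(size=(5, llm_dim))       # embedded prompt tokens

# The LLM consumes the concatenated sequence and generates its
# step-by-step reasoning trace as ordinary text.
fused = np.concatenate([protein_tokens, text_tokens], axis=0)
print(fused.shape)  # (13, 2560)
```

The reasoning itself is then just text generation conditioned on those projected protein tokens - which is why the output reads like a biologist's argument rather than a label.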

In blinded evaluation, 27 protein experts preferred the supervised version's annotations over curated UniProt database entries 79% of the time. In one case the model independently predicted a binding partner - and its attention focused on the exact contact residues confirmed by cryo-EM.

Why it matters: Most protein AI outputs labels. This one shows its work. Paper, code, models, web app, and predictions for 240,000-plus proteins are all open.

Did you know? BioReason-Pro builds on BioReason, which integrated DNA embeddings from Evo 2 (which we covered here) with an LLM for variant effect prediction and was published at NeurIPS 2025. Wang is also SVP of Biomedical AI at Xaira Therapeutics, the company behind X-Cell - the virtual cell model we covered in Issue 9. Code for BioReason-Pro is on GitHub.

NEWS
An AI model can now manufacture its own designs

What if an AI didn't just design proteins on a screen, but physically built them - encoding its outputs as chemical mixing ratios in a DNA synthesis reaction? That's what JURA Bio reports in Nature Biotechnology, with Harvard geneticist George Church as co-author. They call it variational synthesis and “manufacturing-aware architecture”. It manufactured roughly 10 quadrillion designed sequences.

DNA synthesis normally adds one specific base - A, T, G or C - at each position to build one specific sequence. JURA instead feeds AI-computed mixtures at each step - precise ratios of all four bases, different at every position - so each molecule ends up with a different designed sequence. The ratios come from a generative model trained on 300 million human antibodies. Standard equipment runs the instructions. The founder says doing this conventionally would cost a quadrillion dollars.
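To make the mixing-ratio idea concrete, here's a toy sketch (nothing from JURA's actual codebase - just the concept of drawing sequences from per-position base ratios):

```python
import random

# Toy sketch of variational synthesis: instead of one fixed base per
# position, each position gets a mixing ratio over A/T/G/C, and every
# molecule in the reaction is an independent draw from those ratios.
BASES = "ATGC"

def sample_sequences(mix, n):
    """mix: per-position base ratios, one [pA, pT, pG, pC] list per position.
    Returns n sequences, each drawn independently from the mixture."""
    return [
        "".join(random.choices(BASES, weights=pos)[0] for pos in mix)
        for _ in range(n)
    ]

# Five positions: the first is fixed to A, the rest are uniform mixes.
# In the real method, the ratios come from the generative model.
mix = [[1, 0, 0, 0]] + [[0.25, 0.25, 0.25, 0.25]] * 4
print(sample_sequences(mix, 3))
```

The physical reaction does the same thing massively in parallel: one set of per-position ratios, quadrillions of independent draws.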

JURA then screened 209 million of these antibodies against 100 cancer targets simultaneously - 20.9 billion interactions in three days - and trained AI models that predict binding for targets never seen in training.

Why it matters: AI designs far more candidates than labs can build. Variational synthesis removes that bottleneck by turning manufacturing itself into the generative model.

Did you know? The variational synthesis code is open source on GitHub. JURA also demonstrated the method on other targets, confirming it generalizes beyond antibodies.

NEWS
The field's own honest assessment of where AI x bio actually stands

A Nanobanana 2 attempt to capture the spirit of the review article.

How close is biological AI to actually working? A Nature Biotechnology review by seven of the field's most active researchers - Bo Wang, James Zou, Patrick Hsu, Eric Topol, Marinka Zitnik, and Pranav Rajpurkar among them - maps where models deliver and where they don't. Foundation models for single-cell data do not consistently beat simpler baselines on perturbation prediction. Benchmark results are often proof-of-concept demonstrations rather than standardized evaluations. Most models lack wet-lab validation.

But the honesty serves a purpose. The review defines Generalist Biological AI (GBAI) - unified systems that predict across DNA, RNA, protein, and cellular domains simultaneously - and charts a path from today's single-domain models toward cross-domain systems that could underpin virtual cells.

Why it matters: The authors are frank about the gap because they believe the destination justifies the effort. GBAI, they argue, has the potential to unravel the complexities inherent to the language of life. This paper is more roadmap than eulogy.

Did you know? The author list reads like a BAIO recurring cast. Bo Wang (EchoJEPA, X-Cell and, above, BioReason-Pro), James Zou (Virtual Lab, Virtual Biotech, Eubiota), and Patrick Hsu (Evo 2) have all featured in previous issues. Eric Topol at Scripps is one of the most prominent public voices on AI in medicine.

NEWS
A Trillion Gene Atlas challenge to unlock steeper scaling laws for bio AI

The GBAI review above names data limitations as one of the field's biggest bottlenecks: most biological AI models train on variants of the same handful of public databases.

Basecamp Research thinks the answer is diversity, not compute, and just launched the Trillion Gene Atlas to prove it. The initiative, announced at SXSW and GTC with Anthropic, NVIDIA, Ultima Genomics, and PacBio, aims to expand known genetic diversity 100-fold by sequencing data from over 100 million species worldwide.

The U.K.-based startup began with a 2019 Arctic expedition where two-thirds of samples were previously unrecorded species. It has since built a proprietary database it says is 10 times larger than all public repositories combined and trained foundation models called EDEN on that data.

The Atlas scales this further using ultra-high-throughput sequencing and NVIDIA's Parabricks (a genomics software suite) for processing the raw sequencing data. The partners expect to compress over 20 years of processing into less than two.

Why it matters: Basecamp says EDEN follows steeper scaling laws with diverse data than with more compute alone. “As biological datasets grow larger and richer, AI capabilities jump”, notes Basecamp, “opening the door to systems that can design new medicines across diseases and treatment types”.

Did you know? Fortune reports that Basecamp has, so far, paid royalties to 60 organizations across 21 countries based on how much each community's genomic data contributes to model outputs.

NEWS
Hibernating squirrels just pointed Lilly to a new obesity drug target

The 13-lined ground squirrel.

The 13-lined ground squirrel shifts its metabolism 235-fold in an hour, rebuilds muscle after six months of immobility, and reverses early fibrosis and damage to the heart and brain. Startup Fauna Bio thinks the genes behind these traits could point to drug targets current treatments can't reach. This week the company announced it has designated its first target under a $494 million collaboration with Eli Lilly, signed in 2023 - triggering a milestone payment.

Fauna's Convergence AI platform analyzes genomic data from over 450 mammalian species, including more than 60 hibernators, and cross-references it with human clinical and genetic datasets. The idea is not just to find interesting biology in animals but to identify targets that are conserved and druggable in humans.

The designated target - which the company hasn't named publicly - was identified by studying mechanisms of metabolism and energy expenditure in those hibernating squirrels. According to Longevity Technology, Lilly will now take the target forward into optimization and development.

Why it matters: It’s a different approach to AI drug discovery. Fauna starts with millions of years of evolution and works backward - finding mechanisms nature has validated across species, then checking whether they translate to people. A milestone payment from Lilly suggests the approach is producing targets worth pursuing.

Did you know? Fauna Bio also has a NASA program studying hibernation-like torpor for protecting astronaut health on long-duration space missions. Fauna Bio is sort of hiring.

THE EDGE

OmicVerse shipped OmicClaw - describe a multi-omics analysis in plain English and the system runs it across 200-plus functions spanning single-cell, spatial transcriptomics, RNA velocity, and multiome workflows. Unlike raw LLM code generation, every step is checked against a registry before execution - the system can't call functions that don't exist, use wrong arguments, or skip prerequisites. Results are logged and reproducible. Install with pip, connect to Claude Code, or try the web platform. Code and paper on GitHub and bioRxiv.
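The "checked against a registry before execution" pattern is worth a sketch. This is a minimal illustration of the general idea, not OmicVerse's actual API - the function names here are hypothetical:

```python
import inspect

# Minimal sketch of registry-gated dispatch: an LLM proposes a call,
# and the dispatcher refuses anything not in the registry or anything
# called with arguments the function doesn't accept.
REGISTRY = {}

def register(fn):
    REGISTRY[fn.__name__] = fn
    return fn

@register
def normalize_counts(matrix, method="log1p"):
    # Placeholder for a real analysis step.
    return f"normalized with {method}"

def dispatch(name, **kwargs):
    if name not in REGISTRY:
        raise ValueError(f"unknown function: {name}")
    fn = REGISTRY[name]
    # Validate arguments against the real signature before running anything.
    inspect.signature(fn).bind(**kwargs)
    return fn(**kwargs)

print(dispatch("normalize_counts", matrix=[[1, 2]]))
# dispatch("cluster_cells") would raise ValueError before any code runs.
```

The point: hallucinated function names and bogus arguments fail at the gate, which is what makes the generated pipelines reproducible rather than best-effort.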

ON OUR RADAR

Until next time,
Peter at BAIO

Keep Reading