THE BRIEFING
NVIDIA's annual GPU Technology Conference - GTC - is live right now in San Jose. It used to be a graphics card show. Then it became an AI conference. Increasingly, it is also a biotech launchpad.
This issue is a little longer than usual because of it. Four stories came directly from GTC: a framework that breaks the size ceiling on protein folding, a protein design model backed by what the authors call the largest wet-lab validation campaign in the field's history, Roche flexing 3,500 GPUs, and Xaira's long-awaited launch of its virtual cell model.
Beyond GTC, we also have ScienceClaw: OpenClaw, the most viral open-source project of 2026 (or maybe “in the history of humanity”), now has a scientific research layer from an MIT lab that wants autonomous agents to do real science.
Let's dive in.
AD
Your Docs Deserve Better Than You
Hate writing docs? Same.
Mintlify built something clever: swap "github.com" with "mintlify.com" in any public repo URL and get a fully structured, branded documentation site.
Under the hood, AI agents study your codebase before writing a single word. They scrape your README, pull brand colors, analyze your API surface, and build a structural plan first. The result? Docs that actually make sense, not the rambling, contradictory mess most AI generators spit out.
Parallel subagents then write each section simultaneously, cutting generation time nearly in half. A final validation sweep catches broken links and loose ends before you ever see it.
What used to take weeks of painful blank-page staring is now a few minutes of editing something that already exists.
Try it on any open-source project you love. You might be surprised how close to ready it already is.
NEWS FROM GTC
A virtual cell model predicts biology it has never seen

A still from a video Xaira published as part of the launch of X-Cell.
Xaira Therapeutics announced X-Cell, a 4.9-billion-parameter virtual cell model that predicts what happens when you knock out a gene - including in cell types and experiments the model has never seen. The work is led by Bo Wang, who has recently argued that diffusion models are better suited to biological data than the autoregressive architectures most biology AI uses. X-Cell seems to fit well with that thesis.
It's trained on X-Atlas/Pisces - 25.6 million cells that were each experimentally manipulated rather than simply observed, across seven different cell types and conditions. Xaira says it’s “the largest genome-wide CRISPRi Perturb-seq dataset ever reported”.
On experiments the model never saw during training, X-Cell outperformed existing models by up to fivefold on a key accuracy metric. In one test, it never saw activated immune cells during training - then correctly predicted which gene knockouts would shut down T cell activation. Scaled to 4.9 billion parameters, it generalized to entirely unseen cell types, including primary human T cells from two donors.
The scaling result comes with a caveat the authors stress: on a smaller dataset, adding parameters beyond about 1.6 billion didn't help. The gains came from pairing the larger model with Pisces' 152,000 unique experimental conditions. Data diversity, not model size alone, is the current bottleneck.
In any case, the virtual cell race is heating up. PerturbGen (which we covered in Issue 7), Arc Institute's Virtual Cell Challenge, and now X-Cell are all competing to predict how cells respond to interventions.
Why it matters: We’ll let Xaira Therapeutics CEO, Marc Tessier-Lavigne, answer this one. From the press release: “The goal of building a virtual cell is to understand biology at a causal level, to be able to ask: if a cell is in a disease state, what does it take to bring it back to a healthy state?”
Did you know? Bo Wang joined Xaira in April 2025 saying he wanted to "build the first virtual cell in the world." He's also presenting at NVIDIA GTC this week on how “the architecture of the data matters as much as the architecture of the model”. Xaira is hiring.
NEWS FROM GTC
NVIDIA just broke the size ceiling on protein folding

Ground truth (yellow) vs. AI prediction (blue) for a 2,624-residue protein complex, generated in a single pass across 64 GPUs. Credit: Fold-CP paper, Earendil Labs
Structure prediction models like AlphaFold and Boltz can figure out how proteins are shaped - but only up to a point. The math behind them eats memory fast, and a single GPU tops out at a few thousand residues. That’s fine for individual proteins. It's not fine for the massive molecular machines that drug hunters need to understand - multi-protein complexes far too large for a single GPU to handle. Over 70% of protein complexes in CORUM - a standard database of known mammalian complexes - were simply too big to predict.
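To get a feel for why residue counts cap out on one GPU, here is a back-of-envelope sketch. The numbers (128 pair channels, an 8x activation-overhead factor) are our own illustrative assumptions, not details from the Fold-CP paper or the Boltz codebase - the point is only that the pair representation these models carry grows with the square of sequence length:

```python
# Rough memory estimate for the N x N pair representation that
# AlphaFold-style models maintain, plus a multiplier for activation
# overhead (triangle updates, attention buffers). Illustrative only.

def pair_memory_gb(n_residues: int, channels: int = 128,
                   bytes_per_value: int = 4, overhead: float = 8.0) -> float:
    """Approximate GB needed during inference for a protein of n_residues."""
    base = n_residues ** 2 * channels * bytes_per_value  # one pair tensor
    return base * overhead / 1e9

for n in (1_000, 2_000, 5_000, 10_000, 30_000):
    print(f"{n:>6} residues ~ {pair_memory_gb(n):8.1f} GB")
```

Under these toy assumptions, 2,000 residues already needs tens of gigabytes, and 30,000 residues needs terabytes - far beyond any single GPU, which is the wall Fold-CP's multi-GPU splitting is designed to get around.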
NVIDIA’s new framework, Fold-CP, fixes this by splitting the work across many GPUs without losing prediction quality. Applied to the open-source Boltz models and announced at GTC, it pushes the limit from around 2,000 residues on one GPU to over 30,000 on 64.
Drug discovery startup Rezo Therapeutics used it - on 16 GPUs - to fold 93% of the complexes in CORUM, up from under 30%. NVIDIA’s own team folded a disease-relevant complex of 3,605 residues in one pass - and found a protein interaction that cropping-based methods would have missed.
Why it matters: The authors are wildly ambitious in their framing: the “Fold-CP framework charts a concrete, computationally tractable path toward the holistic simulation of biological systems”.
Did you know? Fold-CP is open source and built on Boltz-2. NVIDIA co-developed it with three biotech companies - Rezo Therapeutics, Proxima, and Earendil Labs.
NEWS FROM GTC
Computational protein design got an industrial-scale stress test

NVIDIA reports a de novo binder design for carbohydrates. Credit: NVIDIA
NVIDIA released Proteina-Complexa at GTC - and backed it with what the authors say is the largest head-to-head comparison of protein binder design methods ever conducted. Over a million designed proteins were tested against 127 targets in a single wet lab campaign, pitting Proteina-Complexa against RFdiffusion, BindCraft, and BoltzGen under identical conditions.
Proteina-Complexa produced more validated hits than any competing method - and its self-generated sequences outperformed designs where a separate tool was used to fill in the amino acid sequence after the fact, which is standard practice in the field, according to the authors. That's a first at this scale.
Across individual campaigns: 63.5% of designs bound a cancer-relevant receptor, with the tightest binding in the picomolar range. Designed proteins blocked a muscle-wasting signal in cells without any experimental optimization. And the team reports the first computationally designed proteins that bind sugar molecules - a target class no previous method had cracked.
Why it matters: Many AI protein design papers test a handful of targets. This tested over a million candidates against 127 targets - and validated the results in the lab. And those sugar binders? The long-term goal, according to the authors, is making any donated organ compatible with any patient.
Did you know? The campaign required over 140,000 GPU hours. Proteina-Complexa also produced nanomolar binders to the Nipah virus attachment protein - a WHO priority pathogen with 40-75% fatality rates and no approved treatments. Proteina-Complexa is a cross-institutional collaboration led by NVIDIA with Manifold Bio, Viva Biotech, Novo Nordisk, Duke University, Cambridge University and LMU Munich.
NEWS FROM GTC
Roche says it now has more GPUs than any other pharma company

Roche deployed 2,176 new NVIDIA Blackwell GPUs across the US and Europe, bringing its total to over 3,500 - what the company says is the largest GPU footprint in the pharmaceutical industry. The deployment, announced at GTC, spans both on-premise and cloud infrastructure and covers drug discovery, diagnostics, and manufacturing.
The investment isn't theoretical. Nearly 90% of Genentech's eligible small-molecule programs already use AI. In one oncology program, a molecule was designed 25% faster. In another, a backup drug candidate was delivered in seven months instead of over two years. Roche is also building digital twins of manufacturing facilities using NVIDIA Omniverse, including a new GLP-1 production plant in North Carolina.
The timing is notable: Eli Lilly went live with its own AI supercomputer less than three weeks earlier. We covered Lilly's $1 billion NVIDIA co-innovation lab in Issue 3. The pharma GPU race is accelerating.
Why it matters: Roche, Lilly, and Recursion are all now running thousands of GPUs for drug discovery. The infrastructure commitments signal that big pharma sees AI-driven R&D not as a pilot program but as core operating capability.
Did you know? The collaboration builds on a 2023 partnership between NVIDIA and Genentech, Roche's subsidiary. Aviv Regev, head of Genentech Research and Early Development, said the compute expansion will let scientists "build more sophisticated predictive frontier models."
NEWS
A cancer test that costs thousands just got an AI shortcut
In December, Microsoft Research, Providence Health, and the University of Washington published GigaTIME in Cell. Microsoft CEO Satya Nadella highlighted it again on social media on March 15. So let’s take a look at it.
The most detailed picture of how immune cells interact with a tumor requires imaging that costs thousands per sample. GigaTIME skips the lab entirely. It takes a standard pathology slide - the kind produced for every cancer biopsy, costing $5-10 - and uses AI to predict what the expensive molecular imaging would show, across 21 different protein markers.
The team trained it on 40 million cells where both types of imaging existed, then applied it to 14,256 cancer patients across 51 hospitals. The result: about 300,000 predicted images spanning 24 cancer types, and over 1,200 new associations between protein activity and clinical outcomes. The findings held up in independent validation on 10,200 patients from a separate database.
The authors are clear-eyed about limits: some proteins can't be reliably predicted from how cells look, and most patients in the study were from the western US. This is a research tool, not a diagnostic.
Why it matters: If it holds up, GigaTIME turns slides hospitals already have into information they couldn't previously extract - both for individual treatment decisions and for large-scale research into who responds to which therapy and why.
Did you know? GigaTIME builds on GigaPath, a digital pathology foundation model the same team published in Nature in 2024. The model and code are open source.
NEWS
OpenClaw went viral. Now autonomous AI agents are doing science with it.

OpenClaw - the open-source AI agent that has swept through Silicon Valley, China, and NVIDIA's GTC keynote in under three months - now has a scientific research layer. Markus Buehler's lab at MIT released ScienceClaw × Infinite, a framework where autonomous AI agents conduct scientific investigations without central coordination: selecting from 300+ composable tools, producing provenance-tracked results, and publishing findings to a shared platform where other agents and humans build on the work. A preprint describes the system and four demonstrations.
The biology case (so far): agents were given a peptide sequence that binds a cancer-relevant receptor and asked which amino acids could be swapped to improve it. With no central coordinator assigning methods, agents independently used three different approaches - 3D structural analysis, evolutionary sequence comparison, and an AI protein language model - and identified the same critical binding region. They also flagged which positions in the peptide are flexible enough to optimize. All of it computational, though - no wet-lab validation. The authors are explicit that results are hypothesis-generating, not predictions.
An open platform is live, with 139 registered agents posting across eight research communities including protein design, drug discovery, and biology.
Why it matters: OpenClaw has 322,000+ GitHub stars and its own NVIDIA integration. ScienceClaw is the bet that the same decentralized agent model can produce real science - not just automate tasks, but run investigations and build on each other's findings. Whether that produces rigorous discoveries or sophisticated noise is the open question. BAIO will keep you posted.
Did you know? Buehler is also CTO of Unreasonable Labs, which BAIO covered in Issue 7. The Infinite platform requires agents to pass capability-verification challenges before they can post in research communities. Code on GitHub.
THE EDGE
The OpenFold Consortium released the full training stack for OpenFold3 - the open-source alternative to AlphaFold3 for predicting how proteins interact with drugs, antibodies, and DNA. The update adds training datasets (via AWS), model weights, training and inference code, and evaluation scripts, all under Apache 2.0. Performance is said to be competitive with AlphaFold3 across most evaluated modalities. Available on GitHub and Hugging Face.
Until next time,
Peter at BAIO




