
THE BRIEFING

AI has been building computational representations of biology from the bottom up. Foundation models for proteins. Foundation models for cells. In this issue, two papers push that project in new directions.

☑️ Orthrus argues that RNA foundation models should learn from evolution rather than borrowing training objectives from language AI - and delivers state-of-the-art results to back it up.

☑️ APOLLO goes further, compressing 33 years of hospital records from 7.2 million patients into virtual patient representations that can predict disease onset, treatment response, and adverse events across 322 clinical tasks.

The trajectory is clear: from virtual molecules to virtual cells to virtual patients. But how well do these representations survive contact with reality? That question runs through the rest of the issue.

☑️ A review of AI antibiotic discovery maps the steep drop-off between computational hits and lab results.

☑️ Google's CoDaS extracts clinical biomarkers from smartwatch data, but is upfront about modest effect sizes.

☑️ And someone with no lab experience sequenced their own genome at home using a portable sequencer and an AI assistant - proving the tools are accessible, if not yet clinical-grade.

Let’s dive in.


NEWS
A foundation model creates virtual patients from 33 years of hospital records

A virtual patient. That's the output of APOLLO, a foundation model trained on 25 billion clinical records from 7.2 million patients at Mass General Brigham - spanning 33 years and every major category of hospital record: lab results, vital signs, prescriptions, clinical notes, diagnostic reports, pathology images, and more.

Rather than targeting a single disease, APOLLO learns a unified atlas of over 100,000 medical concepts across 12 specialties, compressing each patient's full care history into one computational representation. Faisal Mahmood's lab at Harvard and the Broad Institute tested it on 322 clinical tasks - predicting disease onset up to five years out, treatment response, adverse events - validated on 1.4 million held-out patients, significantly outperforming baseline models on the majority of tasks. The findings are presented in a preprint.

The goal is to “represent the entire patient,” Mahmood said at the AACR 2026 Meeting. “Once we have this representation, we can look at risks, predictions, similarity searching, statistical analysis, and other kinds of treatment response predictions.”

Why it matters: The AI x bio field is busy building computational representations of biology at the cellular level - virtual cells. APOLLO moves the abstraction up: from simulating cells to simulating entire patient journeys. The paper frames three applications if these representations generalize beyond a single hospital system: clinical trial matching, in silico trial simulation, and personalized treatment prediction. And because the system is searchable - a clinician can enter text or even a pathology slide as a query - it could match patients to clinical trials even when their records are incompletely coded.
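Once every patient is a single vector, similarity search reduces to nearest-neighbor retrieval in embedding space. A minimal sketch of the idea - the embeddings, dimensions, and cosine-similarity choice here are illustrative assumptions, not APOLLO's actual retrieval pipeline:

```python
import numpy as np

def nearest_patients(query_vec, patient_vecs, k=5):
    """Toy similarity search over patient embeddings: given a query vector
    (e.g. from a hypothetical text or pathology-image encoder), return the
    indices and scores of the k most similar patients by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    P = patient_vecs / np.linalg.norm(patient_vecs, axis=1, keepdims=True)
    sims = P @ q                      # cosine similarity to every patient
    top = np.argsort(-sims)[:k]      # indices of the k best matches
    return top, sims[top]

# Toy cohort: 6 patients embedded in a 12-dimensional space
patients = np.eye(6, 12)
query = patients[2] + 0.01 * np.ones(12)  # query resembling patient 2
idx, scores = nearest_patients(query, patients, k=3)
print(idx[0])  # patient 2 ranks first
```

In practice this would run over an approximate-nearest-neighbor index rather than a dense matrix product, but the interface - query vector in, ranked patients out - is the same.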

Did you know? Modella AI, a startup spun out of Mahmood's lab, was acquired by AstraZeneca. The lab is currently recruiting PhD students and postdocs across Harvard and MIT programs.

NEWS
An RNA model that learns from evolution, not language

Many foundation models for genomics borrow their training approach from language AI - predict the missing token in a sequence. It can work, but biology already offers a more direct training signal. Orthrus, published in Nature Methods by Bo Wang's lab at the University of Toronto and the Vector Institute, takes a different approach. Built on the Mamba architecture - designed to handle long sequences efficiently, which matters for RNA - it uses a machine learning technique called contrastive learning. The model is trained to recognize that two RNA transcripts that biology says should be similar really are similar.

The pairs come from two sources: splice isoforms - different versions of RNA produced when cells include or skip different sections of the same gene - and orthologous transcripts, meaning the same gene in different species, preserved by evolution because it does something important. The training data spans 10 model organisms and over 400 mammalian species via the Zoonomia Project.
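The contrastive setup over these pairs can be sketched with a standard InfoNCE-style loss: each transcript embedding should sit closer to its biological partner (isoform or ortholog) than to every other transcript in the batch. This is a generic illustration of the technique, not Orthrus's actual implementation:

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss over a batch of paired embeddings.
    anchors[i] and positives[i] form a biological pair (e.g. two splice
    isoforms, or orthologs of the same gene); all other rows act as
    negatives for row i."""
    # L2-normalize so dot products are cosine similarities
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature   # (batch, batch) similarity matrix
    # Matching pairs sit on the diagonal; maximize their log-probability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy batch: 4 transcript embeddings and slightly perturbed partners
rng = np.random.default_rng(0)
anchors = rng.normal(size=(4, 8))
positives = anchors + 0.05 * rng.normal(size=(4, 8))
print(info_nce_loss(anchors, positives))  # small: matched pairs align
```

The appeal of the approach is that the positive pairs come for free from biology - no labels needed, just the knowledge that splicing and evolution preserve function.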

The result: Orthrus outperforms most existing genomic foundation models on five mRNA property prediction tasks and reaches state-of-the-art on RNA half-life prediction with remarkably few labeled examples. Code and model are open source.

Why it matters: The paper's core argument is that the pre-training objective has to mirror the structure of the biology. Evolution and splicing already encode which RNA sequences are functionally related. The practical stakes: better RNA property prediction feeds directly into designing mRNA vaccines and predicting the effects of genetic perturbations.

Did you know? Two of Orthrus's three co-first authors - Philip Fradkin and Ian Shi - co-founded Blank Bio, a YC-backed startup building RNA foundation models for therapeutics. Blank Bio is hiring.

NEWS
Google built an AI scientist to find clinical signals hiding in your smartwatch data

Google Pixel Watch 4. Credit: Google

Wearable devices generate continuous streams of physiological data - heart rate, sleep patterns, activity, even screen time. Translating that into clinically meaningful biomarkers has been the hard part. CoDaS (Co-Data-Scientist), a multi-agent AI system from Google Research, Google DeepMind, and MIT, is a structured attempt to close that gap. Given a research question in plain language, CoDaS runs a six-phase discovery loop: it generates hypotheses, runs statistical analysis, then deploys an adversarial agent that stress-tests every candidate biomarker for confounding variables and data leakage before anything gets reported.

Across three cohorts totaling 9,279 participants, CoDaS identified 41 candidate biomarkers for mental health and 25 for metabolic outcomes. The standout finding: across two independent depression cohorts, the system independently converged on circadian instability - variability in when you fall asleep and how long you stay asleep - as a consistent signal. It also constructed a novel “night-to-day social media ratio” that correlated with depression severity. For metabolic health, it derived a cardiovascular fitness index from steps and resting heart rate that tracked insulin resistance.
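Features like these are straightforward to derive from wearable logs once defined. A hypothetical sketch - the paper's exact definitions aren't given here, so the time windows and the circular-statistics choice below are my assumptions:

```python
import numpy as np

def night_to_day_ratio(night_minutes, day_minutes):
    """Hypothetical 'night-to-day social media ratio': minutes of social
    app use at night vs. during the day (window boundaries are a guess)."""
    return np.sum(night_minutes) / max(np.sum(day_minutes), 1)

def circadian_instability(sleep_onsets_hr):
    """Variability of sleep onset time across nights, in hours.
    Uses circular statistics so 23:30 and 00:30 count as 1 hour apart,
    not 23 hours apart."""
    angles = np.asarray(sleep_onsets_hr) / 24 * 2 * np.pi
    # Circular standard deviation via the mean resultant length R
    R = np.hypot(np.mean(np.cos(angles)), np.mean(np.sin(angles)))
    return np.sqrt(-2 * np.log(R)) * 24 / (2 * np.pi)

# A consistent sleeper vs. one whose onset drifts across midnight
print(circadian_instability([23.0, 23.2, 22.9, 23.1]))  # small
print(circadian_instability([21.0, 1.0, 23.0, 3.0]))    # much larger
```

The circular wrap-around is the non-obvious part: a naive standard deviation would wildly overstate instability for anyone who sometimes falls asleep just after midnight.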

The paper is upfront about effect sizes: predictive gains are modest. But the system's built-in error-checking earns its keep - in testing, it caught and rejected a finding that was statistically real but scientifically useless.

Why it matters: Consumer wearables sit on millions of wrists generating data that mostly goes unused for clinical purposes. CoDaS suggests that the bottleneck isn't the data - it's the analytical infrastructure to extract reliable signals from it. In a blinded evaluation, 15 experts ranked CoDaS outputs above those of Google's own AI Co-Scientist, and it was the only system whose generated manuscripts received non-rejection editorial decisions. The findings still need to be confirmed in new studies - the paper says as much - but the framework for getting from raw wearable data to candidate biomarkers is now on the table.

Did you know? Daniel McDuff's health AI group at Google Research is recruiting a student researcher for agentic evaluation in health - PhD students with consumer health experience are encouraged to apply.

NEWS
AI is hunting for antibiotics - is it working?

The discovery of new antibiotic classes has nearly stalled since the late 1980s. Drug-resistant infections are now associated with over 4 million deaths per year, and a Lancet analysis projects that number could surpass 8 million by 2050. A new review in Cell Biomaterials, co-authored by César de la Fuente-Nunez at the University of Pennsylvania, maps where AI stands in the effort to break the drought.

The approaches range from screening millions of existing compounds to designing entirely new molecules from scratch. De la Fuente's own lab has mined antimicrobial peptides from Neanderthal and Denisovan genomes, archaeal proteomes, global venom datasets, and the human gut microbiome - producing AMPSphere, a catalog of over 863,000 candidate antibiotic molecules identified from global metagenomic data.

But the review doesn't shy away from the hard numbers. In one generative pipeline, thousands of AI-designed candidates were narrowed to 58 synthesized molecules - only 6 showed antibacterial activity. In another study, 55 of 78 AI-designed peptides were active, but none worked against P. aeruginosa, a WHO priority pathogen. The pattern holds across approaches: high computational hit rates, but steep drop-offs in the lab.

Why it matters: The review identifies three gaps that need closing before AI-designed antibiotics reach patients: better and more standardized datasets, models that account for whether a molecule can actually be synthesized and survive in the body, and closed-loop workflows where AI and wet-lab experiments learn from each other iteratively. In other words - the same challenges that face AI x bio more broadly.

Did you know? The AMPSphere database is available for researchers who want to explore the 863,498 candidate antimicrobial peptides identified from global metagenomic data.

NEWS
Someone with no lab experience used AI to sequence their own genome at home

The wet lab, set up on a camping table. Credit: vibe-genomics.replit.app

An anonymous user with zero laboratory training bought an Oxford Nanopore MinION - a portable DNA sequencer - asked Claude to write the protocols, and sequenced their entire genome from a spare bedroom. The step-by-step guide, posted under the handle @banana_baeee and viewed over 280,000 times on X, documents the full workflow of their “vibe genomics” project: collecting cells from inside the cheek, extracting DNA using a consumer kit, preparing the sample, and running it through the sequencer for 72 hours on a camping table. Total cost: roughly $10,000.

To check whether the results were real, the author compared them against their own 23andMe data from a decade ago. The sequencer correctly identified 98.5% of the same genetic variants - rising to 99.25% on the calls the software was most confident about. The depth of sequencing is below what a clinical lab would accept, but it worked.
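The comparison amounts to a concordance calculation over shared variant sites, with a confidence filter explaining the jump from 98.5% to 99.25%. A toy sketch - the data structures and quality threshold are illustrative, not the author's actual pipeline:

```python
def concordance(calls_a, calls_b, min_qual=None):
    """Fraction of shared variant sites where two call sets agree.
    calls_*: dict mapping a site (chrom, pos) -> (genotype, quality).
    If min_qual is set, restrict to calls_a's higher-confidence calls -
    roughly how filtering lifts the agreement rate."""
    shared = calls_a.keys() & calls_b.keys()
    if min_qual is not None:
        shared = {s for s in shared if calls_a[s][1] >= min_qual}
    agree = sum(calls_a[s][0] == calls_b[s][0] for s in shared)
    return agree / len(shared) if shared else float("nan")

# Toy example: nanopore-style calls vs. an older genotyping array
nanopore = {("1", 100): ("AG", 60), ("1", 200): ("CC", 20), ("2", 50): ("TT", 55)}
array    = {("1", 100): ("AG", 99), ("1", 200): ("CT", 99), ("2", 50): ("TT", 99)}
print(concordance(nanopore, array))               # 2 of 3 sites agree
print(concordance(nanopore, array, min_qual=50))  # the low-quality call drops out
```

Real pipelines work on VCF files and handle phasing, strand flips, and multi-allelic sites, but the headline number is this ratio.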

The physical DNA never left the house - no saliva kit mailed to a lab. But the author then sent their digital genome to Claude for analysis, trading one privacy concern for another.

It fits a pattern. In March, an Australian tech entrepreneur used ChatGPT, Grok, and AlphaFold to help design a custom mRNA cancer vaccine for his dog, partnering with UNSW for sequencing and manufacturing - a story that drew both attention and skepticism from scientists. Separately, a caregiver with no medical background built an AI tool to manage his mother's stage 4 cancer treatment, telling Business Insider it caught errors in her care plan. None of these people had formal training in what they attempted. All of them used AI to bridge the gap.

Why it matters: The hardware for home genome sequencing is consumer-grade. The reagents ship to anyone with a business address. The bioinformatics runs on a Mac. What used to require an institutional lab and years of training can now be attempted by a motivated person with a credit card and an AI assistant. This is a single self-reported project, not a validated method - but the tools are only getting cheaper and the AI is only getting better.

Did you know? The Human Genome Project took 13 years, dozens of institutions, and roughly $2.7 billion to produce the first complete human genome sequence in 2003. Today, clinical-grade whole genome sequencing from a commercial provider costs under $500. This DIY project cost about $10,000 - but the point was to prove it could be done at home at all.

THE EDGE

Which AI protein design models actually work in the lab? Kosonocky et al.'s review in Current Opinion in Structural Biology consolidates wet-lab success rates across binders, antibodies, and enzymes in a set of tables that amount to a field-wide scorecard. Models like BindCraft, AlphaProteo, Chai-2, and RFdiffusion are compared target by target - with specific success rates, validation methods, and references.

ON OUR RADAR

Until next time,
Peter at BAIO
