Faster Science, Better Drugs

Can we make science as fast as software? In this episode, Erik Torenberg talks with Patrick Hsu (cofounder of Arc Institute) and a16z general partner Jorge Conde about Arc's "virtual cells" moonshot, which uses foundation models to simulate biology and guide experiments. They discuss why research is slow, what an AlphaFold-style moment for cell biology could look like, and how AI might improve drug discovery. The conversation also covers hype versus substance in AI for biology, clinical bottlenecks, capital intensity, and how breakthroughs like GLP-1s show the path from science to major business and health impact.

• September 15, 2025 • 56:26

Table of Contents

0:32-7:56
8:02-15:59
16:05-23:57
24:05-31:58
32:05-39:55
40:02-47:58
48:04-55:14

🚀 What is Patrick Hsu's moonshot to make science faster?

Arc Institute's Vision for Accelerating Scientific Discovery

Patrick Hsu's moonshot centers on fundamentally transforming how scientific research operates by making it as fast and iterative as software development.

The Core Vision:

  1. Virtual Cells with Foundation Models - Create computational simulations of human biology that can predict experimental outcomes
  2. Speed of Neural Networks - Enable experiments to run at the speed of forward passes through AI models rather than waiting for physical lab processes
  3. Default Tool for Scientists - Build technology that becomes the go-to resource for experimentalists and computational biologists who just want reliable data

The Physical Reality Problem:

  • Real-World Constraints: Unlike AI research that iterates quickly on GPUs, biology requires moving actual atoms and liquids between tubes
  • Time-Intensive Processes: Growing cells, tissues, and animals happens in real time and cannot be accelerated through traditional means
  • Life-Changing Medicine Development: The ultimate goal is creating treatments that require physical manipulation of biological systems

Massive Parallelization Potential:

The promise lies in using machine learning to massively parallelize biological research, potentially allowing thousands of virtual experiments to run simultaneously while physical labs conduct validation studies.

Timestamp: [0:44-1:53]

🐌 Why is scientific progress so frustratingly slow?

The Multifactorial Challenge Slowing Down Discovery

Scientific progress faces a complex web of interconnected obstacles that create what Patrick Hsu describes as a "weird Gordian knot" of systemic issues.

Incentive Structure Problems:

  1. Funding Limitations - Science funding systems don't adequately support the type of research needed
  2. Career Training Issues - How scientists are trained and incentivized for long-term career growth creates barriers
  3. Basic vs. Commercial Divide - Artificial separation between fundamental science and commercially viable research limits problem-solving scope

Interdisciplinary Collaboration Barriers:

  • Limited Expertise Range: Individual research groups or companies can typically excel at only 2-3 areas simultaneously
  • Example Combinations: Computational biology + genomics, or chemical biology + molecular glues
  • The Five-Domain Challenge: Nearly impossible for single entities to master neuroscience, immunology, machine learning, chemical biology, and genomics together

Physical and Structural Constraints:

University Limitations:

  • Geographic Distribution: Multiple disciplines exist across different campuses and buildings
  • Physical Distance: Literal separation reduces collaboration frequency between experts
  • Individual Incentives: Researchers need to publish their own papers and make their own discoveries, discouraging collaborative work

Publication Pressure:

Scientists are rewarded for individual achievements rather than collaborative breakthroughs, creating competition instead of cooperation.

Timestamp: [1:53-4:04]

🏢 How does Arc Institute solve interdisciplinary collaboration?

Organizational Experiment in Scientific Innovation

Arc Institute represents a deliberate organizational experiment designed to overcome traditional barriers that slow scientific progress through strategic physical and structural design.

The Collision Frequency Strategy:

  1. Five Disciplines Under One Roof - Neuroscience, immunology, machine learning, chemical biology, and genomics in the same physical space
  2. Increased Interaction Opportunities - Higher frequency of spontaneous collaborations between experts from different fields
  3. Expanded Problem Space - Access to research questions that no single discipline could tackle independently

Flagship Project Structure:

Two Major Initiatives:

  • Alzheimer's Drug Target Discovery - Combining multiple disciplines to identify new therapeutic approaches
  • Virtual Cell Development - Creating computational models of human biology using foundation models

Removing Traditional Barriers:

Infrastructure Integration:

  • Shared Physical Space: All researchers work in the same building rather than distributed across campus
  • Collaborative Incentives: Reward structure encourages working on larger flagship projects requiring multiple expertise areas
  • Cross-Disciplinary Projects: Focus on problems that require rather than merely benefit from multiple disciplines

Beyond Just People and Infrastructure:

The models themselves are designed to literally accelerate science by enabling experiments to run at the speed of neural network forward passes, assuming the models become accurate and useful for real-world applications.

Timestamp: [2:52-4:54]

🤖 Why has AI advanced faster in images than biology?

The Fundamental Complexity Gap Between Domains

The disparity in AI progress between visual/language domains and biology stems from fundamental differences in complexity and human intuition.

Technology vs. Biology Difficulty:

  1. Inherent Complexity: Biology is significantly more complex than natural language processing or image generation
  2. Native Understanding: Humans already know how to speak and interpret images, making evaluation intuitive
  3. Biological Illiteracy: We don't speak the language of biology - at best, we understand it "with an incredibly thick accent"

Evaluation and Iteration Challenges:

DNA Foundation Models:

  • Token Interpretation: Researchers can only sense the types of tokens being fed into models and what emerges
  • Non-Native Language: Unlike evaluating text or images, humans cannot natively assess DNA model outputs
  • Ground Truth Requirements: Must run actual lab experiments to validate computational predictions

Virtual Cell Models:

  • Fuzzy Outputs: Models generate unclear, difficult-to-interpret results
  • Lab-in-the-Loop Necessity: Requires physical experiments to test against experimental ground truth
  • Slower Iteration Cycles: Each validation round requires real-world biological processes

The Interpretation Problem:

The core challenge involves developing methods to interpret and understand the "weird fuzzy outputs" that biological models generate, which fundamentally slows down the iteration and improvement process compared to domains where humans can immediately assess quality.

Future Acceleration Path:

Increasing both the speed and dimensionality of experimental validation will be crucial for accelerating biological AI development to match progress in other domains.

Timestamp: [5:10-6:36]

🔬 How can we model cells when we don't understand them completely?

The Unknown Components Challenge in Virtual Cell Development

Creating virtual cell models faces a fundamental paradox: how do you simulate something when you don't fully understand all its components and functions?

The Measurement Limitations:

  1. Invisible Components: Many biological processes and molecules cannot be directly observed or measured
  2. Unknown Elements: We're not certain we understand all components within cells and their interactions
  3. Spatial Resolution Gaps: Cannot measure metabolites and other key elements in high throughput with precise spatial resolution

The Natural Language Processing Parallel:

Historical Academic Tradition:

  • Structured Approach: Long academic tradition in NLP focused on understanding language structure
  • Controversial Breakthrough: It was "weird and unintuitive and intensely controversial" that feeding unstructured data into transformers would work
  • Unexpected Success: The approach succeeded despite not following traditional linguistic understanding

Biology's Similar Challenge:

  • Accuracy Questions: What does it mean to be an accurate biological simulator or virtual cell?
  • Unstructured Data Approach: Similar potential exists for feeding biological data into models without complete understanding
  • No Guarantees: Not claiming this approach will definitely work in biology, but precedent exists

Phased Capability Development:

Progressive Modeling Approach:

  1. Individual Cells - Start with single cell simulations
  2. Cell Pairs - Model interactions between two cells
  3. Tissue-Level - Expand to cells within tissue contexts
  4. Physiological Systems - Eventually model cells within intact animals

The Measurement Reality:

Despite incomplete knowledge, the approach involves building models that can evolve and improve as measurement capabilities advance, rather than waiting for complete biological understanding before beginning development.

Timestamp: [6:36-7:56]

💎 Summary from [0:32-7:56]

Essential Insights:

  1. Arc Institute's Moonshot - Creating virtual cells using foundation models to simulate human biology and accelerate scientific discovery at the speed of neural network computations
  2. Scientific Progress Barriers - Science is slow due to a complex web of incentive problems, interdisciplinary collaboration challenges, and physical constraints that create a "Gordian knot" of systemic issues
  3. Organizational Innovation - Arc brings five disciplines (neuroscience, immunology, machine learning, chemical biology, genomics) under one roof to increase collision frequency and enable larger collaborative projects

Actionable Insights:

  • AI Complexity Reality: Biology is fundamentally more complex than image/language AI because humans don't natively understand biological processes, requiring lab-in-the-loop validation that slows iteration cycles
  • Measurement Limitations: Virtual cell development proceeds despite incomplete biological knowledge, following the NLP precedent where unstructured data approaches succeeded against traditional academic expectations
  • Phased Development Strategy: Models will progress from individual cells to cell pairs to tissues to intact physiological systems as capabilities advance

Timestamp: [0:32-7:56]

📚 References from [0:32-7:56]

People Mentioned:

  • Patrick Hsu - Cofounder of Arc Institute discussing the organization's moonshot to create virtual cells and accelerate scientific discovery

Companies & Products:

  • Arc Institute - Interdisciplinary research organization bringing together neuroscience, immunology, machine learning, chemical biology, and genomics under one roof

Technologies & Tools:

  • Foundation Models - AI models being developed to simulate human biology and create virtual cells
  • DNA Foundation Models - Specialized AI models trained on genetic sequence data for biological predictions
  • Virtual Cell Models - Computational simulations designed to model cellular behavior and interactions
  • Transformers - Neural network architecture that revolutionized natural language processing through unstructured data approaches

Concepts & Frameworks:

  • Collision Frequency - Strategy of increasing spontaneous interactions between experts from different disciplines by co-locating them physically
  • Lab-in-the-Loop - Methodology requiring physical experiments to validate computational model predictions in biological research
  • Interdisciplinary Collaboration - Approach combining multiple scientific domains to tackle problems no single field could solve independently

Timestamp: [0:32-7:56]

🧬 What are virtual cells and how do Arc Institute's models predict cell behavior?

Virtual Cell Technology & Perturbation Prediction

Arc Institute is developing virtual cells as computational models that can predict how cells will respond to different interventions, similar to how AlphaFold revolutionized protein structure prediction.

The Virtual Cell Vision:

  1. Cell State Manifold - Create a universal representation of all possible cell types and states (heart cells, blood cells, lung cells, etc.)
  2. Perturbation Prediction - Predict what interventions are needed to move cells from one state to another
  3. Drug Discovery Application - Use these predictions to identify new drug targets and therapeutic combinations

How It Works:

  • Current State: Cells exist in various conditions (inflamed, stressed, metabolically starved, cell cycle arrested)
  • Target State: Move cells to healthier, homeostatic states
  • Intervention Mapping: Models suggest specific sequences of perturbations needed for the transition
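The idea of mapping a current state to a target state can be sketched with a toy example. This is not Arc's model: the three-dimensional state vectors, the perturbation names, and the assumption that each perturbation shifts the state by a fixed additive vector are all hypothetical simplifications chosen for illustration. A greedy search then picks, step by step, the perturbation that brings the cell closest to the target.

```python
import math

# Toy 3-dimensional "cell states"; real models would use learned,
# high-dimensional representations. All values here are hypothetical.
current_state = [0.9, 0.1, 0.8]  # stand-in for an inflamed or stressed state
target_state = [0.2, 0.6, 0.3]   # stand-in for a homeostatic state

# Hypothetical perturbations, each assumed to shift the state additively.
perturbations = {
    "perturb_A": [-0.4, 0.0, -0.1],
    "perturb_B": [0.0, 0.5, 0.0],
    "perturb_C": [-0.3, 0.0, -0.4],
    "perturb_D": [0.2, -0.2, 0.1],
}


def distance(a, b):
    """Euclidean distance between two state vectors."""
    return math.dist(a, b)


def apply_shift(state, shift):
    """Return the state after applying one additive perturbation."""
    return [s + d for s, d in zip(state, shift)]


def plan_sequence(state, target, options, max_steps=3):
    """Greedily pick the perturbation that most reduces distance to the target."""
    plan = []
    remaining = dict(options)
    for _ in range(max_steps):
        if not remaining:
            break
        best = min(
            remaining,
            key=lambda name: distance(apply_shift(state, remaining[name]), target),
        )
        new_state = apply_shift(state, remaining[best])
        if distance(new_state, target) >= distance(state, target):
            break  # no remaining perturbation moves the cell closer
        plan.append(best)
        state = new_state
        del remaining[best]
    return plan, state


plan, final_state = plan_sequence(current_state, target_state, perturbations)
print("Suggested order:", plan)
print("Remaining distance to target:", distance(final_state, target_state))
```

Greedy search is only one possible strategy; it is used here because it keeps the sketch short, not because the episode describes it.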

Practical Implementation:

  • Co-pilot for Biologists: Helps wet lab researchers decide which experiments to run
  • Experimental Validation: Models make predictions → lab tests confirm → improved models
  • Combinatorial Approach: Instead of single-target drugs, design purposeful multi-target interventions
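The predict, test, improve cycle can be illustrated with a minimal loop. Everything below is a hypothetical stand-in: `run_wet_lab_experiment` simulates a lab measurement with noise, the effect values are invented, and "improving the model" is reduced to overwriting a stored prediction with the measured value.

```python
import random

random.seed(0)  # fixed seed so the toy run is repeatable

# Hypothetical "true" effects, unknown to the model; invented for illustration.
TRUE_EFFECT = {"p1": 0.2, "p2": 0.9, "p3": 0.5, "p4": 0.1}


def run_wet_lab_experiment(name):
    """Stand-in for a physical experiment: returns a noisy measurement."""
    return TRUE_EFFECT[name] + random.gauss(0, 0.05)


def lab_in_the_loop(initial_predictions, rounds=3):
    """Each round: test the top-ranked untested candidate, then update the model."""
    predictions = dict(initial_predictions)
    tested = []
    for _ in range(rounds):
        untested = [name for name in predictions if name not in tested]
        if not untested:
            break
        candidate = max(untested, key=predictions.get)  # model proposes
        measurement = run_wet_lab_experiment(candidate)  # lab tests
        predictions[candidate] = measurement             # model is corrected
        tested.append(candidate)
    return predictions, tested


def total_error(predictions):
    """Sum of absolute gaps between predictions and the hidden true effects."""
    return sum(abs(predictions[name] - TRUE_EFFECT[name]) for name in TRUE_EFFECT)


initial = {"p1": 0.8, "p2": 0.6, "p3": 0.3, "p4": 0.4}
updated, tested = lab_in_the_loop(initial)
print("Tested in order:", tested)
print("Error before:", total_error(initial), "after:", total_error(updated))
```

In this toy run the model's first choice turns out to be wrong, and the measurement corrects it, which is the point of keeping the lab in the loop.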

The AlphaFold Comparison:

Arc aims to achieve 90% accuracy in predicting cellular perturbations, similar to AlphaFold's success with protein folding. Currently, the field is somewhere between "GPT-1 and GPT-2" capabilities - early stage but showing promising scaling potential.

Timestamp: [10:21-15:25]

📊 How does Arc Institute use RNA data to understand protein function at scale?

Scaling Laws in Biological Data & Multi-Modal Integration

Arc Institute leverages massive RNA datasets as a scalable proxy for understanding protein-level biology, despite the inherent limitations of transcriptional data.

The RNA-Protein Challenge:

  • Current Limitation: RNA expression doesn't directly reflect protein function
  • Technology Gap: Proteomic measurement technologies aren't as scalable as transcriptomic technologies
  • Single Cell Resolution: RNA sequencing is more advanced than protein measurement at cellular resolution

The Scaling Solution:

  1. Mirror Representation: RNA serves as a lower-resolution mirror of protein activity
  2. Massive Data Scale: Large volumes of RNA data can reveal protein-level patterns through statistical power
  3. Signal Integration: Protein signaling eventually reflects in transcriptional states
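One way to read the "statistical power" argument is the standard result that averaging n independent noisy measurements shrinks the noise of the average by roughly 1/sqrt(n). The simulation below is a sketch under that reading only; the signal value, the noise level, and the Gaussian noise model are hypothetical and do not come from the episode.

```python
import random
import statistics

random.seed(42)  # fixed seed so the toy run is repeatable

TRUE_SIGNAL = 1.0  # hypothetical underlying protein-level activity
NOISE_SD = 2.0     # hypothetical per-cell noise in the RNA readout


def mean_of_noisy_cells(n_cells):
    """Average a noisy RNA-style readout over n_cells simulated cells."""
    readouts = [random.gauss(TRUE_SIGNAL, NOISE_SD) for _ in range(n_cells)]
    return statistics.fmean(readouts)


def spread_of_estimates(n_cells, n_repeats=200):
    """Standard deviation of the averaged estimate across repeated samplings."""
    estimates = [mean_of_noisy_cells(n_cells) for _ in range(n_repeats)]
    return statistics.stdev(estimates)


for n in (10, 100, 1000):
    # Theory predicts a spread of about NOISE_SD / sqrt(n).
    print(n, round(spread_of_estimates(n), 3), round(NOISE_SD / n ** 0.5, 3))
```

With a single cell the signal is buried in noise; with a thousand cells the averaged estimate is far tighter, which is the sense in which a low-resolution proxy can still be informative at scale.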

Multi-Modal Data Strategy:

  • Layer Additional Information: Add protein-level data on top of RNA information where possible
  • Spatial Tokens: Incorporate spatial information for cellular context
  • Temporal Dynamics: Include time-series data for dynamic processes
  • Metabolic Information: Integrate metabolic data for comprehensive cellular understanding

Three-Tier Development Approach:

  1. Invention - Novel technologies that need to be created
  2. Engineering - Existing technologies that need optimization
  3. Scaling - Technologies ready for large-scale implementation

Current Focus:

Arc bets on what can scale today (single-cell transcriptional information) while developing future capabilities, requiring both research institute capabilities and engineering expertise.

Timestamp: [8:02-10:15]

🎯 What makes Arc Institute's approach to AI-enabled drug discovery different?

From Accidental Polypharmacology to Purposeful Combinatorial Design

Arc Institute is moving beyond traditional single-target drug discovery toward computationally-designed combination therapies for complex diseases.

Traditional Drug Discovery Evolution:

  • Natural Products Era: Boiling leaves and extracting compounds
  • Antibody Development: Injecting proteins into animals and harvesting antibodies
  • Modern Computational: Zero-shot binder design and specific probe development

The Combinatorial Advantage:

  1. Complex Disease Reality: Most diseases don't have single causes
  2. Sequential Perturbations: Models suggest specific sequences - "these three changes first, then these two changes, then these six changes"
  3. Purposeful Design: Moving from accidental multi-target effects to intentional combinatorial interventions
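A quick count shows why combinatorial interventions are hard to explore by brute force and why model guidance is attractive. The sketch assumes a pool of roughly 20,000 candidate targets (about the number of human protein-coding genes); that pool size is an assumption for illustration, not a figure from the episode.

```python
import math

# Rough count of human protein-coding genes, used as a hypothetical target pool.
N_TARGETS = 20_000


def unordered_combinations(n_targets, k):
    """Number of ways to choose k targets when order does not matter."""
    return math.comb(n_targets, k)


def ordered_sequences(n_targets, k):
    """Number of ways to choose k targets when the order of perturbation matters."""
    return math.perm(n_targets, k)


for k in (1, 2, 3):
    print(
        f"k={k}: {unordered_combinations(N_TARGETS, k):,} combinations, "
        f"{ordered_sequences(N_TARGETS, k):,} ordered sequences"
    )
```

Even three-target combinations exceed a trillion possibilities under this assumption, far more than any wet lab could test exhaustively.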

Practical Implementation Strategy:

  • Lab-in-the-Loop: Experimental validation of model predictions
  • Co-pilot Approach: Helping biologists decide which 12 experiments to run in 12 different conditions
  • In Silico Target ID: Computational identification of new drug targets and compositions

The Vertical Integration Vision:

Arc aims to create a new type of AI-enabled pharmaceutical company that can:

  • Identify novel drug targets computationally
  • Design optimal drug combinations
  • Validate predictions experimentally
  • Scale successful interventions

Current Challenge:

Many AI pharma companies have compelling pitches that precede actual fundamental research breakthroughs. Arc focuses on building the underlying research capabilities first.

Timestamp: [11:45-14:46]

💎 Summary from [8:02-15:59]

Essential Insights:

  1. Virtual Cells as AlphaFold for Biology - Arc Institute is developing computational models that aim to predict cellular responses to interventions with roughly 90% accuracy, similar to AlphaFold's protein folding breakthrough
  2. RNA Data Scaling Strategy - Using massive transcriptional datasets as scalable proxies for protein function, leveraging statistical power to overcome individual measurement limitations
  3. Combinatorial Drug Design - Moving from accidental multi-target drugs to purposeful combinatorial interventions designed computationally for complex diseases

Actionable Insights:

  • Arc's three-tier approach (invention, engineering, scaling) provides a framework for biotechnology development prioritization
  • Lab-in-the-loop validation ensures AI models remain practically useful for wet lab biologists
  • The field is currently at "GPT-1 to GPT-2" stage for virtual cells, indicating early but promising capabilities with clear scaling potential

Timestamp: [8:02-15:59]

📚 References from [8:02-15:59]

People Mentioned:

  • Brian Hie - Collaborator at Arc Institute who developed the Evo DNA foundation models

Companies & Products:

  • Arc Institute - Research institute developing virtual cell technology and AI-enabled drug discovery
  • AlphaFold - DeepMind's protein folding prediction system, used as benchmark for virtual cell aspirations

Technologies & Tools:

  • Single Cell Transcriptomics - Scalable RNA measurement technology at cellular resolution
  • Proteomics - Protein measurement technologies, currently less scalable than transcriptomics
  • Evo DNA Foundation Models - Arc Institute's genome generation models developed with Brian Hie

Concepts & Frameworks:

  • Virtual Cells - Computational models that simulate cellular behavior and predict responses to perturbations
  • Perturbation Prediction - Methodology for predicting cellular state changes in response to interventions
  • Polypharmacology - Traditional approach using drugs with multiple targets, often by accident
  • Cell State Manifold - Conceptual representation of all possible cellular types and states
  • Lab-in-the-Loop - Experimental validation approach combining model predictions with wet lab testing
  • GPT-1 to GPT-2 Stage - Current capability level of virtual cell technology compared to language model progression

Timestamp: [8:02-15:59]

🧬 What would a GPT-3 moment look like for virtual cell biology?

Breakthrough Capabilities and Public Recognition

Patrick Hsu describes what a transformative moment in virtual cell biology would look like - similar to how GPT-3 altered public perception of AI capabilities and inspired a new generation of talent to rush into the field.

Key Biological Benchmarks:

  1. Yamanaka Factor Prediction - The model should predict that four specific factors can reprogram fibroblasts into stem-like states, essentially rediscovering the 2012 Nobel Prize-winning breakthrough
  2. Differentiation Factor Discovery - From stem cells, predict factors like neurogenin 2, ASCL1, and MyoD that turn cells into neurons or muscle cells
  3. Drug Mechanism Recapitulation - Predict FDA-approved drug mechanisms, like HER2 inhibition in breast cancer cell states

Advanced Predictive Capabilities:

  • Cancer Progression: Predict which clones will be more metastatic or resistant, leading to minimal residual disease
  • Therapeutic Responses: Accurately model how different cell states respond to various treatments
  • Developmental Biology: Recreate classic examples from textbooks through computational prediction

Evolution Beyond Current Models:

Current models focus on quantitative metrics like mean absolute error over differential gene expression - essentially ML benchmarks. The breakthrough moment will involve sophisticated biological evaluations that could be explained to "an old professor who has never touched a terminal in their life."
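For readers unfamiliar with this kind of benchmark, the sketch below shows one plausible way to compute a mean absolute error over differential gene expression. The episode does not specify the exact metric definition, so the log2 fold-change formulation, the pseudocount, and the four-gene expression values are assumptions made for illustration.

```python
import math


def log2_fold_change(perturbed, control, pseudocount=1.0):
    """Per-gene log2 fold change; the pseudocount avoids division by zero."""
    return [
        math.log2((p + pseudocount) / (c + pseudocount))
        for p, c in zip(perturbed, control)
    ]


def mean_absolute_error(predicted, observed):
    """Average absolute gap between predicted and observed values."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Hypothetical expression counts for four genes.
control = [100.0, 50.0, 10.0, 0.0]
observed_perturbed = [200.0, 25.0, 10.0, 3.0]   # what the lab measured
predicted_perturbed = [180.0, 30.0, 12.0, 1.0]  # what a model predicted

observed_lfc = log2_fold_change(observed_perturbed, control)
predicted_lfc = log2_fold_change(predicted_perturbed, control)
mae = mean_absolute_error(predicted_lfc, observed_lfc)
print("MAE over log2 fold changes:", mae)
```

A single number like this is easy to track during model training, but it says nothing on its own about whether the model recovers known biology, which is the gap the biological benchmarks above are meant to close.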

Timestamp: [16:42-19:16]

📖 Are biology textbooks actually ground truth for AI models?

The Complexity Behind Simplified Knowledge

When discussing using textbooks as training data for virtual cell models, Patrick Hsu reveals an important nuance about biological knowledge representation.

Textbook Limitations:

  • Compressed Information: Textbooks represent simplified, two-dimensional diagrams of complex biological systems
  • Classic Oversimplification: Cell signaling diagrams showing "A signals to B, which inhibits C" miss the multidimensional reality
  • Reliable but Incomplete: They represent the corpus of reliable knowledge but contain countless exceptions

The Discovery Process:

  • Exception Finding: Part of scientific discovery involves finding new exceptions to established rules
  • Knowledge Evolution: Current understanding is constantly being refined and expanded
  • Model Training Implications: AI models trained on textbooks will inherit both the knowledge and the limitations

Practical Applications:

Even without perfect virtual cell models, current AI systems like ChatGPT or Claude can already provide informed opinions on complex topics like receptor tyrosine kinase signaling, demonstrating the value of existing biological knowledge as training data.

Timestamp: [19:22-19:57]

🎯 Why focus on virtual cells instead of modeling entire human bodies?

Building Complexity from Fundamental Units

Patrick Hsu addresses criticism about the "virtual cells" terminology and explains the strategic approach to biological modeling.

Terminology and Scope:

  • "Virtual Cells" vs "Digital Twins": While some find "virtual cells" too media-friendly, it's actually more scoped and rigorous than modeling digital twins or avatars at higher abstraction levels
  • Clear Goal Definition: The terminology describes the ambition - eventually predicting drug toxicity, aging, and complex biological responses

Strategic Approach:

  1. Start with Fundamentals: Focus on individual cells as the fundamental unit of biological computation
  2. Layer Complexity Gradually: Build understanding systematically rather than attempting full-body modeling immediately
  3. Proven Methodology: Similar to early AI development that started with basic tasks like language translation before achieving today's ambitious scope

Long-term Vision:

  • Environmental Predictions: Model how liver cells become cirrhotic when repeatedly challenged with ethanol
  • Chemical Perturbations: Predict responses to various chemical and environmental challenges
  • Scalable Foundation: Use cell-level understanding to build toward more complex biological systems

Practical Rationale:

Why worry about modeling entire bodies when we can't yet model individual cells effectively? The approach mirrors successful AI development - start with manageable, verifiable components and build toward superintelligence over time.

Timestamp: [20:02-22:08]

💼 How will AI innovations transform biotech business models?

From Software Sales to R&D Budget Competition

The discussion reveals how biotech startups are evolving their business strategies as AI capabilities mature in the life sciences sector.

Business Model Evolution:

  1. Initial Approach: Biotech startups tried selling software to pharma companies
  2. Reality Check: Discovered they were competing for small SaaS budgets rather than substantial R&D investments
  3. Strategic Pivot: Now positioning AI agents to compete for R&D budgets and potentially replace headcount

Current Industry Narrative:

  • Biological Agents: Companies claim their AI tools will compete for R&D budgets similar to AI agents across other verticals
  • Headcount Replacement: Positioning AI as a substitute for traditional research personnel
  • Success Dependency: Effectiveness depends on whether these tools meaningfully improve drug development

Virtual Cells Value Proposition:

  • Dual Purpose: Provides both fundamental mechanistic insights for discovery and industrial utility
  • Industrial Applications: Could significantly improve drug development processes if successful
  • Long-term Potential: Addresses core industry challenges beyond just software efficiency

Industry Challenge Context:

With 90% of drugs failing in clinical trials, the fundamental problems are:

  • Wrong Targets: Targeting incorrect biological pathways
  • Ineffective Compounds: Drug compositions that don't achieve desired effects
  • Unclear Attribution: Difficulty determining which factor causes each failure

Even with 90% accurate virtual cells, the industry will still need to navigate these complex challenges over time.
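A small calculation illustrates both the attribution problem and why better target selection alone does not remove risk. It assumes, purely for illustration, that a program succeeds only if the target is right and the compound is right, and that the two are independent; the probabilities are hypothetical, and treating "90% accurate" as a 0.9 chance of picking the right target is also an assumption.

```python
def overall_success(p_right_target, p_right_drug):
    """Success probability if both conditions must hold and are independent."""
    return p_right_target * p_right_drug


# Two hypothetical worlds that both produce the same ~10% success rate.
mostly_drug_problem = overall_success(p_right_target=0.5, p_right_drug=0.2)
mostly_target_problem = overall_success(p_right_target=0.2, p_right_drug=0.5)

# Hypothetical improvement: target selection rises to 0.9, drug design unchanged.
better_targets_only = overall_success(p_right_target=0.9, p_right_drug=0.2)

print("Mostly a drug problem:  ", mostly_drug_problem)
print("Mostly a target problem:", mostly_target_problem)
print("Better targets only:    ", better_targets_only)
```

The first two scenarios are indistinguishable from the headline failure rate alone, and in the third scenario most programs still fail because the compound side has not improved.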

Timestamp: [22:14-23:57]

💎 Summary from [16:05-23:57]

Essential Insights:

  1. GPT-3 Moment for Biology - A breakthrough will involve predicting Nobel Prize-winning discoveries like Yamanaka factors and FDA drug mechanisms, moving beyond current ML benchmarks to biological evaluations
  2. Textbook Limitations - Biology textbooks provide compressed, simplified representations of complex systems, serving as reliable but incomplete ground truth for AI training
  3. Strategic Cell Focus - Starting with virtual cells as fundamental biological units is more rigorous than attempting full-body modeling, following proven AI development patterns

Actionable Insights:

  • Virtual cell models should be evaluated on canonical biological discoveries rather than just quantitative metrics
  • Business models in biotech AI are shifting from SaaS sales to competing for R&D budgets with potential headcount replacement
  • The 90% clinical trial failure rate stems from wrong targets and ineffective compounds, requiring systematic approaches to address both issues

Timestamp: [16:05-23:57]

📚 References from [16:05-23:57]

People Mentioned:

  • Shinya Yamanaka - Nobel Prize winner (2012) for discovering the factors that reprogram fibroblasts into induced pluripotent stem cells

Companies & Products:

  • ChatGPT - AI system capable of providing informed opinions on complex biological topics like receptor tyrosine kinase signaling
  • Claude - AI assistant that can discuss biological mechanisms and cellular processes

Technologies & Tools:

  • iPSC (Induced Pluripotent Stem Cells) - Stem cells artificially derived from non-pluripotent cells, mentioned as model input for virtual cell predictions
  • HER2 Inhibition - Targeted therapy approach for breast cancer treatment, used as example of drug mechanism prediction

Concepts & Frameworks:

  • Yamanaka Factors - Four transcription factors (Oct4, Sox2, Klf4, c-Myc) that can reprogram differentiated cells into pluripotent stem cells
  • Neurogenin 2, ASCL1, MyoD - Differentiation factors that direct stem cells toward specific cell types (neurons, muscle cells)
  • Receptor Tyrosine Kinase Signaling - Cell signaling mechanism involving protein phosphorylation, used as example of complex biological processes
  • Digital Twins/Digital Avatars - Higher-level biological modeling concepts that virtual cells aim to be more rigorous than
  • Minimal Residual Disease - Small numbers of cancer cells that remain after treatment and can lead to relapse

Timestamp: [16:05-23:57]

🎯 What are the main challenges slowing down drug discovery today?

Complex Biology and Research Bottlenecks

The Russian Nesting Doll Problem:

  1. Understanding complexity - Biology involves multiple layers of interconnected systems
  2. Perturbation challenges - Difficulty in precisely targeting specific biological processes
  3. Safety considerations - Ensuring treatments don't cause unintended harmful effects

Current Drug Development Limitations:

  • Tissue-specific targeting - Need to target GPCRs in heart tissue only, not other tissues
  • Novel chemical biology - Require new approaches for tissue or cell-type specific drug delivery
  • Limited drug matter - Current pharmaceutical tools can't achieve the precision needed

Remarkable Progress Despite Challenges:

  • Single cell genomics evolution - From 20-40 cells in early 2010s papers to generating a billion perturbed single cells at Arc Institute
  • Technology acceleration - Massive scale improvements following Moore's law principles
  • Multiple breakthrough areas - Human genetics, CRISPR gene editing, and other revolutionary tools

Timestamp: [24:05-25:27]

🚧 Why do 90% of drugs fail in clinical trials?

The Two Primary Failure Modes

Root Causes of Drug Failure:

  1. Wrong target selection - Going after the incorrect biological mechanism
  2. Wrong drug design - Making the wrong therapeutic to address the right target
  3. Resulting poor batting average - With either failure mode possible, current hit rates are extremely low across the industry

The Clinical Trial Bottleneck:

  • Necessary but slow process - Must prove safety and efficacy before human use
  • Risk mitigation requirements - Extensive de-risking needed before clinical trials
  • Manufacturing challenges - Must be able to produce drugs at scale

Potential Solutions:

  • Better target identification - AI and computational models to improve selection
  • Enhanced drug design - More precise therapeutic development
  • Improved hit rates - Technologies that increase success probability from discovery phase

Timestamp: [25:56-26:56]

⏱️ What makes clinical trials so difficult to accelerate?

Inherent Timeline Constraints

Natural Timeline Requirements:

  • Cancer survival studies - Must demonstrate actual survival benefits over time
  • Longevity drugs - By definition require lifetime-length trials
  • Safety monitoring - Extended observation periods for adverse effects

Current Improvement Strategies:

  1. Enrollment optimization - Faster patient recruitment methods
  2. Better trial design - Technology-enhanced clinical trial structures
  3. Shorter study periods - Where scientifically appropriate

Regulatory and Safety Framework:

  • Multi-stakeholder coordination - Companies, scientists, and regulatory agencies
  • Safety-first approach - Primary focus on safe and effective treatments
  • Necessary bottlenecks - Some delays are essential for patient protection

Timestamp: [27:35-28:11]

💰 How can the biotech industry overcome capital intensity challenges?

Three-Pronged Solution Strategy

Key Industry Fixes:

  1. Reduce capital intensity - Lower costs through better technology and higher success rates
  2. Compress timelines - Accelerate early discovery while maintaining safety standards
  3. Increase effect size - Develop better drugs with more obvious, faster results

Current Capital Challenges:

  • High upfront costs - Similar to AI model training expenses
  • Poor success-to-investment ratio - Most companies don't see valuation step-ups despite progress
  • Early investor burden - High capital intensity with delayed returns

Investment Impact:

  • Value creation timing - Success not reflected in valuations at appropriate milestones
  • Early-stage risk - Investors bear capital intensity without proportional rewards
  • Industry comparison - Unlike other sectors where early investment shows clear value inflection

Timestamp: [28:11-30:08]

📈 What does the GLP-1 success story teach us about drug development?

Trillion-Dollar Market Impact Lessons

Unprecedented Value Creation:

  • Market cap addition - Over $1 trillion added to Eli Lilly and Novo Nordisk combined
  • Industry comparison - More than the total market cap of all biotech companies started over the last 40 years
  • Large patient population impact - Demonstrates value of addressing widespread conditions

Industry Risk Management Problems:

  • Conservative target selection - 10% clinical trial success rate leads to "circling the wagons"
  • Well-established mechanisms - Focus on proven biology with small patient populations
  • Low expected value - Safe bets often yield limited returns
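
The expected-value argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: the 10% success rate is the figure quoted in the episode, while the cost and payoff numbers are hypothetical placeholders in arbitrary units, not data from the conversation.

```python
def attempts_per_approval(success_rate: float) -> float:
    """Expected number of programs per approval (mean of a geometric distribution)."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return 1.0 / success_rate


def expected_value(success_rate: float, payoff: float, cost: float) -> float:
    """Expected net value of one program: p * payoff - cost."""
    return success_rate * payoff - cost


# 10% is the success rate quoted above; every other number is hypothetical.
SUCCESS_RATE = 0.10
COST_PER_PROGRAM = 1.0  # arbitrary units

# A "safe" small-market bet versus an ambitious large-population bet.
safe_bet = expected_value(SUCCESS_RATE, payoff=12.0, cost=COST_PER_PROGRAM)
ambitious_bet = expected_value(SUCCESS_RATE, payoff=100.0, cost=COST_PER_PROGRAM)

print("programs per approval:", attempts_per_approval(SUCCESS_RATE))
print("safe bet EV:", round(safe_bet, 2))
print("ambitious bet EV:", round(ambitious_bet, 2))
```

At the same odds, payoff size dominates the result, which is one way to read the point about large patient populations.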

Cultural Transformation:

  • Increased ambition - Both investors and drug developers raising their sights
  • Large population focus - Shift toward addressing conditions affecting millions
  • Positive trend momentum - Industry demonstration of massive value creation potential

Timestamp: [30:13-31:58]

💎 Summary from [24:05-31:58]

Essential Insights:

  1. Biology's complexity challenge - Drug discovery faces a "Russian nesting doll" of interconnected biological systems requiring novel approaches for tissue-specific targeting
  2. Clinical trial bottlenecks - 90% of drugs fail due to wrong targets or wrong drug design, with inherent timeline constraints that can't be easily accelerated
  3. Capital intensity crisis - High upfront costs similar to AI training, but without proportional valuation rewards for early investors

Actionable Insights:

  • Technology scaling opportunity - Arc Institute's progression from 20-40 cells to billion-scale perturbation studies suggests Moore's Law-style scaling potential in biology
  • Industry transformation needed - Three-pronged approach: reduce capital intensity, compress timelines, and increase drug effect sizes
  • GLP-1 paradigm shift - Trillion-dollar value creation demonstrates the power of targeting large patient populations versus safe, small-market approaches

Timestamp: [24:05-31:58]

📚 References from [24:05-31:58]

People Mentioned:

  • Patrick Hsu - Co-founder of Arc Institute, former PhD researcher at Broad Institute during development of single cell genomics and CRISPR

Companies & Products:

  • Arc Institute - Research organization generating billion-scale perturbed single cells for biological research
  • Broad Institute - Research institution where Patrick Hsu completed PhD during heyday of single cell genomics development
  • Eli Lilly - Pharmaceutical company that gained significant market cap from GLP-1 drug development
  • Novo Nordisk - Danish pharmaceutical company, major developer of GLP-1 treatments with significant market cap gains

Technologies & Tools:

  • Single cell genomics - Technology for analyzing individual cells, evolved from 20-40 cells in early 2010s to billion-scale studies
  • CRISPR gene editing - Revolutionary gene editing technology developed during Patrick's PhD era
  • GLP-1 drugs - Breakthrough diabetes and weight loss medications that created unprecedented pharmaceutical value

Concepts & Frameworks:

  • Russian nesting doll complexity - Metaphor for biology's interconnected layers of understanding, perturbation, and safety challenges
  • Moore's Law in biology - Exponential scaling from small cell studies to billion-cell perturbation experiments
  • GPCR targeting - G-protein coupled receptor targeting with tissue-specific precision requirements

Timestamp: [24:05-31:58]

💊 What makes GLP-1 drugs like Ozempic so valuable to society?

Breakthrough Drug Success Story

GLP-1 drugs represent a remarkable achievement in pharmaceutical innovation, demonstrating how tackling endemic social problems creates extraordinary value for both society and companies.

Key Success Factors:

  1. Four-decade development timeline - These weren't overnight successes but required sustained investment and research
  2. Endemic problem solving - Successfully addressed diabetes management and obesity, major societal health challenges
  3. Justified value transfer - Companies like Eli Lilly earned significant returns by solving very challenging problems for society

Impact Beyond Science:

  • Diabetes management breakthrough - Cracked a previously difficult-to-manage condition
  • Obesity treatment advancement - Provided new solutions for widespread health issues
  • Social problem resolution - Addressed challenges that extend far beyond just scientific curiosity

Industry Lessons:

  • High-risk, high-reward approach - Moving beyond low-hanging fruit to tackle ambitious indications
  • Large population impact - Targeting diseases that affect significant numbers of patients
  • Value justification principle - The juice needs to be worth the squeeze for major pharmaceutical investments

Timestamp: [32:05-32:54]

🧬 How are genetic medicines tackling previously impossible diseases?

Revolutionary Treatment Approaches

Genetic medicines represent a paradigm shift in addressing diseases that were previously untreatable, going after some of the hardest problems in medicine through DNA editing and genetic interventions.

Breakthrough Capabilities:

  1. Previously impossible treatments - Addressing conditions that literally couldn't be treated before genetic editing
  2. DNA-level interventions - Direct modification of genetic material to treat root causes
  3. Hardest problem targeting - Taking on diseases that conventional medicine couldn't touch

Industry Transformation:

  • Inspiring innovation - Demonstrating what's possible when tackling ambitious medical challenges
  • Capital formation requirements - Need for substantial investment to support these high-risk, high-reward approaches
  • Current funding challenges - Difficulty securing necessary capital due to industry-wide issues

Strategic Implications:

  • Moving beyond incremental improvements - Shifting from safe, low-risk targets to transformative treatments
  • Long-term vision required - Understanding that breakthrough medicines require sustained commitment
  • Fundamental industry elements - Capital formation and risk tolerance must align to support innovation

Timestamp: [32:48-33:18]

🔬 What technological breakthrough will transform drug discovery in 15 years?

Future of Pharmaceutical Innovation

The next major breakthrough in drug discovery will likely emerge from combining improved target identification with advanced medicine design capabilities across multiple therapeutic modalities.

Multi-Modal Drug Design Revolution:

  1. Enhanced small molecules - Getting better at designing smarter molecules that function in new ways
  2. Advanced biologics - Improved protein design with help from tools like AlphaFold for protein folding understanding
  3. Complex modalities - Better design of gene therapies and gene editors for previously untreatable conditions

Integrated Approach Benefits:

  • Better target understanding - Improved ability to identify what biological targets to pursue
  • Virtual cell models - Using computational models to understand what diseases to tackle
  • Large effect sizes - Developing drugs with significant impact on difficult diseases affecting many patients

Societal Impact Potential:

Disease Categories Being Addressed:

  • Metabolic disorders - Obesity and cardiometabolic diseases
  • Neurodegenerative diseases - Promising developments in previously intractable conditions
  • Cancer treatment - Transforming cancers from death sentences to chronic conditions
  • Widespread conditions - Tackling diseases that affect all of society

Industry Value Creation:

  • Extraordinary societal value - Industry positioned to be highly valued by society and markets
  • Delivery requirement - Success depends on actually delivering on these promises
  • Cumulative progress - Value will accrete over time as difficult diseases are tackled one by one

Timestamp: [33:30-35:47]

⚗️ Why is physical testing still the biggest bottleneck in drug development?

The Reality of Drug Development Constraints

Even with AI models capable of designing trillions of potential drug compounds in silico, the physical testing process remains the critical limiting factor in pharmaceutical development.

The Testing Pipeline Challenge:

  1. Physical manufacturing requirement - AI-designed compounds must still be physically created
  2. Animal testing necessity - Molecules must be tested in mice, then larger animals
  3. Human trials imperative - Final validation requires testing in people, which cannot be accelerated significantly

Regulatory and Cultural Bottlenecks:

Engineering vs. Legal Mindsets:

  • China as engineering state - Political leaders with engineering degrees focused on building infrastructure
  • US as legal-focused nation - 10 of first 13 presidents practiced law; all Democratic presidential/VP candidates from 1980-2020 attended law school
  • Regulatory implications - Legal-minded approach creates bottlenecks in FDA and regulatory processes

Current Industry Adaptations:

  • Overseas Phase 1 trials - Running early trials internationally to build data packages
  • Domestic Phase 2 focus - Bringing efficacy trials back to US market
  • Insufficient solutions - Current workarounds don't fully address the fundamental bottleneck

The Compression Challenge:

  • Time-intensive process - "Mice, then mutts, then monkeys, then man" sequence is difficult to compress
  • Journey optimization - Making sure the long development path leads to successful outcomes as often as possible
  • Desperate industry need - Finding ways to increase success rates given the unavoidable time investment

Timestamp: [35:55-38:08]

🤖 Where is AI overhyped versus showing real promise in drug discovery?

AI Reality Check in Pharmaceutical Development

The AI landscape in drug discovery shows clear distinctions between areas of genuine progress, promising developments, and overhyped applications that haven't delivered on their promises.

Areas of Overhype:

Toxicity Prediction Models:

  • The promise - AI models that can predict whether a molecule will be toxic
  • The reality - Limited success in accurately predicting toxicity from molecular structure alone
  • Why it's hyped - Oversimplified approach to complex biological interactions

Multimodal Biological Models:

  • Vague terminology - "Whatever that means" indicates unclear definitions and applications
  • Multiple layer complexity - Attempting to integrate molecular, spatial, and other biological layers
  • Unclear value proposition - Lack of concrete evidence for practical applications

Areas of Real Progress (Heft):

Protein Design and Binding:

  • Concrete applications - Actual improvements in designing proteins and predicting binding
  • Measurable outcomes - Clear metrics for success in protein engineering
  • Building on AlphaFold - Leveraging proven protein folding breakthroughs

Pathology and Radiology AI:

  • Automating specialist work - Successfully replacing or augmenting pathologist and radiologist analysis
  • Powerful use cases - Clear value proposition in medical imaging and tissue analysis
  • Proven applications - Don't require training complex biology foundation models

Strategic Approach:

  • Avoid unnecessary complexity - Many effective applications don't need "weird biology foundation models"
  • Focus on proven methods - Building on established AI techniques rather than creating entirely new approaches
  • Clear value demonstration - Successful AI applications show obvious improvements over existing methods

Timestamp: [38:27-39:55]

💎 Summary from [32:05-39:55]

Essential Insights:

  1. GLP-1 success model - Four-decade development timeline demonstrates how tackling endemic social problems creates extraordinary value for both companies and society
  2. Genetic medicine breakthrough - DNA editing enables treatment of previously impossible diseases, representing the most ambitious and inspiring medical innovations
  3. Multi-modal future - Next breakthrough will combine better target identification with advanced design across small molecules, biologics, and gene therapies

Actionable Insights:

  • Physical testing remains the critical bottleneck even with AI-designed compounds - focus on optimizing the "mice to man" pipeline
  • Regulatory and cultural differences between engineering-focused and legal-focused approaches create development bottlenecks
  • AI shows real promise in protein design and medical imaging, but toxicity prediction and multimodal biological models remain overhyped

Timestamp: [32:05-39:55]

📚 References from [32:05-39:55]

People Mentioned:

  • Dan Wang - Author of the book "Breakneck", about US-China differences in engineering versus legal approaches to problem-solving

Companies & Products:

  • Eli Lilly - Pharmaceutical company that developed successful GLP-1 drugs, demonstrating value creation through solving endemic social problems
  • FDA - US regulatory agency, cited as an example of legal-minded bottlenecks in the drug development process

Books & Publications:

  • Breakneck - Dan Wang's book discussing US-China differences in engineering versus legal approaches to markets and problem-solving

Technologies & Tools:

  • AlphaFold - AI system that helps understand protein folding, enabling better biologics and protein design
  • GLP-1 drugs - Four-decade development success story demonstrating breakthrough pharmaceutical innovation
  • Virtual cell models - Computational models used to understand biological targets for drug development

Concepts & Frameworks:

  • Toxicity Prediction Models - Overhyped AI application attempting to predict molecular toxicity
  • Multimodal Biological Models - Hyped concept integrating multiple biological data layers
  • Pathology AI - Proven application automating pathologist and radiologist work
  • Gene Therapies and Gene Editors - Complex therapeutic modalities targeting previously untreatable genetic diseases

Timestamp: [32:05-39:55]

🤖 Why hasn't AI successfully created new drugs yet?

AI Integration in Drug Discovery

The pharmaceutical industry faces a unique challenge where everyone claims their drug is "the first AI-designed molecule," but true AI integration is still emerging. AI is becoming a native part of the entire drug development stack, similar to how we now naturally use the internet and phones in all aspects of work.

Current AI Applications in Drug Development:

  1. Computational Docking - AI can dock designed small molecules to every protein in the proteome to predict off-target binding
  2. Binding Optimization - Tune binding selectivity and affinity to predict safety and efficacy
  3. Safety Prediction - Identify potential adverse reactions before costly testing phases
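
The off-target screening idea in the list above can be sketched as a simple ranking loop. This is a minimal illustration, not how any particular docking tool works: `rank_off_targets`, the `margin` parameter, the protein names, and the lookup-table scorer are all hypothetical stand-ins for a real docking or binding-prediction model.

```python
from typing import Callable, Dict, List, Tuple


def rank_off_targets(
    molecule: str,
    intended_target: str,
    proteome: List[str],
    score: Callable[[str, str], float],
    margin: float = 1.0,
) -> List[Tuple[str, float]]:
    """Flag proteins whose predicted binding score comes within `margin`
    of the intended target's score. Higher score = stronger predicted binding.
    Results are sorted strongest first."""
    target_score = score(molecule, intended_target)
    candidates = [
        (protein, score(molecule, protein))
        for protein in proteome
        if protein != intended_target
    ]
    flagged = [(p, s) for p, s in candidates if s >= target_score - margin]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Toy stand-in for a docking model: a table of made-up scores.
TOY_SCORES: Dict[str, float] = {
    "TARGET_A": 9.0,
    "PROT_B": 8.5,
    "PROT_C": 3.0,
    "PROT_D": 8.1,
}


def toy_score(molecule: str, protein: str) -> float:
    """Hypothetical scorer; a real one would run a docking or ML model."""
    return TOY_SCORES[protein]


flagged = rank_off_targets("mol-1", "TARGET_A", list(TOY_SCORES), toy_score)
print(flagged)
```

Proteins that score close to the intended target are the ones a chemist would want to design away from when tuning selectivity.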

The Multi-Factor Challenge:

  • Design Phase - Creating the molecular structure
  • Manufacturing - Actually making the compound
  • Testing - Laboratory and clinical validation (takes hours, days, months, years)
  • Regulatory Approval - Safety and efficacy validation process

The fundamental bottleneck remains the feedback loop - testing in the lab takes real time that cannot be compressed, which is why Arc Institute focuses on virtual cell models as their initial approach to integrate these different pieces more efficiently.

Timestamp: [40:08-41:59]

🚀 What does Dario Amodei predict about AI accelerating scientific discovery?

The Parallelization Theory

Dario Amodei's essay "Machines of Loving Grace" presents a bold vision where AI could prevent many infectious diseases and potentially double human lifespans within the next decade. His core insight centers on the statistical independence of scientific discoveries.

Key Predictions:

  1. Disease Prevention - Many infectious diseases could be eliminated
  2. Lifespan Extension - Potential doubling of human lifespans
  3. Timeline - These breakthroughs could happen within the next decade

The Independence Principle:

  • Statistical Independence - Important scientific discoveries are largely independent of each other
  • Massive Parallelization - If discoveries are independent, we could run millions or billions of discovery agents simultaneously
  • Computational Compression - Turn scientific discovery into a computation problem rather than a time-dependent process
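
A rough way to see why independence matters: if tasks do not depend on each other, wall-clock time is set by how many can run at once, not by their total. The sketch below compares serial and parallel completion time for a list of independent tasks. The durations and worker counts are made-up numbers, and the first-free-worker scheduling rule is an assumption chosen for simplicity.

```python
import heapq
from typing import List


def serial_time(durations: List[float]) -> float:
    """Total time when tasks run one after another."""
    return sum(durations)


def parallel_time(durations: List[float], workers: int) -> float:
    """Wall-clock time when independent tasks are handed, in order,
    to whichever of `workers` becomes free first."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    free_at = [0.0] * workers  # time at which each worker is next free
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + duration)
    return max(free_at)


# Hypothetical experiment durations, e.g. in months.
durations = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]

print("serial:", serial_time(durations))
print("2 workers:", parallel_time(durations, workers=2))
print("8 workers:", parallel_time(durations, workers=8))
```

Even with unlimited workers, the total never drops below the longest single task, which is why the physical feedback loop stays a floor on the timeline.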

Practical Implementation:

  • Virtual Cell Models - Can simulate biological processes at scale
  • Molecular Design Models - Create and test compounds computationally
  • Integrated Systems - Layer different AI models to predict cellular responses and drug interactions
  • Reliable Chaining - If these steps can be traversed reliably and in sequence, significant timeline compression becomes possible

Timestamp: [42:05-43:51]

🧬 What happens if virtual cell models are missing crucial biological data?

The Incomplete Data Challenge

Building effective virtual cell models requires feeding them comprehensive biological data - gene expression, DNA sequences, protein interactions, and countless other factors. However, there's an almost certain reality that we're missing many of the most important elements in biology.

Current Measurement Limitations:

  • Two Primary Methods - Biology can only be studied at high throughput through imaging and sequencing
  • Missing Elements - Many important biological processes aren't captured by these methods at scale
  • Unknown Unknowns - We likely haven't discovered key biological mechanisms that affect cellular behavior

The RNA Layer Strategy:

Arc Institute focuses on RNA as a mirror for other layers of biology - using RNA expression patterns as a proxy for broader cellular states and processes that might be harder to measure directly.

Two Model Approaches:

  1. Mechanistic Models - Explain the physical "why" and "how" of biological processes
  2. Meteorological Models - Predict outcomes without explaining underlying mechanisms

The Weather Prediction Analogy:

Just as AI can predict whether it will rain next Tuesday without explaining the underlying atmospheric physics, a virtual cell model might accurately predict cellular responses without revealing the complete mechanistic understanding. AlphaFold exemplifies this approach - it predicts protein folding accurately without explaining exactly why proteins fold that way.
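
To make the "predict without explaining" idea concrete, here is a minimal sketch of a purely predictive model: a nearest-neighbour lookup that forecasts an outcome from past observations and encodes no mechanism at all. The feature names and data are invented for illustration, and this is not how weather models, AlphaFold, or Arc's virtual cell models work internally.

```python
from typing import List, Sequence, Tuple

Observation = Tuple[Sequence[float], bool]  # (features, outcome)


def predict(history: List[Observation], query: Sequence[float], k: int = 3) -> bool:
    """Majority vote among the k past observations closest to `query`.
    The model stores examples only; it has no notion of why outcomes occur."""
    def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(history, key=lambda obs: squared_distance(obs[0], query))[:k]
    votes = sum(1 for _, outcome in nearest if outcome)
    return votes * 2 > len(nearest)


# Made-up history: (humidity, pressure drop) -> did it rain?
HISTORY: List[Observation] = [
    ((0.90, 0.80), True),
    ((0.80, 0.70), True),
    ((0.85, 0.90), True),
    ((0.20, 0.10), False),
    ((0.30, 0.20), False),
    ((0.10, 0.30), False),
]

print(predict(HISTORY, (0.88, 0.75)))
print(predict(HISTORY, (0.15, 0.20)))
```

A mechanistic model would instead encode the physical process itself; the trade-off is interpretability versus how much has to be understood before any prediction is possible.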

Timestamp: [43:57-45:46]

🎯 Where is Jorge Conde focusing his AI investments beyond biotech?

Three Transformative Technology Areas

Jorge Conde's investment philosophy centers on improving the human experience within our lifetime, focusing on technologies that will fundamentally change the world we leave to our children.

Primary Investment Focus Areas:

1. Synthetic Biology

  • GLP-1 medications - Breakthrough treatments for diabetes and weight management
  • Sleep improvement technologies and therapeutics
  • Longevity research - Extending healthy human lifespan
  • These represent tangible improvements to human health and quality of life

2. Brain-Computer Interfaces

  • Breakthrough potential - Expected major advances over the coming decades
  • Direct neural interaction - Technologies that interface directly with the human brain
  • Transformative applications - Could revolutionize how humans interact with technology

3. Robotics (Industrial and Consumer)

  • Physical labor scaling - Robots that can perform and scale physical work
  • Industrial applications - Manufacturing and production automation
  • Consumer robotics - Personal assistance and household automation
  • Labor multiplication - Extending human physical capabilities

Investment Philosophy:

  • Medium-term success scenarios - Even moderate success in these areas could change the world
  • Techno-optimist vision - Addressing different types of scarcity through technology
  • Execution timing - The challenge isn't generating futuristic ideas, but executing them within practical timeframes (5-8 years)
  • Academic vs. Commercial - Many important academic discoveries remain "long before their time" for practical implementation

Timestamp: [45:52-47:58]

💎 Summary from [40:02-47:58]

Essential Insights:

  1. AI Drug Development Reality - While everyone claims to have "AI-designed" drugs, true AI integration is still emerging as a native part of the pharmaceutical stack, with testing feedback loops remaining the primary bottleneck
  2. Scientific Discovery Parallelization - Dario Amodei's vision suggests that if scientific discoveries are statistically independent, we could run millions of discovery processes simultaneously, compressing timelines dramatically
  3. Predictive vs. Mechanistic Models - Virtual cell models may succeed like weather prediction or AlphaFold - accurately predicting outcomes without fully explaining underlying mechanisms

Actionable Insights:

  • AI's current impact in drug discovery focuses on computational docking, binding optimization, and safety prediction rather than complete drug design
  • The RNA layer serves as a practical mirror for other biological processes that are harder to measure at scale
  • Investment opportunities exist in synthetic biology, brain-computer interfaces, and robotics as transformative technologies for human experience
  • Success requires executing futuristic concepts within practical 5-8 year timeframes rather than just generating long-term ideas

Timestamp: [40:02-47:58]

📚 References from [40:02-47:58]

People Mentioned:

  • Dario Amodei - CEO of Anthropic, author of "Machines of Loving Grace" essay predicting AI acceleration of scientific discovery and potential doubling of human lifespans

Publications:

  • Machines of Loving Grace - Dario Amodei's essay predicting prevention of infectious diseases and lifespan doubling through AI-accelerated scientific discovery

Technologies & Tools:

  • AlphaFold - DeepMind's AI system for protein structure prediction, used as example of predictive models that work without explaining underlying mechanisms
  • GLP-1 medications - Breakthrough treatments for diabetes and weight management, cited as example of synthetic biology success

Concepts & Frameworks:

  • Virtual Cell Models - Arc Institute's approach to integrating AI across drug discovery pipeline
  • Computational Docking - AI technique for predicting how molecules bind to proteins
  • Statistical Independence of Discoveries - Amodei's theory that scientific breakthroughs can be parallelized
  • Mechanistic vs. Meteorological Models - Two approaches to AI prediction - explaining "why" versus just predicting outcomes
  • RNA Layer as Biological Mirror - Using RNA expression patterns as proxy for broader cellular states

Timestamp: [40:02-47:58]

🎯 What Skills Do Successful Deep Tech Founders Need According to a16z?

Essential Founder Capabilities

Jorge Conde explains that successful deep tech ventures require founders to excel across three critical domains, similar to an "RPG dice roll" where people start at different base levels:

The Three Core Domains:

  1. Technical Innovation - Deep expertise in the underlying technology and scientific principles
  2. Product Intuition - Understanding how to translate complex technology into usable solutions
  3. Commercial Thinking - Ability to build sustainable business models and go-to-market strategies

Common Founder Profiles:

  • Technical Genius: Incredibly strong technically but lacks commercial instincts
  • Natural Salesperson: Excellent at selling but may lack strong product sense
  • Product Visionary: Great product intuition but needs technical or commercial support

Why This Balance Matters:

The combination of these capabilities, properly funded at the right time, creates opportunities that "literally wouldn't happen" otherwise. This unique convergence enables teams to tackle fundamental problems using new technologies in truly differentiated ways.

Timestamp: [48:04-49:52]

🚀 What Deep Tech Companies Is a16z Backing for the Future?

Jorge Conde's Investment Portfolio

Jorge Conde shares examples of companies he's excited about backing that represent "things that must happen in the world":

Current Investment Categories:

  1. Longevity Companies - NewLimit and other ventures focused on extending human healthspan
  2. Brain-Computer Interface (BCI) - Nudge and companies developing direct neural interfaces
  3. Robotics - The Bot Company and other automation-focused startups

Investment Philosophy:

  • Focus on technologies that "must happen" and "should happen" in the world
  • Look for the right people at the right time to execute on these visions
  • Compare the process to assembling a "fellowship of the ring" - finding the perfect team for an epic mission

Timing and Capital Allocation:

The key is identifying when to allocate capital to bring together the right combination of technical innovation, product sense, and commercial thinking to make breakthrough ideas possible.

Timestamp: [49:32-49:58]

🤖 How Will AI Agents Transform the Services Economy?

Real Productivity vs. Software Productivity

Jorge Conde explains why AI agents represent a fundamental shift from previous technology waves:

Key Differentiator:

  • Traditional SaaS: Improved software-based productivity
  • AI Agents: Take on actual human work, delivering real productivity gains

Current State and Trajectory:

  1. Today: Agents have many errors but are improving rapidly
  2. Near-term: Computer use agents will likely trail coding agents by about one year
  3. Future Path: Agents will progress from handling minutes of work without error to hours, then days
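
One way to see why error-free horizons lengthen as reliability improves: under the simplifying assumption (mine, not the speakers') that each step fails independently with the same probability, the chance of an error-free run decays geometrically with its length. The step counts and error rates below are illustrative only.

```python
import math


def error_free_probability(per_step_error: float, steps: int) -> float:
    """Probability that `steps` consecutive steps all succeed,
    assuming independent steps with a constant error rate."""
    return (1.0 - per_step_error) ** steps


def max_steps(per_step_error: float, target: float = 0.5) -> int:
    """Longest run whose error-free probability stays at or above `target`."""
    if not 0 < per_step_error < 1:
        raise ValueError("per_step_error must be in (0, 1)")
    if not 0 < target < 1:
        raise ValueError("target must be in (0, 1)")
    return math.floor(math.log(target) / math.log(1.0 - per_step_error))


# Hypothetical per-step error rates: each 10x improvement in reliability
# buys roughly a 10x longer error-free horizon.
for rate in (0.01, 0.001, 0.0001):
    print(rate, max_steps(rate))
```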

Economic Impact:

  • Target Market: Most of the economy is services spend, not software spend
  • Opportunity: Agents can attack the entire services economy across legal, business operations, medicine, and healthcare
  • Product Evolution: Completely different product shapes will emerge as agent capabilities expand

This represents where "real heft" will emerge because it addresses the fundamental structure of the modern economy.

Timestamp: [50:17-51:22]

🔬 What's Overhyped vs. Undervalued in AI Research Today?

The Architecture Problem and Research Opportunities

Jorge Conde breaks down where AI hype meets reality:

What's Overhyped:

  • Model Capabilities: Tremendous hype around current transformer architecture
  • Architecture Limitations: Current systems date back to 2017 transformer architecture
  • Overdue Innovation: Deep learning historically sees major breakthroughs every 8 years, making 2025 a critical inflection point

What's Undervalued:

  1. Academic Research Gold Mine: 2009-2015 "golden age" papers with under 30 citations
  2. Scaling Opportunity: Ideas that didn't work at 100M-650M parameters may succeed at 1B-70B scale
  3. Compute Cost Decline: Marginal cost reductions enable testing previously impractical ideas

Emerging Opportunities:

  • New Super Intelligence Labs: Beyond established foundation model companies
  • Research vs. Applied Focus: Established companies becoming applied AI businesses focused on enterprise products and revenue
  • Evolutionary Approaches: Companies like Sakana AI (co-founded by an "Attention Is All You Need" co-author) exploring model merging and evolutionary selection

The future lies in discovering new architectures and learning methods beyond the limitations of current reinforcement learning (RL).

Timestamp: [51:22-53:57]

πŸ† What Is Arc Institute's Virtual Cell Challenge?

Creating Biology's AlphaFold Moment

Patrick Hsu announces Arc Institute's major initiative to accelerate virtual cell development:

Competition Structure:

  • Prize Pool: $100,000 in total prizes
  • Sponsors: NVIDIA, 10X Genomics, Ultima, and other industry leaders
  • Format: Open competition accessible to anyone
  • Website: virtualcellchallenge.org

Competition Focus:

  • Primary Goal: Train perturbation prediction models
  • Assessment: Open and transparent evaluation of model capabilities
  • Timeline: Annual competitions to track progress over time
  • Target: Achieve a "ChatGPT moment" for biology
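
For readers new to the term, a perturbation prediction model takes a cell's unperturbed state and predicts its state after an intervention. The sketch below shows about the simplest possible baseline, a per-gene mean shift learned from control and perturbed cells, scored by mean absolute error. The data, function names, and metric are hypothetical and are not the challenge's actual data format or evaluation.

```python
from typing import List

Profile = List[float]  # expression of each gene, in a fixed gene order


def mean_profile(cells: List[Profile]) -> Profile:
    """Per-gene average expression across cells."""
    count = len(cells)
    return [sum(column) / count for column in zip(*cells)]


def fit_shift(control: List[Profile], perturbed: List[Profile]) -> Profile:
    """Per-gene shift: mean perturbed expression minus mean control expression."""
    control_mean = mean_profile(control)
    perturbed_mean = mean_profile(perturbed)
    return [p - c for p, c in zip(perturbed_mean, control_mean)]


def predict_perturbed(control_cell: Profile, shift: Profile) -> Profile:
    """Baseline prediction: add the learned shift to an unperturbed cell."""
    return [x + s for x, s in zip(control_cell, shift)]


def mean_abs_error(predicted: Profile, observed: Profile) -> float:
    """Average absolute per-gene difference between prediction and observation."""
    return sum(abs(a - b) for a, b in zip(predicted, observed)) / len(predicted)


# Made-up training data: two control cells and two perturbed cells, three genes.
control_cells = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
perturbed_cells = [[2.0, 2.0, 0.0], [4.0, 2.0, 2.0]]

shift = fit_shift(control_cells, perturbed_cells)
prediction = predict_perturbed([2.0, 4.0, 2.0], shift)
error = mean_abs_error(prediction, [3.5, 4.0, 1.5])
print(shift, prediction, round(error, 3))
```

A learned model is only interesting to the extent that it beats simple baselines like this one on held-out perturbations.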

Inspiration and Vision:

Just as AlphaFold emerged from the CASP (Critical Assessment of Structure Prediction) protein folding competition, this challenge aims to catalyze breakthrough virtual cell technologies.

Call to Action:

Arc Institute welcomes participation from:

  • BioML Experts: Specialists in biological machine learning
  • Engineers: From any domain with relevant technical skills
  • Anyone Interested: Open invitation to contribute to this scientific advancement

Patrick emphasizes his ultimate goal: "I just want this thing to exist in the world" - whether Arc Institute creates it or someone else does.

Timestamp: [54:07-55:07]

💎 Summary from [48:04-55:14]

Essential Insights:

  1. Founder Success Formula - Deep tech breakthroughs require founders who combine technical innovation, product intuition, and commercial thinking in the right proportions at the right time
  2. AI Agent Revolution - Unlike traditional SaaS, AI agents will transform the services economy by replacing real human productivity, progressing from minutes to hours to days of autonomous work
  3. Research Architecture Gap - Current AI hype focuses on model capabilities, but the real opportunity lies in developing new architectures beyond 2017's transformer model, especially by scaling previously unexplored academic ideas

Actionable Insights:

  • Deep tech investors should seek teams that balance technical depth with product and commercial acumen
  • The services economy represents a massive opportunity for AI agents as they become more reliable and capable
  • Researchers should revisit 2009-2015 academic papers with low citations and test them at modern scale parameters
  • The Virtual Cell Challenge offers a concrete way for engineers and bioML experts to contribute to breakthrough biological modeling

Timestamp: [48:04-55:14]

📚 References from [48:04-55:14]

People Mentioned:

  • Llion Jones - Co-author of "Attention Is All You Need" and co-founder of Sakana AI, mentioned for innovative work on model merging and evolutionary selection

Companies & Products:

  • NewLimit - Longevity company backed by Jorge Conde focused on extending human healthspan
  • Nudge - Brain-computer interface company in Jorge Conde's investment portfolio
  • The Bot Company - Robotics startup backed by a16z for automation solutions
  • Sakana AI - AI research company co-founded by a transformer architecture pioneer, focusing on model merging and evolutionary approaches
  • NVIDIA - Sponsor of Arc Institute's Virtual Cell Challenge
  • 10X Genomics - Genomics technology company sponsoring the Virtual Cell Challenge
  • Ultima Genomics - DNA sequencing company supporting Arc Institute's competition

Technologies & Tools:

  • virtualcellchallenge.org - Arc Institute's open competition platform for training perturbation prediction models with $100,000 in prizes

Concepts & Frameworks:

  • CASP Competition - Critical Assessment of Structure Prediction, the protein folding competition that led to AlphaFold's breakthrough
  • Transformer Architecture - The 2017 "Attention Is All You Need" model architecture that current AI systems are built on
  • Perturbation Prediction Models - Machine learning models that predict how biological systems respond to interventions
  • Model Merging - Evolutionary approach to combining different AI models for improved performance

Timestamp: [48:04-55:14]