Third Rock x Agios

Studying investment decisions

Axial partners with great founders and inventors. We invest in early-stage life sciences companies such as Appia Bio, Seranova Bio, Delix Therapeutics, and Simcha Therapeutics, among others, often when they are no more than an idea. We are fanatical about helping the rare inventor who is compelled to build their own enduring business. If you or someone you know has a great idea or company in life sciences, Axial would be excited to get to know you and possibly invest in your vision and company. We are excited to be in business with you - email us at info@axialvc.com

Third Rock Ventures is one of the pioneers of the VC-operator model that has dominated biotechnology over the last decade. Founded and led by Mark Levin, Third Rock is arguably “Mark Levin Institutionalized.” On a side note, I met Mark in college and he hands down has the best shirts and shoes, and also the messiest desk, of anyone in biotech. Agios Pharmaceuticals, founded in 2007, was a very significant investment for Third Rock early on in the firm’s existence. Agios was founded by 3 great inventors, Lewis Cantley, Craig Thompson, and Tak Mak, and initially led by David Schenkein (after Third Rock invested). In 2008, Third Rock led Agios’ $33M Series A alongside ARCH and Flagship. The company was set up to drug cancer metabolism and grew on the tailwinds of a massive scientific discovery by the company’s founders a few years later. This case study works to reverse engineer the investment decision. Mark Levin was the founder of Millennium Pharmaceuticals, and a lot of Third Rock’s early success came from executing an “aircraft carrier” investing model. What Millennium was trying to be in the 1990s, Agios among others actually achieved in large part in the 2010s. Agios ended up with several approved drugs and also serves as a case study for one of the best drug development partnerships ever.

Left to right: Cantley, Thompson, Mak

Key reasons Third Rock invested in Agios:

  1. Third Rock used an aircraft carrier investment model around Millennium to see that Agios could execute a more efficient version of the business model around new biology in cancer metabolism. This set up Agios to garner a lucrative partnership with Celgene and get an approved drug within ~9 years.

  2. In terms of the science, Agios took a diverse approach early on, then homed in on IDHs to get a first-mover advantage.

  3. For the business model, Agios’ ability to get global rights on Tibsovo back from Celgene and the first pick on partnerships set them up to efficiently get a wholly-owned, approved asset. An important lesson is that for any partnership, it is important for the upstart to get the first pick on asset exclusivity. This is an evolution of the Millennium model that scaled its platform through narrow deals. Agios adds another layer by setting the requirement to get first pick in deals to maximize the odds of fully owning an approved drug.

Agios was started with the premise of drugging aberrant metabolism of the cancer cell. The company was originally formed from discoveries in the Cantley Lab, which discovered metabolic enzymes that drove several cancers. At the start, Agios pursued a wide range of programs drugging cancer metabolism through several mechanisms: glycolysis, fatty acid metabolism (FAM), autophagy, among others. Within a few years after founding, Agios pivoted its focus to isocitrate dehydrogenase (IDH) led by Michael Su who pushed forward the initial drug discovery and development programs. This new biology was published in 2010 and moved Agios to center the company’s pipeline around the breakthrough discovery for the role of IDH1/2 in cancer:

  1. The Common Feature of Leukemia-Associated IDH1 and IDH2 Mutations Is a Neomorphic Enzyme Activity Converting α-Ketoglutarate to 2-Hydroxyglutarate

  2. Cancer-associated IDH mutations: biomarker and therapeutic opportunities

IDH1/2 are enzymes involved in metabolism and were found to have specific mutations in a wide range of cancers. This discovery, along with the courage to quickly pivot, gave Agios an advantage in developing first-in-class medicines for mutant IDH enzymes well before others. This discovery helped Agios close an iconic deal with Celgene the same year as the publications and led to 2 drug approvals: Idhifa (partnered with Celgene) targeting IDH2 for AML and Tibsovo (internally held by Agios) targeting IDH1 for AML.

At the Series A, Agios was funded based on the potential for the company to develop first-in-class medicines at the nexus of oncology and metabolism:

  • Kevin Starr, Co-Founder of Third Rock Ventures and the first CEO of Agios before David Schenkein got on board: “In biology, it is truly a rare and unique instance when two major fields of research converge to create a completely new understanding of a deadly disease, offering a unprecedented value opportunity and the potential to create a novel class of drugs. We are very excited to have three outstanding founders who have an unparalleled knowledge base in this new field of biology that holds great promise for making a difference in cancer patient’s lives.”

  • Cantley: “While I have spent a significant part of my career studying signaling cascades in cancer, this newly discovered intersection of cancer and metabolism represents a new, untapped Achilles’ Heel of cancer cells that can block nutrients, thus effectively starving them of the fuel they need to grow and survive.”

  • Thompson: “What’s really exciting is that it is now becoming more clear that most oncogenes and tumor suppressor genes have evolved to regulate cancer metabolism, which is an opportunity to translate a century of cancer and metabolic science into a future of powerful cancer therapies.”

  • Mak: “We have long known that the survival functions and mechanisms of cancer cells overlap with other critical cellular functions, such as those of the immune system – leading to immune therapies such as cancer vaccines – but this new field linking metabolism to the growth and survival of cancer, has demonstrated potential to target and treat cancer in a completely new and fundamental way.”

Third Rock executed on an aircraft carrier model of investing. Thinking through what Millennium enabled led the firm to fund Agios among others. Sequoia Capital has used this strategy very well throughout its history too. For example, their investment in Apple led Sequoia to invest in the ecosystem around the company from startups like Tandon and Priam to Dyson and Printronix. Similarly, Third Rock’s founders had built up Millennium and invested in the ecosystem around it from Agios and MyoKardia to Foundation and a lot more. So the approach of making investments around iconic companies is centered around asking what does the company enable? Often they create new markets and/or problems to solve. They are also fountains of talent that build and lead new businesses. So the key thing is to intimately study the world’s great companies and truly understand why they were successful and where they could have improved. This is probably the easiest way to generate new startup ideas. New companies and problems to solve can be uncovered by asking what AbCellera enables? What does 10X Genomics enable? Guardant Health? Twist Bioscience?

Agios built a platform at the convergence of cancer metabolism and genomics. This was partly inspired by Millennium. Agios actually struck a partnership with Foundation Medicine in 2013 to use genomic profiling to identify patients more likely to respond to IDH1/2 inhibitors. From Agios’ S-1, one of the company’s core capabilities was “mining of genomic data emerging from the public cancer genome sequencing efforts” and using their “state of the art genomics and bioinformatics capabilities to identify metabolic enzymes that are mutated or amplified in tumors.” 

So Agios played its part to achieve at least part of Millennium’s vision to use genomics to transform drug development: “Agios is focused on discovering and developing novel investigational medicines to treat genetically defined diseases through scientific leadership in the field of cellular metabolism. All Agios programs focus on genetically defined patient populations, leveraging our knowledge of metabolism, biology and genomics.” Agios built out a platform merging genomics, proteomics, and metabolomics that set them up to make several breakthroughs in cancer metabolism. The platform allowed high-throughput metabolomic profiling to find enzymes involved in tumor growth and combined genomics to find specific mutations in these proteins with structural biology to find new drug targets.

In 2008, the company conducted a proteomic screen using SILAC and LC-MS for phosphotyrosine binding proteins in a cancer cell line. This work led to the Cantley Lab publishing a seminal paper linking PKM2, an enzyme involved in metabolism, and cancer cell growth. Then in 2009, led by Michael Su, an incredible drug discoverer, the company used its platform to discover an entirely new biology around isocitrate dehydrogenases (IDH) and cancer. The Nature paper established that IDH1 has oncogenic activity. Across 3 mutations in arginine 132 of IDH1, the variants gain the ability to reduce α-ketoglutarate to R(−)-2-hydroxyglutarate, leading to ~100x higher concentrations of R(−)-2-hydroxyglutarate in brain cancer cell lines. This mechanism probably played an important role in Agios’ success in the clinic - inhibiting IDH indirectly alters cancer metabolism, whereas contemporaries of Agios such as Calithera Biosciences, who targeted cancer metabolism directly, had toxicity problems or low efficacy. Combined with the observation that over 70% of gliomas/glioblastomas and 20% of AML patients have mutant IDH1s, this research made a compelling case for IDH1 as a drug target in oncology. This massive scientific breakthrough set up Agios to build a unique business model and get several drugs approved years later.

That same year, 2009, David Schenkein, who was an SVP at Millennium, came on board as CEO. Along with Third Rock, he brought his network and know-how from Millennium to transform the basic science within Agios into a leading drug development platform. A year later in 2010, Agios entered into a transformative and iconic partnership with Celgene to develop new therapeutics targeting cancer metabolism. Agios continued to publish breakthrough cancer metabolism research and moved into and through clinical development at a furious pace. In 2012, the company discovered inhibitors for mutant IDH1, validating them in vivo. In fact, both of Agios’ approved drugs, Idhifa and Tibsovo, came from the first 3 screening runs and were then converted from hits into drugs in work led by Janeta Popovici-Muller. Then in 2013, Agios started their first clinical trial testing enasidenib (AG-221) in patients with advanced hematologic malignancies with an IDH2 mutation. This set up the company to go public in 2013 (S-1). From a fundamental discovery in 2009 to starting a clinical trial in 2013, Agios is a case study on how to transform science into a drug and validate a platform.

In 2010, Agios struck up a partnership with Celgene on the tailwind of the former’s discovery of IDH1’s role in cancer. This is one of the most iconic deals in biotechnology ever. Agios used a new area of biology to design a deal that favored the smaller company, and it picked the right partner: at the time, Celgene was playing a pivotal role in the cancer drug development ecosystem. A lot of business development in biotech is equal parts world-class science and timing the market. In the late 2000s and early 2010s, Celgene was one of the most active partners in oncology.

The 2010 deal between Agios and Celgene was structured where Agios received an ~$120M upfront payment and a little over $8M in an equity investment from Celgene. From Kevin Starr: “From a size standpoint, this is the largest partnership we have among any of our portfolio companies by a wide margin.” Celgene received an exclusivity period until April 2014 with an option to extend this another 2 years on drugs coming out of Agios’ platform with Agios getting royalties and development milestones for these products in the US. In the 2010 deal, Agios “retained the option for exclusive rights to develop and commercialize AG-120 (what would become Tibsovo) in the United States.” The partnership was expanded in 2016:

  • Rob Hershberg, CSO of Celgene: “This emerging discipline of metabolic immuno-oncology has great potential to provide novel insights and targets for cancer immunotherapy in solid and hematologic malignancies.”

  • David Schenkein: “This strategic alliance will allow Agios to quickly expand our existing research platform into a third core area while leveraging Celgene’s capabilities and broad portfolio of immuno-oncology assets.”

Celgene paid Agios a $200M upfront payment, with Agios getting first pick on all 50/50 programs. In this extension, Celgene gave Agios global rights to AG-120 in exchange for access to Agios’ platform in the then-emerging field of immuno-oncology. This was a massive break for Agios. The overall partnership was mostly Celgene funding the development of Agios’ pipeline. The gift of AG-120 gave Agios an efficient path to get a wholly-owned, FDA-approved drug. Throughout this entire process, John Evans played a pivotal role in these partnerships.

Agios built a business model centered around the gift model pioneered by Millennium and the 50/50 deal led by Regeneron. In the former, the idea is to let your partners give you gifts. For Agios, it was Celgene giving them first pick on the 50/50 assets and the option for exclusive rights to what became Tibsovo. But for the gift model to work, a startup has to design partnerships based on a new biological hypothesis, not just products. Millennium built a series of valuable partnerships around its genomics platform while the field was just beginning to grow; the deals improved over time, and keeping each one narrow let Millennium grow its partnership base rapidly. For Agios, the Celgene deal was established based on new science in cancer metabolism. Beyond generating a new hypothesis, the hard part of the gift model is getting narrow exclusivity on the deals to retain as much IP as possible.

Agios used the Celgene deal to grow efficiently and develop their own wholly-owned internal product. In 2017, Idhifa (enasidenib) was approved by the FDA for r/r AML patients with IDH2 mutations. Then in 2018, Tibsovo (ivosidenib), wholly-owned by Agios, was approved in r/r AML with IDH1 mutations. Tibsovo had the potential to build a pipeline-in-a-pill. Agios did legendary work to go from basic research to the clinic and to commercial. The last part is rarefied air for any biotech company.

The Millennium, and Agios, dream hasn’t been achieved yet. There are a lot more companies to be built on top of the aircraft carrier. Especially as sequencing tools become more precise going down to single cells and resolving spatial relationships combined with new tools to measure the proteome, glycome, lipidome, metabolome, and a lot more there is still work to be done. Earlier generation companies like Millennium built a platform to find new targets/pathways and connect them to a disease. Successive generations riff on Millennium in some way by using more precise tools and data to go beyond connection and go deep into disease-specific biology. Agios and next-generation platform biotech companies are building better maps of disease. As the atlas of biology becomes more comprehensive,  the ability to develop new medicines to take into account diversity will improve response rate and lead to potential cures. Particularly for oncology, an issue is that most treatments generate partial responses or have relapses due to the heterogeneity of cancer. Single-cell tools and more can first measure this diversity and inform the design of new drugs to take into account cancer heterogeneity.

Agios pioneered drugging cancer metabolism. Another company will lead the way in drugging immune cell metabolism. Metabolism within the tumor microenvironment (TME) not only impacts cancer cell growth but also the activity of immune cells within the TME. Immune and cancer cells have converging metabolic pathways, putting them in competition for similar resources. This creates a potential opportunity to design drugs that target metabolism to help immune cells kill cancer cells. To make things a bit more difficult, metabolic profiles are diverse across each TME, creating another level of complexity to take into account for drug development. This creates a massive need for better metabolomic tools to maximize the potential of immunometabolism to help patients.

In 2021, Agios sold off their cancer drug pipeline to Servier for $2B ($1.8B cash) plus royalties. Now Agios is focusing on the clinical development of mitapivat to activate pyruvate kinase R (PKR) to treat pyruvate kinase (PK) deficiency. Agios had some commercial hiccups that slowed the company down and were the difference between the company being a >$10B business versus the ~$2B one it is today. On a side note, Arrakis Technologies is building the platform to help biotechs like Agios stay independent and succeed commercially. Third Rock was the best partner for Agios - combining the Millennium playbook with world-class inventors led to 2 approved cancer drugs and an iconic partnership that set the standard. The Agios story unveils new opportunities in immuno-metabolism, metabolomics, and using spatial genomics for drug development. Just as Millennium led to several successful heirs, Agios is likely to do the same over the next decade.

Axial - New frontiers #3

New frontiers in life sciences


New frontiers #3

  1. ADCs

  2. Cultured meats

  3. Phenotypic screening

  4. Industrial enzymes

  5. Chemokines

  6. Building the US-version of WuXi

  7. Infectious disease

  8. Immunometabolism

  9. Fast following in drug development

  10. Single-cell sequencing

As always, any list or group of companies is not comprehensive; some companies are stealth-mode and it’s not appropriate to discuss them publicly and I don’t like spending too much time copying and pasting logos.

ADCs

Antibody-drug conjugates (ADCs) combine the specificity of an antibody with the killing activity of a conjugated cytotoxic agent. The 3 main components of an ADC are:

  1. A monoclonal antibody - key features are minimal immunogenicity, high affinity for a cancer-specific antigen, stability for a longer half-life, and internalization to deliver the cytotoxic agent

  2. A cytotoxic agent - usually targeting DNA or microtubules to initiate cell death; key features are to ensure that the molecule can be conjugated to the linker, is water soluble, and is stable. The main cytotoxic agents used are calicheamicins, auristatins, and maytansines.

  3. A chemical linker - this is probably the most important part of an ADC and determines the drug’s PK/PD and stability; you have to make sure a linker only breaks in specific environments to avoid delivering the cytotoxic agent to healthy tissues. Linkers come in 2 types: (1) cleavable linkers that depend on the environment to become activated and (2) non-cleavable linkers that rely on lysosomal degradation of the antibody to release the agent. For example, Adcetris (Seattle Genetics) uses a cleavable linker sensitive to proteases while Kadcyla (Genentech) uses a non-cleavable one.

Once an ADC is designed, the drug modality kills a cancer cell through a series of steps that have to be accounted for during the design process:

  1. The ADC binds an antigen through its antibody component

  2. The ADC/antigen complex is brought into the cell through endocytosis

  3. The linker is degraded. For ADCs with cleavable linkers, this step occurs within the endosome while in ADCs with non-cleavable linkers, the cytotoxic agent is released after protein degradation in the lysosome.

  4. The cytotoxic agent is released

  5. Apoptosis in the cell is initiated

With 11 approved ADCs and over 80 trials ongoing, the field has seen a recent acceleration with 3 approvals in 2019, 2 in 2020, and 1 so far in 2021. The field has made a major comeback over the last 2 decades - companies have learned from past failures and have developed new linkers and chemistries to overcome the setbacks, mainly toxicity due to non-specific linkers, that many ADCs faced. Many of the new ADCs approved rely on the design of Seattle Genetics’ Adcetris, which was approved in 2011 for Hodgkin lymphoma and ALCL.

The 4 main problems ADCs have faced are:

  • Off-target effects mainly due to imprecise dosing

  • Low penetration into a tumor

  • Potency of cytotoxic agent

  • Stability problems that lead to shorter ADC half-lives

Another way to look at ADCs is as a systemic chemotherapy. With the specificity of an antibody, the selectivity of toxicity for cancer can be increased. The modality has always had promising data in model organisms that failed to translate into humans due to dosing problems. Developing better linkers can reduce off-target effects and expand the therapeutic window for an ADC when trials are initiated. The main levers for linker stability are matching the type (cleavable versus non-cleavable) to the environment and picking the right site on the antibody for conjugation. For the latter, the traditional conjugation sites are lysine and cysteine residues, due to their nucleophilic side chains; however, specific sites are being engineered into antibodies to generate homogeneous populations of ADCs.

For penetration, specifically in solid tumors, an ADC has to be able to generate a lethal concentration of the cytotoxic agent in all cancer cells. If distribution is heterogeneous, then some tumor cells are more likely to develop resistance to the ADC. A useful strategy here is to conjugate helper proteins, such as tissue-penetrating peptides, to the antibody. This is where costs can go up significantly during the design process for an ADC. Picking the right cancer model and measuring penetration across each run is essential for success.

Moreover, an ADC’s drug-to-antibody ratio (DAR) combined with the cytotoxin itself determines potency. Increasing the number of cytotoxic agent molecules attached to each antibody increases potency, but can lead to off-target effects if a linker is not specific enough. This is where the design process becomes important because balancing the relationship between a linker and DAR can make or break an ADC’s potential for approval.
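As a rough illustration of what a DAR number summarizes, the sketch below computes an average DAR as the abundance-weighted mean of drug load across the ADC species in a preparation. The species distribution is made up for illustration and is not data from any real ADC, and the function name `average_dar` is hypothetical.

```python
def average_dar(species_abundance):
    """Abundance-weighted mean drug load.

    species_abundance maps drug load (drugs per antibody) to relative
    abundance, e.g. peak-area percentages. Values need not sum to 100.
    """
    total = sum(species_abundance.values())
    if total <= 0:
        raise ValueError("total abundance must be positive")
    weighted = sum(load * amount for load, amount in species_abundance.items())
    return weighted / total


# Hypothetical heterogeneous preparation (illustrative numbers only):
# a mix of antibodies carrying 0, 2, 4, 6, or 8 drug molecules.
heterogeneous = {0: 5.0, 2: 25.0, 4: 40.0, 6: 22.0, 8: 8.0}

# Hypothetical site-specific preparation: every antibody carries 4 drugs.
homogeneous = {4: 100.0}

print(round(average_dar(heterogeneous), 2))
print(round(average_dar(homogeneous), 2))
```

Both preparations land near an average DAR of 4, yet the first contains unconjugated antibody and heavily loaded species. That is one reason site-specific conjugation, which aims for a homogeneous population, matters beyond the headline ratio.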

Finally, a major consideration for ADC stability is managing the hydrophobicity of the cytotoxic agent, which can lead to ADC aggregation. Linker stability in blood is important to reduce off-target effects. Ultimately, the key to success for ADCs is designing the entire drug rather than focusing on optimizing for one component over another. New opportunities in the field are:

  1. Inventing new linkers (mainly ones that react to new environments) and finding other ways to conjugate them to antibodies

  2. Developing new penetrating agents to add to an ADC

  3. Testing other cytotoxic agents

  4. Implementing new stability moieties to the antibody

Cultured meats

Cultured meats bring technologies developed over the last few decades in cell culture to food production. In 2013, the Post Lab at Maastricht University developed the first cultured meat product - a beef hamburger patty that cost around $300,000. This work led to the founding of Mosa Meat and kickstarted the current boom in cultured meats. In the current backdrop of the success in plant-based meats and technological advancements that make that same hamburger made in 2013 cost $10-$20 to produce today, cultured meats have the opportunity to transform the trillion-dollar industry spanning beef and poultry to seafood and pork.

Starting with a biopsy from an animal, the cells (often stem cells that can be differentiated into muscle and fat cells) are grown in a nutrient-rich environment. Most growth media contain fetal bovine serum (FBS), which is derived from the blood of fetal calves. On a side note, there is a need for serum-free media. During cultivation, more than 1 trillion cells (strands of muscle) are grown and can form complex structures depending on the scaffold and mechanical stress applied.

This technology can potentially have several large effects on the meat industry and environment:

  • Optimize meat for higher nutrition

  • Produce meat with different resources - cells versus animals. This has the potential to reduce land/water use and reduce methane emissions, although more research is needed to understand these effects: the land impact is not exactly clear, and fossil-fuel powered meat incubators might have the same carbon dioxide footprint as factory farming.

  • Reduce food contamination from salmonella and E. coli and the use of antibiotics. However, there are risks of using excessive hormones during the cultivation process for cultured meats.

  • Reconfigure the supply chain of protein sources

  • Contribute to feeding a larger global population with a current limit on arable land

With the recent commercial success of plant-based, and fungi-based, products driven by Impossible Foods and Beyond Meat, cultured meat companies can build on top of this market growth and increase consumer adoption for alternative meats. Current plant-based products are easier to produce than the cultured equivalent but have some limitations on texture, sensory experience, and taste. There is a large opportunity to bring new tools in machine learning and synthetic biology to solve this problem for plant-based foods. Cultured meats can solve this problem too by creating completely new products or as a hybrid with plant-based foods to make those more realistic. 

The key themes that will enable new product development and business growth in the field are (1) regulatory, (2) partnerships, and (3) consumer adoption. In 2018, the US Department of Agriculture (USDA) and the US Food and Drug Administration (FDA) announced a joint regulatory framework for cultured meat products - the FDA will regulate the early stages of product development (i.e. inspect culturing facilities) and the USDA will oversee the process closer to commercialization (i.e. product packaging, cell harvesting). Globally, India, the EU, Singapore among others have taken similar steps to provide clarity for companies. Partnerships are enabling new companies to gain distribution and institutional knowledge. Tyson investing in Memphis Meats is a great sign; however, more work needs to be done. Consumer adoption will be driven by cultured products having cost parity and similar texture/sensory/taste profiles as animal-derived meat.

Scaling production and new business models are 2 of the most important opportunities in cultured meats. Creating manufacturing processes that can produce 100Ks to millions of kg of meat at-or-near cost parity is essential for the field’s success. Moreover, moving beyond vertical integration for alternative foods in general has the potential to accelerate product development and create standards:

  1. Building CDMOs that cultured meat companies can plug into and avoid implementing manufacturing themselves. Incumbents could easily move into this space but are currently hesitant due to the lower product margins versus drug development. This work has the potential to bring cultured meats to industrial scale.

  2. Unlocking growth factors and standardizing growth media (along with bioreactors). Open source business models are an exciting solution to do this work. Another large opportunity is replacing FBS in large-scale production.

  3. Cell line development with different growth features and biobanking requirements

  4. New scaffolds to help cultured meats mimic the natural cuts from the fibers and connective tissue to blood vessels, nerves, and fat

  5. New culturing techniques to include oxygen perfusion and micronutrient enhancement

  6. If vertical integration is no longer a prerequisite, then cultured meat companies could license their products as meat products and ingredients

Phenotypic screening

Drug discovery as we know it today really started off with natural products and finding chemical matter that leads to a biochemical change or a change in physical appearance. Phenotypic screening focuses on identifying phenotypes (and rescues) of interest. The power of genomics and structure-based drug design has revolutionized many parts of drug discovery by flipping this process to start at a particular target or pathway; however, phenotypic screening is still useful in some contexts where the underlying biology hasn’t been fully characterized. As a result, it is an old technology still useful for new problems. Moreover, this approach is not only useful in drug development. Phenotypic screening could bring transformative products to consumer products among other markets - Revela is leading the way here.

In 2015, a team at Pfizer published a paper defining the key rules for a phenotypic screen:

  1. Selecting physiologically-relevant cell types and models. Examples are iPSCs, organoids, and even animal models. In vitro models offer higher throughput versus in vivo models that might have more accuracy. Parallel Bio is the leader here.

  2. Designing assays that are relevant to a disease

  3. Defining assay endpoints that are similar to clinical endpoints. This is focused on using biomarkers or disease signatures that are matched to clinical samples.

These types of screens can be divided into 2 major steps: (1) a simple primary assay to cover as much chemical matter as possible and (2) complex, physiologically-relevant models to home in on interesting hits. After the model is established, a large library of molecules is screened to measure something like expression changes in a panel of proteins or cellular characteristics like proliferation. Secondary assays are used mainly to counter screen and filter out molecules that have general effects. This is the part where phenotypic screening has higher upfront costs versus target-based programs. Automation of high throughput screening helps here, as does filtering hits more accurately with transcriptomics, proteomics, unsupervised machine learning, high-content imaging, and other tools within the assays themselves. Hits are then grouped into mechanism classes to prioritize them, with a focus on target deconvolution. Even though many drugs have been approved without a known target, this knowledge is incredibly important to de-risk a program. Excitingly, this information is much more accessible with the wide array of tools now available: hits can be screened against panels of known target classes, and activity-based protein profiling is very useful along with compound-immobilized beads, photoaffinity labeling, and cellular thermal shift assays.
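The two-step logic above (a broad primary assay, then a counter screen to drop molecules with general effects) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any company's pipeline: the compound names, readouts, and cutoffs are invented, the function names are hypothetical, and the robust z-score (median and MAD) is just one common way to call primary hits.

```python
from statistics import median


def robust_z_scores(readouts):
    """Robust z-score per compound using the median and the scaled MAD."""
    values = list(readouts.values())
    center = median(values)
    mad = median([abs(v - center) for v in values])
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    if scale == 0:
        raise ValueError("readouts have zero spread; cannot score")
    return {name: (v - center) / scale for name, v in readouts.items()}


def triage(primary, counter_viability, z_cutoff=3.0, min_viability=0.7):
    """Keep primary hits that do not look generally toxic in the counter screen.

    Compounds with no counter-screen value are dropped (a conservative,
    assumed policy for this sketch).
    """
    scores = robust_z_scores(primary)
    primary_hits = [name for name, z in scores.items() if z >= z_cutoff]
    return sorted(
        name for name in primary_hits
        if counter_viability.get(name, 0.0) >= min_viability
    )


# Hypothetical primary-assay readout (e.g. a phenotype rescue signal).
primary = {
    "cmpd_A": 1.00, "cmpd_B": 1.10, "cmpd_C": 0.90, "cmpd_D": 1.05,
    "cmpd_E": 0.95, "cmpd_F": 3.00, "cmpd_G": 3.20,
}
# Hypothetical counter screen: viability fraction in an unrelated cell line.
counter_viability = {"cmpd_F": 0.20, "cmpd_G": 0.95}

print(triage(primary, counter_viability))
```

In this toy data, two compounds stand out in the primary assay, but one of them also kills an unrelated cell line and is filtered out as a likely general effect. Real screens would add replicates, plate normalization, and dose-response before any hit moves to target deconvolution.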

In drug development programs, the key focus is formulating and testing a hypothesis. In target-based discovery, the process tests a hypothesis (biological, clinical, commercial) by generating a hit for a given target, then going to lead selection and optimization, and finally into pre-clinical/clinical testing. Phenotypic screening, by contrast, generates a hit for a given phenotype; then the trick is to find the target. The latter process can get tricky and can make lead selection more difficult. As a result, better frameworks are needed to make phenotypic screening as structured as target-based programs:

  • In vitro models that are relevant for more diseases. Patient-derived cell lines are a step forward here, but could lead to unexpected variance in assay conditions. Co-culturing is also another useful approach but could impact throughput.

  • With more complex assay designs, identifying important variables that influence reproducibility and disease relevance. 

  • “Chain of translatability,” an idea from biopharma, to connect assay endpoints to a clinical outcome.

What other tools are needed to advance phenotypic screening?:

  • Increasing the screening throughput of more complex models like organ-on-a-chip and organoids. 

  • Translational work to build models for diseases that have recent breakthroughs in their mechanisms-of-action (MoA) and hopefully genetic drivers especially in CNS. On a side note, phenotypic screening is likely going to make a very large impact on longevity drug development.

  • More accessible proteomic tools particularly for activity-based profiling during the target deconvolution step. This would speed up the process and lower the barrier to entry to use phenotypic screening because this step is the last and sometimes hardest.

  • Simply, more case studies and work to merge phenotypic and target-based screening. The idea would be to create a database enabling a phenotypic match from a virtual screen one day.

Industrial enzymes

Enzymes are the workhorses of the cell and a core part of industrial biotechnology. The thesis is to collapse supply chains with biology. And enzymes are uniquely suited to do this. They catalyze important chemical reactions to produce new products from detergents to specialty chemicals. Overall, the field has undergone roughly 4 eras:

  1. Enzymes from animal sources - early 1900s

  2. Enzymes from microbial sources - mid 1900s

  3. Enzymes from genetic engineering - 1980s to now (Novozymes has been dominant here)

  4. Enzymes from software - now (Aperiam Bio leading the way)

The industrial enzyme market started taking off in the 1960s, although what would become Novozymes, among other companies, had started work in the 1940s. In the 1980s, against the backdrop of the biotechnology revolution, companies like Genentech with Genencor (now part of DuPont), Novozymes, and Genex started using cloning and genetic engineering on bacterial/fungal strains to produce enzymes with higher yields and new functions. Genencor in particular did great work to bring new enzymes to products like Tide and ethanol. The workhorse for this period of growth was the host, bacterial or fungal, and deep-tank, fed-batch aerobic fermentation. On this, there is a large opportunity to pick different hosts specific for a given problem. Colorado Biofactory is leading the way here.

With applications from food and plastics to energy and textiles, the addressable market for enzymes is in the $10Bs. The key theme for enzymes is replacing organic chemistry. Organic synthesis often leads to environmentally-harmful byproducts but has logical steps to produce a target molecule. Enzymes, by contrast, are useful to produce a natural reaction/product but are not characterized well enough to significantly eat away at organic chemistry’s use. This creates an opportunity to go through the large search space of enzymes and the reactions they catalyze to map out specificity, catalytic rate, and activity. A database of millions of these enzymes along with these features could create the standard. This would enable logical steps to use enzymatic reactions to produce a target molecule.
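To make the idea of "logical steps" over an enzyme database concrete, here is a minimal sketch that treats a catalog of characterized reactions as a graph and searches for a route to a target molecule. Everything in it is an assumption for illustration: the catalog entries, the molecule labels, the rate values, and the `find_route` helper are hypothetical, and a real database would also need to track specificity, conditions, and yields.

```python
from collections import deque

# Hypothetical catalog: (enzyme, substrate, product, relative catalytic rate).
CATALOG = [
    ("enzyme_1", "A", "B", 12.0),
    ("enzyme_2", "B", "C", 3.5),
    ("enzyme_3", "A", "D", 8.0),
    ("enzyme_4", "D", "C", 0.4),
    ("enzyme_5", "C", "E", 6.0),
]


def find_route(start, target, catalog, min_rate=1.0):
    """Breadth-first search for the shortest enzymatic route from start to target.

    Reactions slower than min_rate are ignored. Returns a list of enzyme
    names, or None if no route exists.
    """
    # Index usable reactions by substrate.
    by_substrate = {}
    for enzyme, substrate, product, rate in catalog:
        if rate >= min_rate:
            by_substrate.setdefault(substrate, []).append((enzyme, product))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        molecule, path = queue.popleft()
        if molecule == target:
            return path
        for enzyme, product in by_substrate.get(molecule, []):
            if product not in seen:
                seen.add(product)
                queue.append((product, path + [enzyme]))
    return None


route = find_route("A", "E", CATALOG)
print(route)
```

The search only works as well as the catalog is characterized, which is the argument for mapping rate and specificity at scale in the first place.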

Bringing more engineering principles to this field is an important driver. Synthetic biology and new tools can build large libraries of enzymes and screen for functional variants. The key themes here are:

  1. Standardization of parts – makes screening and discovery reproducible and scalable

  2. Coupling screening to the synthesis and assembly of DNA

  3. Modularity of parts between multiple chassis

  4. More predictable outcomes – as more data is collected through screens various combinations of components will be discovered to work together

These principles create a tight feedback loop between new models and biology experiments to make novel predictions. As more experiments are conducted, a company can build a large knowledge base of what experiments not to do and reduce the search space to valuable enzymes with new functionality.

Successful case studies like Novozymes, Solugen, and Genencor initially pursued low-hanging fruit and expanded their platforms to solve larger and harder problems. BluumBio and Rubi Laboratories among others are emerging companies to keep an eye on. Broadly, applications for industrial enzymes span drug development, consumer products, and industrials. Software unlocking new enzymes through better genomics analysis (finding new biosynthetic clusters) and design can discover enzymes that catalyze difficult chemistries (e.g. C-C) and expand the toolkit to help patients and our environment.

Chemokines

Chemokines are the trafficking control system for the human immune system. They act as signposts to get different immune cells into certain organs. With over 50 different chemokines and 20 receptors (which are GPCRs), they are a class of cytokines that induce chemotaxis in nearby cells.

In 1987, the first chemokine, CXCL8 (IL-8), was cloned as part of work to figure out which factor(s) monocytes were secreting to attract neutrophils: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4459227/ & https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4456961/

This sparked large-scale research efforts to clone chemokines and their receptors in the late-1980s and 1990s. This work led to the approval of drugs targeting CXCR4 and CCR5 to treat HIV and C5aR for ANCA-Associated Vasculitis (AAV). Despite the large number of new drug targets for inflammatory diseases generated from this work, dosing and target selection have been barriers for more successful chemokine-focused medicines.

As a result, there is a large opportunity to develop new medicines that target the chemoattractant system: (1) small molecules, as well as antibodies, to selectively target chemokines and their receptors and (2) engineered chemokines:

  • For autoimmunity, chemokines direct immune cells to self tissue

  • Viruses and microbes use chemokines to deceive the immune system

  • Vaccine responses can be improved with chemokines

  • Chemokines drive inflammatory diseases like AAV, diabetic nephropathy (DN), inflammatory bowel disease (IBD; targeting CCR9), and rheumatoid arthritis (RA; CCR1) 

  • In cancer, angiogenesis is influenced by chemokines

  • Other diseases like Type 2 diabetes (T2D; CCR2) and chronic kidney disease (CKD) have also been found to be driven by inflammation where chemokines play a major role

The complex orchestration of chemokines and their various interactions with receptors make this space hard-to-drug. Not only does the target matter but the location and time of intervention are important as well. New opportunities with chemokines are centered around target selection for a disease, chemistry, and in vivo dosing:

  • Mapping in vivo interactions - figuring out systematically which chemokines bind specific receptors and at what doses and locations

  • Determining the biological response for each chemokine/receptor pair - different chemokines and receptors can have multiple functions depending on tissue, timing, and pairing. Excitedly, a small set of cells with activated chemokine receptors can lead to a large-scale immune response.

  • Immune cell subsets - which immune cell subsets respond to individual chemokines? For example in monocytes, CCR2 is a marker for an inflammatory class and CX3CR1 is one for the resident subset of monocytes. 

  • Dosing - figuring out dosing in vivo; most work has been in vitro. Moreover, to generate a therapeutic effect, a large proportion of chemokine receptors need to be inhibited continuously, increasing the required critical dose. This requirement can lead to ADMET issues.

  • Target selection - determining whether a chemokine is redundant or not for a specific disease. For example, CCR7 is activated by both CCL21 and CCL19. In the other direction, CCL5 activates 3 chemokine receptors: CCR1/3/5. Moreover, receptor internalization is different for each interaction. Chemokines have been seen as a hard-to-drug class of targets due to this potential redundancy.

  • Antibodies and engineered chemokines - develop antibodies for specific chemokines or receptors. Targeting chemokines, and using chemokines as a drug, requires knowing the concentration of the ligands in vivo. Also, drugging the receptors can become complex since they are GPCRs and have multiple transmembrane regions.
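The redundancy problem in the target selection point above is essentially a many-to-many mapping question. The sketch below builds that mapping for only the ligand/receptor pairs named in this section (CCL19 and CCL21 with CCR7; CCL5 with CCR1/3/5) and flags which ligands and receptors have more than one partner. The helper names are hypothetical, and a real interaction map would be far larger and would need dose, tissue, and timing data.

```python
from collections import defaultdict

# Ligand -> receptor pairs named in the text above; a real map would be far larger.
PAIRS = [
    ("CCL19", "CCR7"),
    ("CCL21", "CCR7"),
    ("CCL5", "CCR1"),
    ("CCL5", "CCR3"),
    ("CCL5", "CCR5"),
]


def build_maps(pairs):
    """Return (receptors per ligand, ligands per receptor) as dicts of sorted lists."""
    by_ligand, by_receptor = defaultdict(set), defaultdict(set)
    for ligand, receptor in pairs:
        by_ligand[ligand].add(receptor)
        by_receptor[receptor].add(ligand)
    return ({k: sorted(v) for k, v in by_ligand.items()},
            {k: sorted(v) for k, v in by_receptor.items()})


def redundant(mapping):
    """Keys with more than one partner, i.e. candidates for redundancy."""
    return sorted(k for k, partners in mapping.items() if len(partners) > 1)


by_ligand, by_receptor = build_maps(PAIRS)
promiscuous_ligands = redundant(by_ligand)      # ligands that hit several receptors
shared_receptors = redundant(by_receptor)       # receptors with several ligands
print(promiscuous_ligands, shared_receptors)
```

A flagged entry does not prove functional redundancy in a disease; it only marks where blocking a single ligand or receptor might be compensated by a partner.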

Ultimately, there is a large opportunity to bring new chemokine medicines to patients. The low-hanging fruit is to develop models and tools to predict which chemokine/receptor pair is involved in disease and develop a drug candidate to target the pair. In short, new methods are needed to figure out which part of the control system to block or add to.

Building the US-version of WuXi

Driven by the theme of onshoring biomanufacturing, it is imperative that a US-version of WuXi is built. Resilience has a shot, but many more companies need to emerge to take advantage of the opportunity and reinvent the CRO. WuXi was founded by Ge Li in 2000 as a services company for synthetic chemistry. Each and every year the company rolled out a new offering: manufacturing, bioanalysis, formulation, ADMET. Charles River Laboratories had the opportunity to buy WuXi in 2010 but ended up nixing the deal. Over the following decade, WuXi grew into one of the largest and most innovative contract research organizations (CROs) in the world and positioned itself to dominate discovery services globally.

WuXi has been able to win for 6 key reasons:

  1. Labor arbitrage - being in China gives WuXi access to a large amount of talent at much lower wages

  2. Scale - WuXi has a suite of services done at large scale, built over 2 decades. The company offers services for a wide range of modalities from small molecules to biologics and cell/gene therapies. They provide discovery, development, and manufacturing services and even offer things like medical device testing and genomics products.

  3. Flexible deal structures - WuXi has been flexible in the terms it provides to its customers. The company offers fee-for-service contracts with various milestones and royalties on some of the products if appropriate. Beyond services, WuXi is also aggressive with joint ventures and investing.

  4. Founder-led company - Ge Li still runs the company and is motivated to complete his vision

  5. Regulation - if a company outside of China wants to bring a drug, especially biologics, to China, they must meet regulatory requirements by repeating the development and manufacturing process in China again. This is a lucrative business for WuXi.

  6. Standardization - WuXi gets lock-in by getting customers early and fully-integrating from discovery to manufacturing

These 6 drivers enabled WuXi to dominate services from discovery to development and finally to manufacturing. Now WuXi has set its sights on internal product development. The first example of this was WuXi spinning off its biologics manufacturing business into WuXi Biologics in 2015. This gives a glimpse into the future for WuXi in cell and gene therapies and beyond. WuXi Biologics focuses on fully integrating services for biologics drug development: (1) drug discovery, (2) pre-clinical work, (3) phase 1/2 clinical development, (4) services for pivotal studies, and (5) commercial manufacturing. This end-to-end platform allows any customer to plug into WuXi but more importantly makes biologics development a lot more accessible. For WuXi Biologics, it’s probably more valuable to get companies as early as possible because they will spend increasingly higher amounts as the research and trials progress. Because WuXi Biologics is such an essential component for smaller companies with fewer resources and for larger ones that want access to the Chinese market, the company has been able to get royalties on some of the products produced from its services and is set up to build an internal pipeline.

It seems obvious that WuXi has the potential to take control of the global CRO market. With such a dominant position, there is a need to build a US-version. The end-state is easy to comprehend - it’s WuXi. But what are good first moves?:

  • Focusing on higher value products like cell and gene therapies

  • Building technology-enabled services that can take on WuXi’s labor advantages

  • Rolling up services to become fully-integrated at least for a region or specific area then expanding

Infectious disease

New medicines and vaccines for infectious diseases are always needed, but the business of infectious disease has not been rewarded over the last few decades. Events like the COVID-19 pandemic reinforce the need for more sustainable businesses in the field - multidrug-resistant bacteria, neglected tropical diseases (NTD), and communicable diseases in developing nations will hopefully be given more attention and capital over the next few years.

However, building a business to develop new antibiotics and other infectious disease drugs is not currently rewarded. Starting a business to prepare for pandemics was not highly valued until COVID-19 hit. Finally, global public health is required to save more lives and is a prerequisite for economic growth, but most companies do not currently have a global health mandate. To explore the potential of building a large business in infectious disease, it is worth profiling a few case studies that have had success in the field: Gilead, Moderna, Vir, Regeneron, AbCellera, Distributed Bio, Novavax, and BioNTech.

Gilead was founded in 1987 by Michael Riordan to cure viral diseases. Riordan had spent time in East Asia on a Luce scholarship after college. While abroad, he worked at a malnutrition clinic and came down with dengue fever. The exposure to another country’s healthcare system and experiencing the lack of medicine for viral disease inspired Riordan to go to medical school at Johns Hopkins and business school at Harvard to learn more about the business of medicine.

After working as a VC for about a year, Riordan started Oligogen to use antisense DNA, an emerging technology in the 1980s, to selectively target viruses. The company’s name changed to Gilead after the Balm of Gilead, a medicinal product historically used in the Middle East. Riordan set Gilead on an ambitious path to cure viral disease and did everything to build a great team to accomplish this vision, even persistently working to get Warren Buffett involved - https://www.scribd.com/doc/208120113/Michael-L-Riordan-the-Founder-and-CEO-of-Gilead-Sciences-and-Warren-E-Buffett-Berkshire-Hathaway-Chairman-Correspondence

The company was founded in the 1980s in the backdrop of the HIV epidemic. The same year Gilead was founded, 1987, AZT had just been approved by the FDA to treat AIDS patients. This created an urgency for Gilead and other infectious disease companies to develop new medicines for HIV/AIDS patients. Gilead’s timing was likely a major driver for its success - the need to develop new viral drugs was more acute than ever.

Early on, Gilead executed a deal with Glaxo for $8M to use their antisense technology in cancer. This enabled the company to have the capital to expand the purview of their platform: the most important part of this work was funding Antonin Holy’s lab. This led to Gilead’s first FDA-approved medicine in 1996 - Vistide for cytomegalovirus (CMV) retinitis in AIDS patients. Gilead continued to churn out new approved drugs, like Truvada for HIV infection, mainly from the golden goose they had in the Holy Lab. This work helped make HIV/AIDS a manageable disease for most patients.

In 2011, Gilead acquired Pharmasset for ~$11B. This acquisition gave Gilead the rights to two transformative medicines in Hepatitis C: Sovaldi and Harvoni, which ultimately cured the disease. Moreover, this event probably was the keystone moment for the significant growth of the biotechnology industry in terms of capital deployed and returned over the last decade. The size of the acquisition was one that hadn’t been seen since the early 2000s.

Another infectious disease company is Vir Biotechnology. Founded in 2016, Vir assembled an all-star team to program the human immune system to fight infectious disease. The business model was very broad - developing multiple modalities from antibodies to siRNAs, and in-licensing assets, building an internal pipeline, and funding academic labs. The company is still relatively early especially when compared to Gilead. But Vir has played a role in the response to the COVID-19 pandemic. What makes Vir unique is its focus on the immune system rather than a particular pathogen.

Moderna has played an essential role during this pandemic. Their recent vaccine approval, along with BioNTech/Pfizer, is a major breakthrough for patients and society. It also validates their strategy to use mRNA drugs as a platform for infectious disease and beyond. The modality has had a lot of promise over the last 7-8 years, but COVID-19 accelerated what would have taken ~5 years to a few months. With the right timing and business model, Moderna is set up to bring mRNA medicines to a wider set of patients over the next few decades.

BioNTech is very similar to Moderna. With similar technology, BioNTech is also positioned to bring mRNA medicines to more patients. However, the company is also ingesting other modalities, similar to Vir, into its platform.

Novavax has been around for a long time. The company was founded in 1987 (same year as Gilead) to develop vaccines for infectious diseases. With no approved medicines over its lifespan, Novavax has relied on its platform and funding from foundations and public institutions (e.g. BARDA and the Bill & Melinda Gates Foundation) to stay alive. The platform is centered around a set of recombinant nanoparticles and their Matrix-M adjuvant technology with the goal of generating strong patient immune responses to a vaccine. However, the Moderna/BioNTech vaccine data in COVID-19 might make this work somewhat irrelevant for this pandemic. But Novavax has vaccines in development for respiratory syncytial virus (RSV), seasonal influenza, pandemic influenza (H1N1, H5N1), and Ebola virus, among others. There might be an opportunity for them to update the COVID-19 vaccine every 2-3 years depending on the evolving strains.

Other businesses like Regeneron, AbCellera, and Distributed Bio (now part of CRL) are focused on developing antibodies. They have more flexible business models where they license out assets in other disease areas and have the resources to invest in their underlying technology. This flexibility has enabled all 3 companies to rapidly respond to the COVID-19 pandemic.

Beyond drug development, curing infectious diseases requires a substantial role from public health. Everyone needs access to clean water and better nutrition. Maternal/reproductive health plays an important role. How do we increase accountability and prevent people from making decisions about other people’s health without consequences? Public health, things like vaccinations, trash pickup, sewage, and more, has a large impact on economic growth, lifespan, and wellbeing - https://oecdobserver.org/news/archivestory.php/aid/1241/Health_and_the_economy:_A_vital_relationship_.html:

“The effects of health on development are clear. Countries with weak health and education conditions find it harder to achieve sustained growth. Indeed, economic evidence confirms that a 10% improvement in life expectancy at birth is associated with a rise in economic growth of some 0.3–0.4 percentage points a year.

Disease hinders institutional performance too. Lower life expectancy discourages adult training and damages productivity. Similarly, the emergence of deadly communicable diseases has become an obstacle for the development of sectors like the tourism industry, on which so many countries rely.”

Can a sustainable and large infectious disease business be built without substantial government backing? Or is government backing a prerequisite? It sure seems so from the case studies. How can life sciences companies integrate with public health mandates? Can we set up a reward system to incentivize entrepreneurs and companies to eradicate diseases like malaria? Can we do the same for the deployment of rapid infectious disease diagnostics? What areas are just intractable and the realm of nonprofits? Can a universal vaccine be developed? 

Key lessons:

  1. Timing has been important. The HIV/AIDS epidemic helped kickstart Gilead and Novavax. The COVID-19 pandemic has done the same for BioNTech and Moderna.

  2. Platforms are essential to also work on other diseases like cardiovascular and cancer. This enables a company to raise capital to invest in technology and have the capabilities to work on infectious diseases. During most eras, capital has not been available to standalone infectious disease companies. Vir was an exception mainly due to the team.

  3. There is an advantage to being centered around human immunology instead of pathogen focused. Moderna and BioNTech validate this with their COVID-19 vaccine. This enables a more rapid response and a broader scope. Inflammatix is leading the way here on the diagnostics side.

  4. There is an opportunity to connect drug development with a public health initiative. Beyond COVID-19, other opportunities are in NTDs and malaria.

  5. Each case study, with the exception of Novavax, executed some form of the Merck model: focus on higher-value indications to subsidize programs with societal impact. In 1987, Merck freely distributed their medicine for river blindness. This was only possible because Merck had gotten Mevacor approved as the first statin the same year.

Immunometabolism

Metabolism creates a set of diverse tumor microenvironments (TME) across cancers and patients. Importantly, metabolism within each TME has a significant effect on immune cell function and the ability for immunotherapies and cell therapies to treat solid tumors. Particularly, immune and cancer cells actually converge on their metabolic pathways competing for the same resources to grow. This creates an environment to inhibit immune cells in the TME but also an opportunity to target metabolism to help immune cells kill cancer cells.

Companies like Agios (targeting IDH; although they recently sold off their immunometabolism portfolio to focus on pyruvate kinase (PK) activation) have been built on the premise of targeting metabolism to treat cancer. However, the full potential of the field is still to be realized mainly due to the diversity of immune cells within the TME and the unique role of oxidative phosphorylation and glycolysis along with other pathways within immunometabolism.

Immunometabolism is such a complex field. Immunology is already hard enough. Adding metabolism on top of it only makes figuring out cause/effect relationships even more difficult. Opportunities in the field are:

  1. Measuring the metabolism difference between a given tumor and immune cells in the TME

  2. Assessing the metabolic requirements for each immune effector cell

  3. Using this information to find weak spots in metabolism to pursue and develop new medicines. The hard part here is the differences might be so slight that therapeutic windows could be very narrow.

Fast following in drug development

There is a large opportunity to develop new business models to be a fast follower in drug development. Many branded drugs often have consistent price increases (>10% annually) over the lifetime of their patent exclusivity without any underlying improvements. This creates a market opportunity for other companies to fast follow on the branded drug and undercut on price by up to ~50%. Many biotechnology companies in China and EQRx are leading the way. EQRx recently reported data on an anti-PD-L1 antibody they licensed from CStone in China - this program serves as a proof of concept for the business model. The idea is to bring a Southwest Airlines or Danaher business model to drug development. Companies in China have been able to do this at least in their home country mainly due to labor advantages, larger patient populations, and streamlined regulatory processes. EQRx boldly got started to do something similar in the US and do fast-following more systematically.
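The pricing argument above is simple compounding arithmetic, which a short sketch can make explicit. The launch price of 100 and the 5-year horizon are hypothetical placeholders; the 10% annual increase and ~50% undercut are the figures quoted in the paragraph, and the function names are made up for illustration.

```python
def brand_price(launch_price, annual_increase, years):
    """Price of the branded drug after compounding annual increases."""
    return launch_price * (1 + annual_increase) ** years


def follower_price(reference_price, discount):
    """Fast-follower price set at a fixed discount to the branded price."""
    return reference_price * (1 - discount)


launch = 100.0                                 # hypothetical launch price, arbitrary units
brand_y5 = brand_price(launch, 0.10, 5)        # 10% annual increases for 5 years
follower_y5 = follower_price(brand_y5, 0.50)   # ~50% undercut in year 5
print(round(brand_y5, 2), round(follower_y5, 2))
```

Under these assumed numbers, five years of 10% increases take the branded price to about 161, so a 50% discount still lands at roughly 80% of the original launch price. That gap is the room the fast follower has to work with.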

To successfully have a shot at building an enduring business around a fast follower model a few things are needed:

  1. Having an ability to design or discover patent-breaking drugs

  2. Also, having the ability to execute more efficient clinical trials. A fast follower has to get to market before the branded drug goes generic. This probably means that the company has to initiate a program before a branded drug is actually approved. A company has to build forecasting tools to not only predict the success of their own drug but the success of the branded drug in trials.

  3. Pursuing only established targets and mechanisms, which hopefully lead to lower clinical failures for the fast follower. This is also enabled by the increasing power of genomics over the last 2-3 decades; target discovery has been mostly commoditized.

  4. Identifying markets where the fast follower model can expand the market or get a higher share (figure below from EQRx). If the fast follower is charging lower prices, there is a possibility that more patients, and their plans, can afford it. Other elements like competition, reimbursement, and market uptake are important here.

  5. Establishing strong payor relationships to ensure patient access and quick market uptake

Ultimately, this model has the potential to commoditize some parts of drug development. For diseases with established targets and mechanisms, a company could build moats around commercialization, capital structure, and forecasting rather than discovery. This model is fraught with risks. Clinical trials are probably going to fail - is a company capitalized well enough to weather this? Executing faster trials is still hard even for validated targets. Patent issues can arise from this model. Incumbents will fight back - biopharma will use vouchers to shift who pays their high prices, a PBM could exclude the fast follower from their formulary, and more.

Source: EQRx

The fast follower also has the potential to be one of the first drug companies to actually delight their customers. So who could they be?:

  • Patients - building a brand around lower pricing and increased access

  • Drug companies - a source of licensing opportunities for the fast follower. A straightforward strategy is to bring assets from China to the US.

  • Payors - probably the most important relationship to help the fast follower lock in purchases of their drug and market access

  • Networks - hospitals in particular help the fast follower more easily recruit patients and run more efficient clinical trials. The new CMS rule enacted in 2021 on price transparency also helps the fast follower - https://www.cms.gov/hospital-price-transparency

  • Physicians - facing off against the large sales forces of incumbents, fast followers need to show non-inferiority and get payor help here

  • PBMs - with recent consolidation of pharmacy benefit managers (PBM), a fast follower could try to find a way to reward them or bypass them completely. PBMs are probably the stakeholder to focus on the least.

Single-cell sequencing

Single-cell sequencing, led by tools developed by 10X Genomics and Akoya Biosciences, is enabling many new avenues in research and applications in drug development and diagnostics. Up to millions of cells can be manipulated and profiled to measure their heterogeneity and individual gene expression. The human genome has around 30K genes that produce over 100K mRNAs (splicing creates more variants). Each cell expresses around 10K genes with a few thousand having cell-specific patterns.

In 2003, the Human Genome Project was completed, enabling new applications, in particular genome-wide association studies (GWAS). The goal of this work was to build a database of genetic variants and link them to different phenotypes and disease. Due to the underlying complexity of biology, GWAS never fulfilled the promise of genomics. Large-scale sequencing efforts, driven by Illumina and the super-exponential decrease in costs, helped increase access to data and scale up this work. The next step has been sequencing single cells to understand cell-to-cell variation, not just human-to-human, across genomes, epigenomes, transcriptomes, and proteomes. New tools are needed.

Beyond single-cell sequencing, we are also moving toward perturbing biological systems to measure changes within them. A catalog of genetic parts is being discovered, which is setting up large-scale perturbation studies of single cells. Coral Genomics is a leader here. This work could finally fulfill the original vision of the Human Genome Project: linking genetic variants to different traits and diseases.

What are some of the important problems to solve in single-cell sequencing?:

  1. Integrating new single-cell measurements and standardizing them between samples: DNA, RNA, proteins, methylation, chromatin accessibility, spatial arrangement

  2. Increasing sorting throughput/efficiency in single-cell sequencing. This would lead to large gains in resolution and the number of cells profiled. In some situations, due to these limitations, only a few hundred cells can be measured. Higher throughput would increase the number of cell types we discover along with their developmental trajectories.

  3. Better analysis and visualization tools to analyze higher dimensional data. It seems there is a Something-seq paper every day, and this work has created a tremendous amount of data. Sometimes the data is pretty noisy due to low capture rates, batch effects, PCR biases, and more. Moreover, better tools to automate cell-type annotation are needed.
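As a toy illustration of what automated cell-type annotation means in the third point, the sketch below scores each cell against marker gene sets and assigns the best-scoring label. It is a deliberately naive sketch: the `MARKERS` table, the counts, the `min_score` threshold, and the `annotate` helper are all hypothetical, and production tools work on normalized data with curated references and have to handle the noise sources listed above.

```python
# Hypothetical marker sets; real annotation relies on curated reference atlases.
MARKERS = {
    "inflammatory_monocyte": ["CCR2"],
    "resident_monocyte": ["CX3CR1"],
}


def annotate(cell_counts, markers, min_score=1.0):
    """Assign the label whose marker genes have the highest mean count.

    Cells whose best score falls below min_score are labeled "unknown".
    """
    best_label, best_score = "unknown", 0.0
    for label, genes in markers.items():
        score = sum(cell_counts.get(g, 0) for g in genes) / len(genes)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= min_score else "unknown"


# Made-up raw counts per gene for three cells.
cells = {
    "cell_1": {"CCR2": 9, "CX3CR1": 1},
    "cell_2": {"CCR2": 0, "CX3CR1": 7},
    "cell_3": {"ACTB": 20},
}
labels = {cell_id: annotate(counts, MARKERS) for cell_id, counts in cells.items()}
print(labels)
```

Even this toy version shows why the problem is hard: the result depends entirely on the marker table and the threshold, both of which are judgment calls today.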

Single-cell sequencing is now building a parts list of individual cells. We previously had a parts list for individual humans, but it did not fully recapitulate the complexity of biology. Single-cell probably doesn’t create a full picture by itself, but it is a major step forward in understanding biology and disease. For example, the human kidney, interactions between immune cells and antigens, tumor microenvironments, and more have been more accurately profiled because of these tools. With a better mechanistic understanding, we have a better shot at creating better medicines and products for human health.

In short, the main question in single-cell sequencing is - what does 10X or Akoya enable? 10X dominates the R&D part of the market and Akoya dominates clinical applications. Both companies are in an incredible market position and would be hard to usurp. As a result, new companies can build on top of them with new applications in oncology, autoimmunity, neurodegeneration, and more. New immune cell profiling companies are springing up to find novel mechanisms to pursue in immuno-oncology and autoimmunity. 23andMe could even start integrating single-cell sequencing into their product line. New diagnostics can be created along with more accurate organoid models. Finding new insights from the spatial arrangements within cells is a new frontier. GWAS showed the power of new data helping generate better hypotheses. Single-cell sequencing is some orders of magnitude better in terms of scale and resolution. With this new data, inventors and founders should be able to ask better questions to understand biology and disease.

Axial - Veeva

Surveying great inventors and businesses

Axial is really excited to announce our new podcast and first episode with the founders of Unnatural Products, Cameron Pye and Joshua Schwochert on Beyond Undruggable: The Future of Drug Development: https://anchor.fm/axial Be sure to follow us wherever you listen to podcasts as we have an all-star lineup of founders and inventors coming up on the Axial Podcast.

Also, check out the BioBuilder Jobs Board Axial recently launched: https://pallet.xyz/list/axial/jobs If you are a founder or working at a life sciences company, feel free to post any open job postings you have. If you're looking for a job, check out the jobs board and apply to opportunities that would be a fit. Really excited to see this grow and become more useful to everyone.


Veeva was founded in 2007 by Peter Gassner and Matt Wallach to make life sciences more efficient and bring new medicines to patients. The company is one of the most important case studies on how to build a standalone software business in life sciences. Focused on managing clinical data and communications for life sciences companies, Veeva pioneered the vertical Software-as-a-Service (SaaS) business model by building out one of the first industry-specific clouds. 

Peter had previously worked at PeopleSoft and Salesforce before starting Veeva. Matt had built the life sciences business at Siebel Systems, which was acquired by Oracle. And actually, a lot of Veeva’s early customers came through Matt’s network. Furthermore, Veeva’s first product was built on top of Salesforce. Veeva had both lower upfront product development and customer acquisition costs, which enabled the business to scale pretty efficiently. Moreover, these experiences at 3 iconic software companies set the initial conditions for the company: Salesforce was the first pioneer in SaaS and while there, Peter had the idea to make a cloud product specific to an industry. At the time of founding, SaaS was a somewhat controversial business model. A vertical SaaS business was even more controversial and seen as pursuing markets too small to build anything venture-scale. However, Peter had the perfect personality to build Veeva since he never liked to follow the herd. With high regulatory requirements and operating needs and a large product revenue, the life sciences market was a perfect fit for a vertical SaaS product and was still open for Veeva to take over with their CRM product before Salesforce could.

With Veeva, Peter was focused on building a lasting company, and a lasting company “needs a good profit margin.” So he set running a profitable business as an immutable parameter and built around it. Veeva raised $7M in venture capital, only using $3M of it, before going public. This scarcity of capital focused the company across sales, deal making, and team building. Early in Veeva’s life, Peter set only quarterly goals for the first few years, then graduated to annual and then 5-year goals as the company grew. Throughout this adventure, Peter was pressured to spend more, but he took a conservative approach to minimize dilution and keep Veeva focused.

The life sciences market has a lot of high value customers since they are using Veeva to manage clinical trials for potential billion-dollar drugs. So Veeva’s first customers actually helped fund the company and gave Veeva the ability to generate enough cash flow to make the business sustainable pretty quickly. Peter made sure Veeva structured deals clearly with their customers, laying out what is expected from both the product and the customer. This is reflected in the values that have driven their decision making since founding:

  1. Be frugal with capital and selective with who is hired

  2. Get the product out quick

  3. Sell product at a good price to as big of a customer as possible

  4. Don’t give away professional services (often services are not profitable for most SaaS companies). For Peter, a product comes with a certain amount of services like customer support, training, among others.

These values drove long-term growth for the company. With an addressable market that has customers numbering in the hundreds, not thousands or tens of thousands, each relationship for Veeva matters. For example, Peter has made sure to structure Veeva’s sales team to not maximize topline revenue in the short run. Veeva makes sure not to put too many sales reps in the field or over-cover an area or set of customers with a rep. This is all to make sure the customer doesn’t feel like they are being squeezed to buy more software or services than they need. The company plants seeds to generate sales 4-5 years down-the-line. Things like this weren’t all figured out on day one. Peter and the team had to learn along the way, but their values were the guiding light. 

Execution has been the most important part of Veeva (i.e. “Execution is enduring”). Veeva built a foundational software product for life sciences by putting the customer at the center. With over 50% market share in the life sciences customer relationship management (CRM) market, Veeva has proven out the potential of building a vertical SaaS business in life sciences. However, Veeva is working to expand into other industries like cosmetics and industrial chemicals as well as developing other product suites for their life sciences customers. For Peter, “a company that can make its second act bigger than its first act has some long-term ability to reinvent itself.” On this theme, Peter makes it a habit to always think about what he calls the adjacent possible: every 3 weeks or so he engages with someone building software in another industry, a doctor, or someone else adjacent to Veeva’s focus. This helps Peter to avoid becoming myopic, bridge new ideas, and cross-pollinate. I am sure this has helped Veeva to expand its purview and not become stale. This focus on the customer and always staying on the cutting edge has made Veeva one of the best case studies on how to build a large and scalable software business in life sciences.

Key findings

  1. Veeva has built the leading end-to-end software product in life sciences. They are helping make drug development global and removing a lot of operational barriers for a drug to get from the lab to the patient. Veeva is expanding into data and patient relationships and has the potential to build out pretty strong multi-side network effects.

  2. An important part of Veeva’s success was its sales strategy. Because it had a relatively smaller number of customers to sell to, each relationship mattered a lot. The company relied on reference selling where Veeva’s key customers kickstart the sales process to other potential ones.

  3. Veeva used a reverse churn strategy to expand their addressable market. With hundreds of customers, the company had to figure out a way to upsell them without deteriorating their brand. The first step to have a shot at reverse churn is getting lock-in. Veeva does this with integrations, data, and services to increase switching costs for the customer.

  4. Software companies in life sciences will meet at a Milvian Bridge as their product offerings increasingly overlap. Veeva shows that software in highly regulated industries requires different growth strategies. The company used the Salesforce CRM platform and prior relationships to execute a brilliant top-down strategy. There are now developer-driven strategies and increasingly fintech ones to get users. And as the physical world of the lab converges with software, simulated environments may increasingly become a standard. As a result, interoperability, software persistence, and scale across life sciences and healthcare become much more important.

Technology

Veeva’s first product (Commercial Cloud) was actually built on top of Salesforce. Veeva pays Salesforce a per-seat licensing fee with a ~15% royalty, and a non-compete agreement ensures Veeva can’t sell its CRM product outside life sciences. This type of arrangement helped Veeva efficiently launch their first commercial product. From day one, this type of deal has a large impact on gross margins. But for Veeva this was acceptable to get a faster go-to-market (GTM) and lower initial R&D spend. Veeva started building up product moats by customizing the Salesforce CRM for life sciences companies to manage clinical data from regulatory submissions through commercialization.

So the Veeva Commercial Cloud is essentially Salesforce with add-ons built on top. Veeva adds pricing power and a moat with the modules it builds on top of Salesforce rather than the core software itself. More often than not, a new Veeva module (at least 9 right now) for Commercial Cloud can add roughly 10% to 20% more revenue for the product. Think of it as a suite of tools to help drug developers manage their interactions with clinicians, sales representatives, and a lot more. For example, if you’re a sales representative at a large biopharma company, Veeva’s CRM helps you manage your contacts on the clinician side, from making appointments and having clinical trial results to show during visits to engaging through automatic emails. Similarly, the CRM helps people doing clinical research manage patient relationships and data. So the software platform spans everything from marketing and sales to collaboration and data management. This CRM forms the basis of Vault, Veeva’s 2nd and larger product.

Now, Commercial Cloud represents a little under half of Veeva’s revenue while Vault makes up over 50%. This is pretty rare - a company’s second product that is larger than its first. Roughly, Commercial Cloud dominates late-stage/post-market activity, while Vault is the standard for earlier stage work. The major inflection point for Veeva was around 2010 when they decided they wanted to become a multi-product company. Vault, launched in 2012, is a content management system that helps users organize their various documents and workflows from clinical trials to manufacturing and sales. And Veeva built their own infrastructure for this. The CRM helped enable Vault by allowing Veeva to use a reverse churn strategy and upsell existing customers rather than solely find new ones. Moreover, Veeva’s initial success with its CRM was driven by replacement cycles for Siebel systems and a general move to the cloud. Similarly, Vault just replaced a lot of the old content management systems life sciences companies were using.

Content management was a market with a lot of single-client hosted products that were often on-premise. Vault was built to be a unified cloud version. Veeva generates pricing power here by offering individual applications (i.e. regulatory, marketing) for Vault or an all-in-one pack:

  • Commercial Content Management - puts every document within a life sciences company in one place and allows the user to manage which people get access to certain data and have a lot more visibility on various processes from trials to sales and manufacturing.

  • Development Cloud - allows the user to build new applications like standard operating procedures (SOP) to new manufacturing protocols.

  • Safety - helps users maintain compliance across their product development cycles and ensure pharmacovigilance.

  • Medical Device Suite - focused tools for medical device development, quality control, and commercial rollouts.

Vault’s subscription revenue is well over $500M growing >40% over the last 4 years. Veeva took a similar strategy, as with its CRM, to use modules to increase Vault’s growth. Without paying Salesforce a royalty fee and as Vault becomes a larger part of Veeva’s business, their margins ought to increase significantly. Ultimately, Vault allowed Veeva to expand the universe of companies it can sell to. With Commercial Cloud, Veeva could sell to the 500 or so life sciences companies with a commercial product. Vault allowed the company to sell to well over 10,000 life sciences companies in earlier stages of R&D and product development. The company went public in 2013 and since then Vault has become its crown jewel. Around 2016/2017, Veeva expanded its purview to help patients too: finding the right clinical trial and driving new medicines forward.
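To make the margin argument concrete, here is a minimal sketch of how the royalty weighs less on the business as the revenue mix shifts. It assumes the ~15% royalty applies only to Commercial Cloud (CRM) revenue and that nothing else in the cost structure changes; the revenue shares in the loop are hypothetical, not Veeva's reported figures, and the function name is illustrative.

```python
ROYALTY_RATE = 0.15  # ~15% of CRM revenue, per the text


def royalty_drag(crm_share: float, royalty_rate: float = ROYALTY_RATE) -> float:
    """Royalty paid to the platform owner, as a fraction of total revenue."""
    if not 0.0 <= crm_share <= 1.0:
        raise ValueError("crm_share must be between 0 and 1")
    return crm_share * royalty_rate


# Hypothetical revenue mixes: the CRM's share of total revenue falling over time.
for crm_share in (1.0, 0.75, 0.5, 0.25):
    drag = royalty_drag(crm_share)
    print(f"CRM share {crm_share:.0%}: royalty costs {drag:.2%} of total revenue")
```

Under these assumptions, every point of revenue that moves from the CRM to Vault removes 0.15 points of royalty cost from the blended business.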

Veeva uses its CRM and Vault product as a platform for next-generation products. Right now, its focus is on data and patient-facing products. Veeva has built out Network, OpenData, and Data Cloud to manage master data, connect to reference data, and provide access to patient/prescriber data, respectively:

  • Network - a centralized hub for customer data.

  • OpenData - connects partners with a validated reference data set on patients, clinicians, and more.

  • Data Cloud - a large dataset of patients and drug prescribers in the US; Veeva makes this data available to drug companies. Recent M&A from Veeva buying Crossix Solutions and Physicians World supports their strategy to become dominant in offering data to their customers.

Veeva is also moving toward engaging the patient. MyVeeva is a platform to connect physicians and patients with life sciences companies. For physicians, the application allows them to more easily communicate with trial sites, healthcare liaisons, and sales people. On the patient side, MyVeeva allows them to more easily connect with clinical trials and complete consent forms. This all leads to Veeva gaining more momentum in decentralized clinical trials and expanding their customers’ ability to engage patients and sell to healthcare professionals. From Vault to these new products, Veeva has focused on reverse churn and building a fully-integrated product suite in order to build substantial moats across the entire user experience.

Veeva, the founders, and the entire team have done an incredible job to build enduring software products in life sciences. There are some lessons that can’t be replicated and a few that can:

  1. Lower technology development costs - Veeva handed over ~15% of its CRM revenue to build its life sciences applications on Salesforce’s platform. This helped reduce the capital required for Veeva to launch. There might be opportunities to do something similar today on top of a platform like Snowflake.

  2. Lower customer acquisition costs - most of Veeva’s early sales wins came from Peter and Matt’s network. They were essentially replacing older on-premise software from Siebel/Oracle and IMS Health with Commercial Cloud.

  3. The move to the cloud - a major tailwind for Veeva over the last decade and more has been getting life sciences companies off client-side software and into cloud applications.

  4. Bundling software to gain pricing power - in life sciences, content management and approval were separate from customer relationship management. Veeva has built out Vault and its CRM and removed a bunch of friction and manual processes in drug development.

  5. Built a moat around the services - beyond software, a major advantage for Veeva has been the consulting and business services wrapped around Commercial Cloud and Vault. The company has dedicated teams to help customers build extra functionality on top of Veeva’s platform focusing on commercial applications like GTM strategies and salesforce optimization.

Veeva has built the leading end-to-end software product in life sciences. They are helping make drug development global and removing a lot of operational barriers for a drug to get from the lab to the patient. Veeva is expanding into data and patient relationships and has the potential to build out pretty strong multi-side network effects.

Market

Veeva has been very useful to me as a general way to measure the number of companies in life sciences and the size of the software market in the vertical. As of the beginning of 2021, Veeva had 993 customers ranging from Gilead and Merck to Alnylam and Moderna. Across their two main product lines, Commercial Cloud and Vault, Veeva generates around $1.17M in subscription revenue per customer (an outdated image below shows how much of an outlier Veeva is compared to other SaaS companies). Commercial Cloud has around 432 customers or more, and Vault has 852. In 2014, Veeva estimated that the addressable market for their software products was around $2B for both its CRM and Vault. In 2021, the company estimated that the market size for Commercial Cloud had increased to ~$3B and Vault to $5B. Overall, well over $100B is spent per year on clinical trials.
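As a rough check on these figures, the per-customer average and the customer counts imply a total subscription revenue and a minimum overlap between the two product lines. This is a back-of-the-envelope sketch that uses only the numbers quoted above; the function names are illustrative, and because the Commercial Cloud count is approximate, the overlap should be read as a lower bound.

```python
def implied_subscription_revenue(customers: int, revenue_per_customer_musd: float) -> float:
    """Total subscription revenue in $M implied by a per-customer average."""
    return customers * revenue_per_customer_musd


def minimum_overlap(total: int, product_a: int, product_b: int) -> int:
    """Inclusion-exclusion: the fewest customers that must be using both products."""
    return max(0, product_a + product_b - total)


total_customers = 993      # start of 2021, per the text
commercial_cloud = 432     # approximate in the text
vault = 852

revenue_musd = implied_subscription_revenue(total_customers, 1.17)
both = minimum_overlap(total_customers, commercial_cloud, vault)

print(f"Implied subscription revenue: ~${revenue_musd:,.0f}M")
print(f"Customers on both product lines: at least {both}")
```

The two product counts sum to more than the total customer count, so a meaningful share of customers must already be buying both, which is consistent with the upsell strategy described elsewhere in this piece.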

Veeva has dominated both the CRM and content management markets in life sciences (with well over 50% market share) by “[making] sure the market is clear then [figuring] out if it's correct later on.” In short, Veeva focuses on markets where there is a lot of spend. For clinical trials, it’s over $100B. Elsewhere in life sciences, over $4B is spent on pre-clinical research along with a ~$1T chemicals market and tens of billions of dollars spent on biomanufacturing annually. Veeva’s focus on a specific vertical (life sciences) narrowed down the number of customers they can sell to - hundreds/thousands versus 10,000s for generalist SaaS companies. An important feature of the life sciences market that helped Veeva is how concentrated the market is: the top 50 companies account for over 70% of sales. The smaller market allowed Veeva to build products that embrace the complexity and regulatory requirements of drug development, which others like Salesforce initially avoided in order to grow in less burdensome markets.

An important part of Veeva’s success was its sales strategy. Because it had a relatively smaller number of customers to sell to, each relationship mattered a lot. The company relied on reference selling where Veeva’s key customers kickstart the sales process to other potential ones. So for a given product line, Veeva starts off with a smaller number of early adopters. Once these customers are successful, Veeva converts them into vocal advocates mainly by using their success as a case study for others. This is connected to how Veeva structures its sales team. The company has to ensure it doesn’t oversaturate their customers with sales pitches at risk of alienating their already small customer base. The overarching theme here is “engaged teams working together”:

  1. Sales are not measured quarterly at Veeva for things like bonuses. There is a focus on hiring to set up sales down-the-line. But this type of approach requires disciplined product planning, and with Peter being Swiss-American, he makes sure the trains run on time at Veeva. Historically, the company has been great at knowing where their products are on the adoption curve. For example, the CRM has over 80% penetration for biopharma whereas MyVeeva is still early. This influences how Veeva incentivizes sales and structures teams.

  2. Trusting other teams within Veeva, which is very hard for any company at scale. Their sales incentives play a big role in the organization’s cohesiveness.

  3. The sales team focuses on early adopters within a company in order to find a champion and set up their reference sales model (i.e. get early adopters and spread success stories)

  4. Having a bias toward organic growth.

  5. Spending money like it’s your own - for example, Veeva has no travel expense policy. Peter flies coach, which saves Veeva a few million dollars per year, but others aren’t forced to do the same. The founder is just leading by example.

These types of values all increase the stature of Veeva’s brand in the eyes of its customers. So in life sciences with large markets but a smaller customer base, trust and branding are way more important. Veeva is now exploring new markets beyond drug development towards consumer goods, chemicals, and cosmetics. As the life sciences ecosystem becomes larger and more diverse (i.e. emergence of synthetic biology), more pure-play software companies in the field might become venture scale.

Business model

Veeva was one of the first vertical SaaS companies. Salesforce was the pioneer for SaaS and paved the way for the business model shift from perpetual software licenses to subscriptions. But Veeva extended this SaaS model to a specific industry, beating Salesforce to the life sciences. In the early-to-mid 2000s, most SaaS companies were focused on problems across multiple industries in order to maximize total addressable market. At the time, focusing on one market seemed a little unwise. Would customers be willing to pay a premium for software? How much? Despite the small customer base for a vertical SaaS business, there are major advantages in higher sales efficiencies by going to the same customer and selling them more over time:

  • Lower customer acquisition costs - Veeva’s sales and marketing team can tailor their story to specific customers unlike other SaaS companies. This can lead to lower costs overall and higher conversion rates. Sales and marketing represents around 20% of Veeva’s sales whereas other SaaS companies range from 30%-50%.

  • Deeper product development - Veeva can also invest more resources into building software with functionality specific to life sciences. Veeva builds software for an industry with high regulatory requirements, everything from QA to clinical data and sales engagement. Sometimes the product needs to be customized for one customer; this is where services become a major advantage. Also, vertical SaaS actually enables faster product development and allows a company to build product moats through add-ons and integrations creating up/cross-selling opportunities and increasing switching costs.

  • Knowing your customer better - having fewer customers to sell to allows Veeva to really get to know them versus the comparable, horizontal SaaS company.

  • Very high capital efficiency - Veeva is an outlier in terms of LTV/CAC. With lower CAC and a stickier product (increases LTV), Veeva compensates for a smaller customer base with a highly efficient business model. Veeva’s CAC has historically been between 0.8 and 1.2 whereas other SaaS companies are often at 1.5 and up to 3. Veeva has been so successful here due to its reference selling model and focus on life sciences. For vertical SaaS companies, unit economics becomes a much more important metric than sales growth. In short, every customer counts in vertical SaaS. Unit economic metrics are invariant to time so it’s really important to think about how something like CAC or LTV (and their various sub-components) scales. Hiring a salesforce scales a lot differently than spending money on Instagram ads, but both contribute to CAC. For markets that have fewer customers, like life sciences, CAC will quickly increase as spending goes up. As a result, branding and how your customer perceives your product is probably the most important part of any vertical SaaS company.
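The LTV/CAC comparison above can be sketched with a common simplified definition. The text does not say how Veeva's figures were computed, so the formulas here are an assumption: LTV as yearly gross profit per customer divided by yearly churn, and CAC as sales and marketing spend per new customer. Every input below is hypothetical and chosen only to contrast a sticky vertical product with a horizontal one.

```python
def lifetime_value(annual_revenue: float, gross_margin: float, annual_churn: float) -> float:
    """Simplified LTV: yearly gross profit per customer divided by yearly churn rate."""
    if annual_churn <= 0:
        raise ValueError("annual_churn must be positive for this simple formula")
    return annual_revenue * gross_margin / annual_churn


def customer_acquisition_cost(sales_marketing_spend: float, new_customers: int) -> float:
    """CAC: sales and marketing spend per newly acquired customer."""
    if new_customers <= 0:
        raise ValueError("new_customers must be positive")
    return sales_marketing_spend / new_customers


# Hypothetical inputs; none of these are Veeva's reported figures.
vertical = lifetime_value(1_000_000, 0.70, 0.05) / customer_acquisition_cost(20_000_000, 50)
horizontal = lifetime_value(50_000, 0.70, 0.20) / customer_acquisition_cost(20_000_000, 2_000)

print(f"Vertical LTV/CAC:   {vertical:.1f}")
print(f"Horizontal LTV/CAC: {horizontal:.1f}")
```

In this sketch the vertical company spends far more to win each customer, yet larger contracts and lower churn can still leave it with the better ratio, which is the trade described above.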

Vertical SaaS is risky because a company is putting all their eggs in one basket. You have to be really good at picking the right markets (i.e. clear then correct). The market chosen might slow down in growth or there could be some sort of regulatory change that makes product development harder.

SaaS has been the major shift in how software is distributed over the last decade or so. What’s next on the horizon for companies like Veeva? Developer-led sales (i.e. bottom-up), pioneered by companies like Twilio, have been a major shift. Benchling has led the way to execute this strategy in life sciences. There’s a growing opportunity to combine the increasing power of fintech with life sciences. A company like Veeva does pretty well selling its CRM, content management system, among other products. However, in life sciences, as in any industry, there is a much larger market in financial services and other types of transactions. Vertical SaaS companies are often tightly integrated enough with their customers to expand into things like equipment purchases, lending, payments, and even payroll. In life sciences, there are opportunities to build software for biobonds, services payments, drug pricing, and a lot more. Veeva’s focus on life sciences allowed the company to be much more efficient and more quickly broaden its product portfolio. Ultimately, Veeva focuses all its resources toward a homogeneous customer base with the same problems and once they solve them for one, everyone else comes to Veeva and they become the standard.

Veeva used a reverse churn strategy to expand their addressable market. With hundreds of customers, the company had to figure out a way to upsell them without deteriorating their brand. The first step to have a shot at reverse churn is getting lock-in. Veeva does this with integrations, data, and services to increase switching costs for the customer. On a side note, Oracle and DB2 are the epitome of switching costs. Veeva takes a less extreme approach, but once a customer is using one of Veeva’s products, they generate data on the software platform that becomes increasingly hard to transfer over to another CRM or software vendor. But data is probably the weakest strategy because at the end of the day, a SaaS application is at its core just a relational database and a website that presents the data. Most database architectures are the same so it’s hard but not impossible for a customer to transfer data (i.e. ETL).

The modules Veeva builds on top of Commercial Cloud and Vault create closer integrations with the customer, while services customize the product and add sunk costs through training and time spent learning Veeva. The company is also building out products with network effects that can further increase its lock-in. For example, Site Connect helps clinical trial sites and sponsors share data. As this network grows, the platform can become the standard for any clinical trial, but it is still early days here. With a lot of customer data, Veeva is in a great position to build new networks in life sciences, think LinkedIn for drug developers or an ecosystem similar to Apple’s App Store, and forge long-lasting barriers to entry. With a smaller market, vertical SaaS requires a differentiated strategy, especially on GTM, in order to win a larger share of a relatively smaller market.

Veeva uses software add-ons and services to build a powerful moat around its business. The company has a pretty high retention rate (over 120%) versus other SaaS companies due to Veeva’s ability to create new modules on top of their platform. In life sciences, this type of approach makes sense because companies’ needs evolve as they go through the life cycle of early stage discovery to the clinic and finally commercial. This gives Veeva a lot of opportunities to sell into an organization - they can sell Vault to a company that is just starting up a clinical trial or push its CRM to a commercial-stage company. For Veeva, services are a first-class citizen: their customers “like [their] products, [but] love [their] people.” Veeva builds out software similar to Salesforce or Google, but also has built out an underlying product offering with similarities to Accenture. In short, Veeva builds great software and uses services to cross-pollinate across companies and create lock-in.
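A retention rate above 100% only makes sense as net revenue retention, where upsell to existing customers outweighs what is lost to downgrades and cancellations; that is the reverse churn idea in numbers. The sketch below uses the standard cohort definition, which is an assumption here since the text does not define the metric, and the cohort figures are hypothetical.

```python
def net_revenue_retention(starting_arr: float, expansion: float,
                          contraction: float, churned: float) -> float:
    """Revenue from a fixed customer cohort one year later, relative to its starting revenue."""
    if starting_arr <= 0:
        raise ValueError("starting_arr must be positive")
    return (starting_arr + expansion - contraction - churned) / starting_arr


# Hypothetical cohort, in $M of annual recurring revenue.
nrr = net_revenue_retention(starting_arr=100.0, expansion=27.0, contraction=2.0, churned=3.0)
print(f"Net revenue retention: {nrr:.0%}")
```

With these inputs the cohort loses $5M to downgrades and cancellations but gains $27M from new modules and seats, so it ends the year larger than it started without a single new customer.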

Veeva might be impossible to replicate. They had great timing with SaaS just starting off when the company was founded along with the tectonic shift to the cloud as a major tailwind. The founders’ relationships with Salesforce and other companies were also an important initial advantage, and ultimately, most software companies can’t charge over $1M per customer. There are opportunities in bottom-up approaches in life sciences, merging fintech with life sciences, and using a large user base to build new, downstream applications. Most pure-play software companies in life sciences may not be venture-scale due to the smaller number of customers. Veeva is probably the exception. But there are other pathways to use software as an on-ramp to expand into products and business models.

What other opportunities are there for SaaS companies in life sciences? There are many different types of software companies that can be built here:

  • Incumbents (to be disrupted or partnered with): ArisGlobal, Bioclinica, Medidata, PerkinElmer, IQVIA. In particular, IQVIA has a pretty expansive product offering as a result of its creation from the merger of IMS Health (clinical data) and Quintiles (a CRO) in 2016. And the company makes it pretty difficult to port datasets into other software platforms. Veeva is actually in a suit with IQVIA over this.

  • Experimental design and execution: Benchling, Synthace, TetraScience, Elemental Machines, Labstep. The beachhead is design with the long-term ability to start automating the lab.

  • Data management: LatchBio, EtaBio, ScienceIO, SevenBridges, Elucidata. Files in life sciences are pretty large. We need better tools to query and analyze them.

  • Clinical trials: Tilda Research, TrialSpark, Unlearn, Science 37, Faro Health. Some important themes here are standardizing sites with Tilda leading the way, new trial designs like synthetic controls and adaptive trials, and patient consented data to more easily scale trials.

  • Data brokers: Evidation Health, Hume AI, H1 Insights, Komodo Health, Change Healthcare, Datavant, Definitive Healthcare, and even companies like 23andMe, Omica, and 54Gene. Evidation’s recent deal with Merck highlights the power of a data platform to enable new treatments and products in life sciences. Komodo is very fascinating too because they purchase medical claims and patient data and repackage it into new products focusing on recommendations for their customers. The important part of these products is de-identifying and standardizing the data. Even large tech companies have done interesting work here with Apple’s ResearchKit to enable the use of Apple products in health studies and Verily’s Project Baseline that is creating a very large dataset on patients across time.

  • Marketplaces: BenchSci, Abcam, Science Exchange, Knowde. This type of model is a strong way to monetize transactional volume in life sciences instead of charging a subscription fee for a piece of software.

  • Clinical tools: SOPHiA Genetics, Sema4, SolveBio, Viz.ai, PathAI. SOPHiA Genetics may be the best success story for a SaaS business in genomic analysis and deserves its own case study; I would recommend reading their F-1. Showing clinical utility is a high bar along with the long sales cycles into hospitals and healthcare networks.

Some ideas for pure-play software companies in life sciences (email us if you’re a founder or a builder interested) are:

  1. Software for biomanufacturing. Medicines and biological products are becoming increasingly similar to rocket engines and we need better software to aggregate suppliers and components for things like cell and gene therapies.

  2. Drivers to connect lab devices with software. Synthace has shown the way but this is a large opportunity to standardize data regardless of the OEM.

  3. Build a life sciences company on top of Snowflake similar to building on top of Salesforce to create high-performance computing applications for life sciences. There’s also an opportunity to build a life sciences company on top of Domino Data Lab (thanks Leon) and even a data store like Veeva.

  4. Common interfaces and user experiences especially for open source software in life sciences.

  5. New clinical trial protocol designs. Unlearn has led the way here but there are so many more opportunities.

  6. A calendar for experiments. Some sort of scheduler for the lab and even clinical trials can automate a lot of the upfront planning currently done. Veeva, Benchling, among others are doing this in some way but there might be a window for a standalone product here. Think Zapier for the life sciences lab.

Software companies in life sciences will meet at a Milvian Bridge as their product offerings increasingly overlap. Veeva shows that software in highly regulated industries requires different growth strategies. The company used the Salesforce CRM platform and prior relationships to execute a brilliant top-down strategy. There are now developer-driven strategies and increasingly fintech ones to get users. And as the physical world of the lab converges with software, simulated environments may increasingly become a standard. As a result, interoperability, software persistence, and scale across life sciences and healthcare become much more important in this scenario.

Veeva is likely a major exception in life sciences. A few rules might be:

  1. Never get involved in a land war in Asia.

  2. Never go in against a Sicilian when death is on the line.

  3. Never build a standalone software company in life sciences.

But there are always a lot of interesting opportunities in the long tail. Narrowly-targeted software and products can create very efficient businesses. In some of these niches software might be venture-scale, and even if it isn’t, it can be a user on-ramp to something larger and more meaningful.

Special thanks to Leon Furchtgott as well as others kind enough to review, offer ideas, and provide feedback on this piece.

Axial - Recursion Pharmaceuticals

Surveying great inventors and businesses


Recursion is one of the canonical AI drug discovery businesses. Founded in 2013 by Chris Gibson, Blake Borgeson, and Dean Li, the company spun out of the University of Utah to industrialize drug development. In conjunction with “The Business of AI in Life Sciences,” this case study is focused on understanding the key drivers for Recursion’s success and the long-term potential of their platform and business model.

An important theme for Recursion that shouldn’t be discounted is being in the right place at the right time. Gibson was an MD/PhD student at the University of Utah where he did his research on cerebral cavernous malformations (CCM). The only reason he moved out to Utah was his wife doing her medical residency in Salt Lake City. Chris then joined the lab of Dean Li, who is well known for world-class medical research and had already spun off several companies. Coincidentally, Chris’ college buddy Blake was already running a pretty successful Internet company while he was getting his PhD in bioinformatics. Chris and Blake met while at Rice University. The former was studying bioengineering and the latter, electrical engineering. Before starting Recursion, Blake actually founded and still runs a pretty successful online business: BuildASign.com. In no small part, this type of entrepreneurial experience was pivotal in helping the company scale quickly; Blake was essential for implementing the core infrastructure that powers Recursion.

Moreover, the company was founded just in time to benefit from major breakthroughs in AI with ImageNet coming out in 2012 and TensorFlow and AlphaGo in 2015. I remember talking to Blake in early 2015. I was in Seattle interviewing for grad school at UoW, and what struck me the most was the company’s focus on perfecting image analysis. I ended up going to Berkeley for grad school. Recursion ended up scaling their platform pretty successfully. Their biological hypothesis was that unbiased models of diseases with hundreds to thousands of parameters can scalably discover new drug candidates. By relying on images, Recursion can then use the same assay for each disease and screen for drugs that make cells look more “normal.”

CCM was a perfect fit for this type of approach given that cellular models showed significant morphological changes. This motivated the initial project to screen compounds that can rescue this phenotype. With this premise, Recursion’s platform is focused on scaling the number of cellular parameters they can detect across as many models as possible. Recursion’s disease purview is only limited by their ability to generate a screenable model. A key initial driver for the company’s success was in-licensing the compound from Chris’ CCM research while at Utah as its lead. While the drug candidate didn’t necessarily come from the platform, having an early pipeline helped with a wide range of activities, mainly fundraising.

The company’s business model initially focused on rare, genetic diseases and drug repurposing. Recursion could more easily generate accurate cell models for rare diseases like NF2 and GM2 gangliosidosis. Over time, the company moved beyond repurposing into internal drug discovery and partnerships (repurposing often isn’t venture-scale). Now, Recursion is in a position to spin out companies and get more involved at the intersection of financial engineering and drug development.

Recursion recently went public. Congratulations to the founders and everyone on the team. This is the link to their S-1. Chris and Blake’s ability to build a founder-driven life sciences company has been inspiring. They also showed you can build a venture-backed biotech company outside a hub like San Francisco or Boston. By building their company in Salt Lake City, Utah, Chris and Blake had a near monopoly on talent. They also have a haven where they can craft a unique culture that merges biology with software: from the S-1, 40% of Recursion’s employees are biologists and chemists and 35% are software engineers and data scientists. This is pretty rare for any life sciences company.

Some key themes for Recursion have been:

  1. An intense focus on building the infrastructure to generate large amounts of unbiased biological data. This requires substantial upfront investments without immediate payoff but has set up Recursion to generate a wide range of leads and garner more partnerships down-the-line.

  2. Building truly interdisciplinary teams. This is highlighted by the company’s biology/software employee split. Blake as a founder and engineer with coding experience was always very instrumental to attract and train world-class engineering talent.

  3. Focusing on diseases (i.e. rare) where their models were biologically relevant. Rather than pursue something like Alzheimer’s or Dry AMD from day one, Recursion built technical momentum by focusing on “lower hanging” fruit problems. This has set the company up to expand their purview to oncology and a lot of other diseases since their assay can be easily extended as long as they have an unbiased model.

Over the last 7-8 years, Recursion had to build out the technology platform and develop a pipeline of medicines. Their clinical results will be one of the first signals of artificial intelligence’s ability to make drug development more efficient and predictive. At the very least, AI allows a company to generate a large number of drug candidates. By working on industrializing drug discovery, the company is set up to become a market leader at the intersection of financial and technical infrastructure. By generating a larger number of leads, Recursion has the platform to one day securitize pipelines. They might end up being the first company to do so. It’s them or BridgeBio and maybe a few others. As a result, the next decade for Recursion will be driven by their clinical progress and march toward securitization.

~2014:

2018:

Key findings

  1. An important technical moat for Recursion is their cellular models for as many diseases as possible. This is one of their major bottlenecks for the applicability of their platform. In 2014, Chris stated that “maybe 25 to 50 percent” of diseases will be amenable to Recursion’s platform. I am sure that purview has increased.

  2. This platform gives Recursion the unique ability to design securitizable pipelines. With an excess of capital right now, there is an opportunity to securitize drug portfolios. BridgeBio is in the lead and has the best shot at introducing “biobonds.” Others like Cullinan and Centessa are emerging as well. And a key part of securitization is building a pipeline of uncorrelated assets. Recursion has the technology to generate a lot of leads themselves; others often have to go out to license or acquire assets. The company has a strong shot at then pooling these leads together while being thoughtful about correlations and portfolio/risk management.

  3. The morphological atlas Recursion has built out gives them a unique ability to get preclinical programs off the ground. Their disease purview is only limited by their cellular models. Long-term, the company ought to knock out every gene, using CRISPR, in the human genome and measure cellular changes across time. By making images as computable as the genome, Recursion has built a platform that can enable a unique business model.

  4. There are thousands of rare, genetic diseases that affect over 20M patients in the US. Globally, rare diseases start looking a lot less rare. For example, there are ~360K CCM patients in the US and parts of Europe. China and India have at least 1M, maybe 2M, CCM patients. Around 93% of these diseases do not have an FDA-approved treatment. Given the regulatory and clinical advantages, the market pull helped Recursion more easily translate their platform into a pipeline.

  5. The first couple years for Recursion were focused on getting the platform turned on. Five years after that the company implemented the platform to industrialize image analysis and lead discovery. This has set up Recursion to build a pretty unique business with a wide range of deal types and an ability to securitize entire drug portfolios. Just as much as Recursion is a pioneer in technically scaling drug development, the company can take the lead in financial scaling.

  6. Recursion has built out their own drug development assembly line (image below). Hopefully the company will bring new medicines to patients faster and at scale. And for their business model, they have an opportunity to bring a lot of innovation to clinical trials - how they are both executed and financed. One thing that has struck me about Recursion is how Chris always acted like the CEO of a public (or soon-to-be) company. From how he ran his board meetings (i.e. presenting ~100 page overview documents) to always expanding Recursion’s ambitions every year, Chris and the team have built one of the iconic AI drug discovery companies. Others like Insitro, Enable Medicine, and Exscientia are in the mix. The next ten years for Recursion and similar companies will be an opportunity to measure the clinical impact of AI. Given Recursion’s history, it is guaranteed that they will evolve their platform and business model to match their vision.

Technology

Recursion’s technology is focused on industrialization. So for drug development, that means automation, unbiased data, and relevant biological models. The first iteration of Recursion’s platform was based on the Cell Painting and CellProfiler tools developed by the Carpenter Lab. The former relies on a series of fluorescent dyes to measure the morphology of cells and their internal structures like organelles. You can screen drugs against cells and measure how cell morphology changes - does it revert back to “normal?” A lot of biological information can be captured with an image, and if analysis is done correctly, new features that humans often can’t see can be detected to home in on mechanism-of-action (MoA) and targets.

Images in biology are pretty powerful. Over 1,000 features can be measured, from cell size and organelle shape to intensity and a lot more, to understand the effects of a drug. Done at scale, this type of work can generate a large database of cell images and their corresponding treatments. Recursion has open sourced some of this data with RxRx. Analysis of these cellular images can be helpful to cluster drugs and discover new compounds with a MoA of interest. This type of unbiased approach is particularly powerful for diseases and pathways not very well understood as long as there is some sort of genetic/morphological component.
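As a toy illustration of what "measuring features from an image" means, the sketch below pulls a few shape and intensity features out of a tiny grayscale image of one cell. The function name, the threshold, and the feature set are assumptions made for this example; real pipelines such as CellProfiler extract far more features and are much more sophisticated than this.

```python
def extract_features(image, threshold):
    """Compute a few toy morphological features for a single segmented cell.

    image: 2D list of pixel intensities (hypothetical input format).
    threshold: pixels strictly above this value are treated as part of the cell.
    """
    pixels = [
        (r, c, value)
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if value > threshold
    ]
    if not pixels:
        raise ValueError("no pixels above threshold")

    rows = [r for r, _, _ in pixels]
    cols = [c for _, c, _ in pixels]
    height = max(rows) - min(rows) + 1  # bounding-box height in pixels
    width = max(cols) - min(cols) + 1   # bounding-box width in pixels
    area = len(pixels)                  # cell size as a pixel count

    return {
        "area": area,
        "mean_intensity": sum(v for _, _, v in pixels) / area,
        "aspect_ratio": width / height,
        "extent": area / (width * height),  # fraction of bounding box filled
    }


# A made-up 4x5 image with one bright object in the middle.
toy_image = [
    [0, 0, 0, 0, 0],
    [0, 5, 7, 0, 0],
    [0, 6, 8, 9, 0],
    [0, 0, 0, 0, 0],
]
features = extract_features(toy_image, threshold=1)
print(features)
```

Each cell becomes a row of numbers like this one, and a treatment becomes the aggregate of many such rows, which is what makes images searchable and comparable at scale.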

Recursion has done a great job at building up a world-class image analysis team. Maybe only Calico can rival their current capabilities. With the premise of using images to develop drugs, Recursion has a platform with a wide range of applications limited only by what one can image and measure. This is at least one lifetime’s worth of work. An image of a cell contains thousands, if not more, different features that can be fed into a machine learning model. This work can generate phenotypic fingerprints that can be used to infer the state of the cell. Other pieces of data in drug development and life sciences aren’t nearly as extensible. For example in chemistry, solubility and permeability data don’t generate nearly as many features. This is a core issue in AI: dimensionality of the dataset can limit the predictive power of a machine learning model. Moreover, data needs to be generated in a standardized way so it can be successfully interpreted by a model. Too much variation in public datasets or merged ones often leads teams down the wrong path. Recursion spent the capital to build up their own platform and bypass this problem. Overall, the company’s platform can be segmented into 6 major parts:

  1. ReChem - selection and design of chemical compounds

  2. ReScreen - workflow for complex screening experiments

  3. ReScreenRun - automate the screening of millions of compounds across many cellular disease models

  4. ReRun - phenotypic signatures (Recursion calls them phenoprints)

  5. ReAnalyze - measure a compound’s efficacy and toxicity

  6. RePredict - models compounds based on phenoprints, structure, and other features

The first step for Recursion’s platform is to generate a library of cellular models and drug candidates. A figure (below) from Recursion a few years ago does a good job at showing the dimensions along which their platform can grow: chemical matter, perturbations, and human cell types. From the S-1, Recursion has at least 36 different cell lines ranging from iPSC-derived and primary cells to hepatic progenitor cells, human colon adenocarcinoma, cardiomyocytes, and macrophages. Recursion started off with repurposed chemical matter because it’s a bit cheaper to acquire but now has their own internal library of compounds and a larger in silico database. Perturbations can be a lot of things from genetic knockouts to cytokine treatments.

An important technical moat for Recursion is their cellular models for as many diseases as possible. This is one of their major bottlenecks for the applicability of their platform. In 2014, Chris stated that “maybe 25 to 50 percent” of diseases will be amenable to Recursion’s platform. I am sure that purview has increased. All-in-all, Recursion is scaling an old technology to a new problem. Drug discovery first started by finding chemical matter that generates a useful effect like a biochemical change or a change in physical appearance. This work focuses on screening for phenotypes (and rescues) of interest. The power of genomics and structure-based drug design has revolutionized many parts of drug discovery by flipping this process to start at a particular target or pathway; however, phenotypic screening is still useful in some contexts where the underlying biology hasn’t been fully characterized. Recursion’s unbiased approach can be useful in places where target-centric hypotheses have not worked - rescuing a disease profile is a lot more scalable than screening against many targets. On a side point, phenotypic screening could bring transformative products to consumer products, among other markets.

In 2015, a team at Pfizer published a paper defining the key rules for a phenotypic screen:

  1. Selecting physiologically-relevant cell types and models. Examples are iPSCs, organoids, and even animal models. In vitro models offer higher throughput versus in vivo models that might have more accuracy. So far Recursion has focused on simpler, more scalable models, but I am sure that will change in the future.

  2. Designing assays that are relevant to a disease. This is where a substantial moat is built.

  3. Defining assay endpoints that are similar to clinical endpoints. This is focused on using biomarkers or disease signatures that are matched to clinical samples.

These types of screens can be divided into 2 major steps: (1) a simple primary assay to cover as much chemical matter as possible and (2) complex, physiologically-relevant models to home in on interesting hits. After the model is established, a large library of molecules is screened to measure something like expression changes in a panel of proteins or cellular characteristics like proliferation. Secondary assays are used mainly to counter screen and filter out molecules that have general effects. This is the part where phenotypic screening has higher upfront costs versus target-based programs. Recursion’s focus on automation gives them an important ability to reduce these costs. On top of this work, Recursion and other companies can filter hits more accurately with transcriptomics, proteomics, unsupervised machine learning, high-content imaging, and other tools within the assays themselves. Hits are then grouped into mechanism classes to prioritize them with a focus on target deconvolution. Even though many drugs have been approved without a known target, this knowledge is incredibly important to de-risk a program and much more accessible with the wide array of tools now available: panels of known target classes can be screened against, and activity-based protein profiling is very useful along with compound-immobilized beads, photoaffinity labeling, and cellular thermal shift assays. Over time, Recursion’s success will be partly driven by their ability to perfect assays that can work for a large class of human diseases.

After the experimental infrastructure is set up, Recursion can screen across many phenotypes. With a toolkit to screen a large number of compounds across many cell types and perturbations, where do you start? In 2017, Recursion ran 2.2M experiments generating 0.5 PB of data across 7 cell types using a library with 3000 compounds; the cost per experiment (CPE) was $0.63. A key metric for Recursion’s technology is this cost. In 2018, the CPE was $0.45. In 2019, it was $0.36. In 2020, it has gone down to $0.33. Over 4 years, the company drove down their platform costs by well over 40%. At a certain point, other companies might have a hard time catching up for the cell and assay types Recursion focuses on. In 2020, the company did 55.6M experiments (a ~25x increase) that generated 6.8 PB of data across 36 cell types using a library with 706K compounds (with a larger in silico library numbered in the billions).
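The arithmetic behind these figures can be checked with a short script. It uses only the numbers quoted above; the helper name `pct_change` is just for this example, and the "implied spend" line is a back-of-the-envelope product of experiments and CPE, not a figure Recursion reported.

```python
# Cost per experiment (CPE) in USD by year, as quoted in the text.
cpe_by_year = {2017: 0.63, 2018: 0.45, 2019: 0.36, 2020: 0.33}

# Experiment counts quoted in the text for the first and last year.
experiments = {2017: 2.2e6, 2020: 55.6e6}


def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new (negative means a reduction)."""
    return (new - old) / old * 100.0


years = sorted(cpe_by_year)
for prev, curr in zip(years, years[1:]):
    change = pct_change(cpe_by_year[prev], cpe_by_year[curr])
    print(f"{prev} -> {curr}: {change:+.1f}% CPE")

total_reduction = -pct_change(cpe_by_year[2017], cpe_by_year[2020])
experiment_growth = experiments[2020] / experiments[2017]

print(f"Total CPE reduction, 2017 to 2020: {total_reduction:.1f}%")
print(f"Experiment volume growth: {experiment_growth:.1f}x")

# Rough implied screening spend (experiments x CPE); an estimate only.
for year in (2017, 2020):
    spend = experiments[year] * cpe_by_year[year]
    print(f"{year} implied spend: ${spend / 1e6:.1f}M")
```

The per-year changes show that most of the reduction came early; the drop from 2019 to 2020 is much smaller than the drop from 2017 to 2018, while experiment volume kept growing.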

In drug development programs, the key focus is formulating and testing a hypothesis. In target-based discovery, the process tests a hypothesis by generating a hit for a given target then going to lead selection and optimization and finally into pre-clinical/clinical testing. Phenotypic screening, by contrast, generates a hit for a given phenotype and then the next step is to find the target. The latter process can get tricky and can make lead selection a bit more difficult. Recursion’s platform offers a framework to make phenotypic screening as structured as target-based programs, with a few areas to improve:

  • In vitro models that are relevant for more diseases. Patient-derived cell lines are a step forward here, but could lead to unexpected variance in assay conditions. Co-culturing is also another useful approach but could impact throughput.

  • With more complex assay designs, identifying important variables that influence reproducibility and disease relevance. Recursion’s automation reduces variance here.

  • “Chain of translatability,” an idea from biopharma, to connect assay endpoints to a clinical outcome

On the screening side, Recursion has many opportunities. Can they move to more complex models like an organ-on-a-chip or organoid and increase screening throughput there? Recursion can integrate their platform with proteomic tools, particularly activity-based profiling, during the target deconvolution step. This would speed up the process and lower the barrier to entry for phenotypic screening because this step is the last and sometimes hardest. Long-term, Recursion has the potential to create a database that one day enables matching a “phenoprint” to a target from a virtual screen. Hopefully, they will open source most of the data.

Then image analysis is done to capture various cellular features during the screen. Across thousands of structural (morphological) and functional (activity) parameters, the company’s technology measures disease-specific changes in cells and finds drugs that rescue them. From the data, Recursion asks (relatively) simple questions to hone in on drug candidates. What is the size and shape of the cells? Did the nucleus or other organelles move after the perturbation? By how much? What changes are specific to a disease? Recursion might not be able to establish cause-effect relationships for these correlations, but they can measure them at scale to screen chemical compounds.
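One simple way to put a number on "rescue" is to ask how far a treated profile has moved back along the line between the healthy and diseased profiles. The sketch below is a minimal illustration under that assumption; the function names, the three-feature profiles, and the scoring rule are made up for this example and are not Recursion's actual method.

```python
import math


def subtract(a, b):
    """Element-wise difference of two equal-length feature vectors."""
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def rescue_score(healthy, disease, treated):
    """Return (rescue_fraction, off_axis_drift) for a treated profile.

    rescue_fraction: 1.0 means the disease shift is fully reversed,
        0.0 means the treated cells still look fully diseased.
    off_axis_drift: size of changes unrelated to the disease shift,
        a crude flag for side effects of the compound.
    """
    disease_shift = subtract(disease, healthy)
    treated_shift = subtract(treated, healthy)
    norm_sq = dot(disease_shift, disease_shift)
    if norm_sq == 0:
        raise ValueError("disease profile is identical to healthy profile")

    # Fraction of the disease shift still present after treatment.
    remaining = dot(treated_shift, disease_shift) / norm_sq
    # Part of the treated shift that is orthogonal to the disease axis.
    off_axis = [t - remaining * d for t, d in zip(treated_shift, disease_shift)]
    return 1.0 - remaining, math.sqrt(dot(off_axis, off_axis))


# Hypothetical profiles: e.g. cell area, nucleus size, organelle intensity.
healthy = [1.0, 0.5, 0.2]
disease = [2.0, 1.5, 0.2]
compound_a = [1.2, 0.7, 0.2]  # mostly back to healthy
compound_b = [2.0, 1.5, 1.2]  # still diseased, plus an unrelated change

score_a, drift_a = rescue_score(healthy, disease, compound_a)
score_b, drift_b = rescue_score(healthy, disease, compound_b)
print(f"compound A: rescue={score_a:.2f}, drift={drift_a:.2f}")
print(f"compound B: rescue={score_b:.2f}, drift={drift_b:.2f}")
```

Separating the on-axis rescue from the off-axis drift mirrors the questions in the text: which changes are specific to the disease, and by how much did the compound reverse them.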

In morphological profiling, quantitative data are extracted from microscopy images of cells to identify biologically relevant similarities and differences among samples based on these profiles. Image-based tools in drug development and life sciences in general offer a new angle to home in on things like mechanism-of-action, biomarkers, and a lot more. With the increasing power of computer vision tools over the last 5 years or so, morphological profiling of cells, led by Recursion, has made a larger impact on drug discovery. This work casts a wide net to discover new biomarkers, MoAs, and leads:

  1. MoA - perturbation and rescue experiments are used to match images of treated cells with cells perturbed for a given gene. A database of images for cells knocked-out for a given gene or treated with an annotated compound enables quick search and comparison.

  2. Lead generation - 100s to 1000s of hits can be narrowed down pretty easily to a few leads based on filtering for a given MoA or target. As long as the image of the perturbed gene exists then it can be used to sort on.

  3. Biomarkers - another important part of morphological profiling is discovering and developing new biomarkers from images. This is useful in drug discovery to potentially stratify samples or patients and diagnostics. The actual image and morphological features can be used in addition to genetic expression profiles as a biomarker.
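The database lookup described in the MoA item above can be sketched as a nearest-neighbor search over profile vectors. This is a minimal sketch assuming each reference perturbation has already been reduced to a numeric profile; the labels, numbers, and the choice of cosine similarity are illustrative assumptions, not a description of Recursion's system.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length profile vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cannot compare a zero-length profile")
    return dot / (norm_a * norm_b)


def rank_matches(query, reference_profiles, top_k=3):
    """Rank reference perturbations by similarity to the query profile."""
    scored = [
        (label, cosine_similarity(query, profile))
        for label, profile in reference_profiles.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical reference database: knockouts and annotated compounds.
reference_profiles = {
    "GENE_A knockout": [1.0, 0.0, -1.0, 0.5],
    "GENE_B knockout": [-0.8, 1.2, 0.3, 0.0],
    "annotated compound X": [0.1, 0.1, 0.9, -0.7],
}

# Profile of cells treated with an uncharacterized hit (made-up numbers).
query_profile = [0.9, 0.1, -1.1, 0.4]

matches = rank_matches(query_profile, reference_profiles, top_k=2)
for label, score in matches:
    print(f"{label}: {score:.3f}")
```

A hit whose profile closely tracks a particular knockout is a candidate for acting on that gene or pathway, which is the starting hypothesis for target deconvolution.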

Hundreds to thousands of features can be extracted (e.g. shape, size, intensity) from a cell to measure phenotypes that may not have been detected before. These features can help validate new drug targets, group compounds together, and find new disease signatures. Excitingly, this work can be done in parallel rather than serially across many different targets and disease models. A key problem to solve is scaling cell culture and reducing the time for feature analysis. The actual cell culture and imaging can take a few days to weeks. Moreover, the data analysis for a lot of this work takes a few weeks as well. Can the analysis part be automated and completed in a ~day? In combination with genomics, especially spatial transcriptomics, morphological profiling can easily generate large amounts of IP in sync. This analysis component is the essential part of Recursion’s ability to generate a lot of leads - this will enable them to one day securitize their pipeline (we’ll go more into this in the business model section below).

Recursion then uses software and machine learning to analyze this dataset. With petabytes of biological images, the company has standardized data to feed into their models. There is some non-trivial backend software engineering here. Doing this at scale with an increasing amount of data is an important moat. Storing the images, handling data streaming and processing, and having a system for ad-hoc analysis give Recursion an ability to quickly go from a screen to a lead. Also, I am sure most of the technical problems faced by Recursion probably don’t require sophisticated machine learning work. The software infrastructure allows machine learning to make a major impact on the long-tail of biological problems. A section from The Business of AI in Life Sciences does a good job explaining this part:

“[AI] creates an entirely new way to do biology where experiments aren’t necessarily going to provide meaningful results but are done to more accurately train models. Biology is a large search space and can easily multiply the number of edge cases thereby increasing costs of development. AI models that take advantage of parallelization to test models, manage data inputs, and work to eliminate steps in the product development process have a shot to reduce this search space. These edge cases may never disappear given the complexity of biology and will need to be validated at the bench or the clinic.”

Over time, this platform can begin to make more accurate predictions during their drug development process. Recursion has done a great job at matching the right dataset to the problem. Most machine learning models are very sensitive to variations in data and if they are applied to non-standardized or static datasets, overfitting is likely to occur. Another key consideration, especially in biology, for using models for predictions is batch effects. Small parts of an experimental design like temperature and treatment duration can be picked up by an autoencoder and lead to incorrect conclusions. Once this hard work is complete, Recursion has the ability to transform this screening data into useful results and predictions like drugs clustered by MoAs, targets, and other features (image below).
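A common, generic way to reduce batch effects is to express every well relative to the control wells on the same plate, so that a plate-wide shift (say, from a slightly warmer incubator) cancels out. The sketch below illustrates that idea with made-up numbers; the field names and the z-score rule are assumptions for this example, not Recursion's pipeline.

```python
from statistics import mean, stdev


def normalize_by_plate(wells):
    """Add a per-plate z-score to each well, using that plate's controls.

    wells: list of dicts with keys "plate", "is_control", and "value"
           (a hypothetical record format for one measured feature).
    """
    controls = {}
    for well in wells:
        if well["is_control"]:
            controls.setdefault(well["plate"], []).append(well["value"])

    normalized = []
    for well in wells:
        plate_controls = controls.get(well["plate"], [])
        if len(plate_controls) < 2:
            raise ValueError(f"plate {well['plate']} needs at least two control wells")
        mu, sigma = mean(plate_controls), stdev(plate_controls)
        if sigma == 0:
            raise ValueError(f"plate {well['plate']} controls have zero variance")
        normalized.append({**well, "z": (well["value"] - mu) / sigma})
    return normalized


# Two plates with the same biology; plate P2 has a +10 baseline shift.
wells = [
    {"plate": "P1", "is_control": True, "value": 10.0},
    {"plate": "P1", "is_control": True, "value": 12.0},
    {"plate": "P1", "is_control": True, "value": 11.0},
    {"plate": "P1", "is_control": False, "value": 15.0},
    {"plate": "P2", "is_control": True, "value": 20.0},
    {"plate": "P2", "is_control": True, "value": 22.0},
    {"plate": "P2", "is_control": True, "value": 21.0},
    {"plate": "P2", "is_control": False, "value": 25.0},
]

treated_z = {
    well["plate"]: well["z"]
    for well in normalize_by_plate(wells)
    if not well["is_control"]
}
print(treated_z)
```

The raw treated values differ by 10 units between plates, but after normalization both wells carry the same score, so a downstream model has less opportunity to learn the plate instead of the biology.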

With a large enough N, Recursion can establish the definitive atlas of cell biology. This type of roadmap can help the company move from a hypothesis to a lead a lot more quickly than others. The true test of this technology is clinical success. This is still to be determined for the entire AI field, but Recursion is currently in the lead.

Recursion has made incredible progress from a small company starting off in a conference room in Salt Lake City to a sprawling lab space with some of the best robotics in life sciences. Their success has been driven by their focus, along with being in the right place at the right time, but Recursion still hasn’t scratched the surface of AI’s potential impact on drug discovery. The company ought to make more progress and begin to establish AI’s ability to bring new medicines to patients.

This platform gives Recursion the unique ability to design securitizable pipelines. With an excess of capital right now, there is an opportunity to securitize drug portfolios. BridgeBio is in the lead and has the best shot at introducing “biobonds.” Others like Cullinan and Centessa are emerging as well. And a key part of securitization is building a pipeline of uncorrelated assets. Recursion has the technology to generate a lot of leads themselves; others often have to go out to license or acquire assets. The company has a strong shot at then pooling these leads together while being thoughtful about correlations and portfolio/risk management.
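To see why uncorrelated assets matter for pooling, consider a toy portfolio of programs that each succeed with the same probability. The sketch below uses standard probability formulas; the portfolio size, the 10% success rate, and the correlation values are purely illustrative assumptions and not estimates for Recursion or any real pipeline.

```python
import math


def prob_at_least_one_success(n: int, p: float) -> float:
    """Chance that at least one of n independent programs succeeds."""
    return 1.0 - (1.0 - p) ** n


def success_count_std(n: int, p: float, rho: float) -> float:
    """Standard deviation of the number of successes among n programs.

    Assumes every program succeeds with probability p and every pair of
    outcomes has the same correlation rho (rho=0 means uncorrelated).
    """
    variance = n * p * (1.0 - p) * (1.0 + (n - 1) * rho)
    return math.sqrt(variance)


n_programs = 20      # hypothetical portfolio size
p_success = 0.10     # hypothetical per-program success probability

p_any = prob_at_least_one_success(n_programs, p_success)
std_uncorrelated = success_count_std(n_programs, p_success, rho=0.0)
std_correlated = success_count_std(n_programs, p_success, rho=0.5)

print(f"Expected successes: {n_programs * p_success:.1f}")
print(f"P(at least one success), independent programs: {p_any:.3f}")
print(f"Std of successes, rho=0.0: {std_uncorrelated:.2f}")
print(f"Std of successes, rho=0.5: {std_correlated:.2f}")
```

The expected number of successes is the same in both cases, but the correlated portfolio is far more volatile, which is the property that makes a pool of programs harder to finance with debt-like instruments.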

The morphological atlas Recursion has built out gives them a unique ability to get preclinical programs off the ground. Their disease purview is only limited by their cellular models. Long-term, the company ought to knock out every gene, using CRISPR, in the human genome and measure cellular changes across time. By making images as computable as the genome, Recursion has built a platform that can enable a unique business model.

Market

The market for Recursion, like all drug companies, is pretty self-evident: cure disease and help patients. Recursion’s initial focus on rare diseases played a pivotal role in its success. An important theme in biotechnology is focusing a platform on lower-hanging fruit problems then expanding. The company’s first program was for CCM from Chris’ PhD project. Having an in-licensed pipeline helped Recursion get to the clinic sooner rather than later. A key risk for a while was whether Recursion’s platform could generate clinical assets. Now the company has to show it can get one of their programs to approval over the next ~decade.

There are thousands of rare, genetic diseases that affect over 20M patients in the US. Globally, rare diseases start looking a lot less rare. For example, there are ~360K CCM patients in the US and parts of Europe. China and India have at least 1M, maybe 2M, CCM patients. Around 93% of these diseases do not have an FDA-approved treatment. Given the regulatory and clinical advantages, the market pull helped Recursion more easily translate their platform into a pipeline. Around the time (2015ish) when AI was making a lot of progress, Recursion made a slight pivot from focusing on rare diseases to industrialization. Their platform enables the company to move into the clinic faster (image below). By industrializing a certain set of experiments, Recursion’s market strategy has been to leapfrog and expand into larger markets.

Recursion’s pipeline reflects this market dynamic. REC-994 (CCM) is from previous work and three out of four of Recursion’s clinical assets are for rare, genetic diseases. They have expanded their pipeline to oncology and neuroscience along with moving from known chemical entities to new ones. An important consideration is developing this pipeline as efficiently as possible; this is where the business model becomes important. Beyond the technology, the company is transitioning to a late-stage clinical company and will need to update its talent base to reflect their goals of being a fully-integrated drug company. 

Business model

It seems like Recursion expands its business purview every year. On this point, it also seems like Chris slightly reinvents himself every year by learning/adopting a new skill set from AI-driven drug discovery to clinical development and BD. Versus a technology company that has two main phases - before and after product-market fit, biotech companies are multi-phasic and this requires a CEO that can reinvent themselves. Chris has done an incredible job at growing his skillset to match Recursion’s growing ambitions.

The core part of Recursion’s business model is its platform and internal pipeline. With this base, the company has increasing power to engage in creative business deals from partnerships with companies like Bayer and Sanofi Genzyme to a spinout with CereXis (to house their NF2 program). What has always surprised me was that Recursion never really scaled up their partnership side of the business. A natural comparable for the company is Millennium Pharmaceuticals. Millennium was the category leader in genomics and used that positioning to scale up partnering. Recursion is the category leader in AI and drug discovery; given that, the company could still scale up on partnerships. But capital is a lot more accessible for biotech companies versus the early 1990s so focusing too much on partnerships and maximizing non-dilutive dollars (image below) might be more of a distraction now. Given the progress Recursion has made with its platform, it might have the bargaining power now to execute 50/50 deals pioneered by companies like Regeneron. The first couple years for Recursion were focused on getting the platform turned on. Five years after that the company implemented the platform to industrialize image analysis and lead discovery. This has set up Recursion to build a pretty unique business with a wide range of deal types and an ability to securitize entire drug portfolios. Just as much as Recursion is a pioneer in technically scaling drug development, the company can take the lead in financial scaling.

Pipeline design and execution have been a key driver for Recursion’s business model. The platform can generate a lot of leads; it is then on the company to filter and manage them efficiently. For example, after a few years developing its program for Ataxia Telangiectasia, Recursion ultimately had to move on from it. Overall, AI provides an advantage at the top of the drug development funnel, while financial engineering and clinical expertise (i.e. KOLs) are the advantage at the bottom. On the financial engineering side, Recursion already has the capabilities to spin out new companies and acquire external assets. On the former, Recursion spun out CereXis to drive the NF2 drug program forward. This type of deal structure is very useful for resetting the cap table and realigning incentives around a particular asset or set of assets. What Recursion is doing here is very similar to BridgeBio’s work. Recursion could also move into joint ventures if it wanted to, focusing its platform on something like aging or infectious disease with a corporate partner.

To get to 100, if not hundreds, of drug candidates, Recursion will need financial engineering and greater access to capital markets. Their platform is validated to generate a lot of leads not only in rare diseases but in oncology, neuroscience, fibrosis, and more. Over the next few years, Recursion will likely be operating at the intersection of technical and financial engineering. I am sure they are going to use a lot of the great ideas from Andrew Lo at MIT. Compared with other companies that have similar ambitions of one day securitizing portfolios of drugs, Recursion has the technology to develop a pipeline in sync and generate its own IP. The playbook here is being written but some important considerations are:

Recursion has built out their own drug development assembly line (image below). Hopefully the company will bring new medicines to patients faster and at scale. And for their business model, they have an opportunity to bring a lot of innovation to clinical trials - how they are both executed and financed. One thing that has struck me about Recursion is how Chris always acted like the CEO of a public (or soon-to-be) company. From how he ran his board meetings (i.e. presenting ~100 page overview documents) to always expanding Recursion’s ambitions every year, Chris and the team have built one of the iconic AI drug discovery companies. Others like Insitro, Enable Medicine, and Exscientia are in the mix. The next ten years for Recursion and similar companies will be an opportunity to measure the clinical impact of AI. Given Recursion’s history, it is guaranteed that they will evolve their platform and business model to match their vision.

The Business of AI in Life Sciences

Life sciences strategy



Artificial intelligence (AI) has the potential to transform many parts of life sciences from preclinical drug development and healthcare to synthetic biology and diagnostics. Basic research in AI has made major strides over the last 5 years, and when combined with biologists working with data as much as they work on the bench, the technology is changing how biology is studied and engineered.

As a result, AI-first companies in life sciences may not look like traditional companies at first. Whereas traditional life sciences companies usually have a core set of IP or are based on a biological hypothesis, AI life sciences companies often start out looking like R&D shops or services companies. The transition from services to a product focus involves several trade-offs but provides the potential to build a more scalable business model. In particular, these companies face challenges with:

  1. Generating large amounts of data that often needs to be unbiased, which requires a lot more capital than at traditional companies

  2. Recruiting and training engineering talent

  3. Implementing biologically relevant models

Generating large amounts of data that is often unbiased

First and foremost, an AI-first life sciences company needs high-quality data. In particular, this is very important in drug development where data is often not shared and stuck within the silos of each company. Diagnostics and healthcare software companies have accessible retrospective studies as long as you have the network within the medical community. And synthetic biology is in a similar position as drug development.

These types of companies often have to make large upfront investments in custom wet-lab infrastructure. There are two main reasons for this requirement right now: (1) a lack of high-quality public datasets that are properly labeled and free of artifacts and (2) existing datasets that were not designed comprehensively enough to build accurate models. AI-first companies design custom experimental workflows, which are powered by increasingly capable robotic automation to scale wet-lab work. Automation can be a major differentiator because it produces high-quality data for AI that cannot be generated by hand. This custom infrastructure can be pointed at iPSCs, DNA-encoded libraries, microscopy data, antibody screening, single-cell profiling, and more. The experimental data is fed into algorithms to hopefully generate new insights into everything from small molecule discovery to CHO-cell design. So many of these companies initially resemble an academic lab working to make new discoveries in biology. They need large amounts of technical investment to build the relevant datasets. This type of work is expensive and not accessible to most; as a result, new companies are often built on the premise that large amounts of data will somehow lead to better products or a higher number of them.

Recruiting and training engineering talent

The second challenge is getting software engineers and biologists to work well together. An important problem is tightly coupling computational and wet-lab work. This is driven by how well groups on both sides collaborate.

Moreover, incumbents, especially in drug development, are usually not willing to invest in machine learning talent because they don’t want to pay engineers more than their executives. There is a scenario over the next 1-2 decades in which the best software engineers in the world easily earn as much as star athletes or movie stars. The best AI-first life sciences companies do a good job of monopolizing ML talent. Insitro has done a great job here. These companies also invest resources in building sustainable tech infrastructure that can be maintained and used by biologists and data scientists.

Implementing biologically relevant models

Another important challenge is bringing the datasets and talent together to build better models of biology. These models could be used to classify certain cells, match small molecules to certain targets, or predict the behavior of a genetic circuit. If the biological data is accurately generated and an interdisciplinary team is built, these models have the potential to identify new drug targets, engineer metabolic pathways, and design better medicines. However, a major problem is that the scale of the data quickly grows to a size that a life sciences team cannot handle without the support of world-class software engineering.

Ultimately, AI life sciences companies initially look like services companies working on software design and data generation. To experienced people in life sciences, these types of companies look more academic than commercial. However, the long-term thesis for most of them is to develop and commercialize their own products, and time will tell which ones succeed. Building an AI-driven life sciences company is therefore different in a few fundamental ways: data generation, talent, and models.

Generating large amounts of data that is often unbiased

The first part, data generation, is a substantial cost for AI-driven life sciences companies. At least in biotech, this goes against the recent trend of reducing initial setup costs. During the 2000s, more early-stage biotechnology companies shifted toward starting with a virtual model. This was a response to high-profile failures of companies that raised a lot of money early only to pivot later. This virtualization was enabled mainly by contract research organizations (CROs), especially as sites that biopharma shut down were converted into CROs. Over the last 2 decades, most of the costs of early-stage drug development infrastructure have been pushed to these vendors, so companies have not had to make initial outlays for lab space and other infrastructure. Nimbus Therapeutics is the best case study for this trend. Even places like LabCentral and MBC BioLabs allow early-stage companies to keep a minimal lab footprint before validating their work.

Virtual biotechnology companies start with a biological hypothesis and work to prove it out before raising more capital. CROs are useful for commoditized experiments like ADMET, some synthesis, and some cell lines/mouse models. This trend toward virtualization might hold in other life sciences fields but is most pronounced in drug development. However, a life sciences company centered around AI often doesn’t start off with an initial hypothesis. The company probably needs custom experimental formats like a cell line with a series of specific reporters or certain libraries of proteins. As a result, CROs and outsourcing in general aren’t the best options. Until CROs hire in-house machine learning engineers and build out more infrastructure, virtualization probably isn’t an option for an AI-driven company. That said, the resources for an AI drug company to virtualize are emerging, including various bioinformatics SaaS products and more accessible lab automation tools.

An AI-driven life sciences company needs large datasets to reduce the need for sophisticated models. On limited data, custom neural networks are needed. For example, with an antibody library that has a limited number of data points, companies need really good models that can accurately interpret the data and make correct predictions. By contrast, for a database of every single point mutation for a given antibody or library, standard recurrent neural networks or other vanilla models can be used.

For an AI life sciences company, the initial costs to get started can be daunting:

  • In-house infrastructure - depending on the modality and models used, a company probably has to set up their own instruments and workflows

  • Custom experiments - a company will have to invest resources to design and execute experiments that are often custom. For example, an imaging workflow might have to be deployed to gather cell morphology data along with gene expression profiles and protein localization. The difficult part for most companies is getting datasets that validate across not only models but scales.

  • Rich data sources - as a result, AI-driven life sciences companies often generate large and complex datasets. The tools to analyze this data and build models can be outside the scope of current software products. Companies here have to do a good job at increasing interpretability of high performing models; in short, a high accuracy score from a model might not translate into a sound hypothesis.

  • Tooling to scale their AI models - with AWS, various AI packages, and a lot more, the ability to deploy models has become a lot more accessible. However, the costs to deploy an AI model can run into the $100Ks. For example, AlphaFold might have cost on the order of millions of dollars to train. New AI-specific chips, like tensor processing units, should help here. But there is a growing gap between the computing resources AI models need and the power of the chips used to train them.

This upfront cost to build an AI-driven life sciences company can take $10Ms, maybe even $100Ms. This is often all done without a clear biological hypothesis. Maybe a company will in-license an initial set of assets to de-risk the platform investment. In addition, it’s not clear how much long-term operations (i.e. spend on compute resources, talent) will affect a company’s gross margin. The success of these companies depends on their ability to build models that eventually replace human annotation, with the keys to success being picking the right problem, curating the right dataset, and interpreting the results accurately.

For drug development, the idea is that AI could put a major dent in the cost curve for early work. These companies will still need to engage in traditional product development at some point. Once an initial hit is found, a company will move into pre-clinical and clinical studies. This is a cost that is much more difficult to change.

For AI-driven life sciences companies, avoiding false positives is also incredibly important so as not to invest too many resources in a dead end. Because it generates more data and invests in custom resources, a company will need to create more products, run more efficiently, and garner non-dilutive dollars to defray the upfront investment and make the financials work for shareholders.

Recruiting and training engineering talent

An important element of building an AI-focused life sciences company is recruiting and training engineering talent. Recruiting AI talent is a lot harder than you would expect, and retaining it is even harder. In general, software engineers are being paid like professional athletes, especially those with specialized knowledge in artificial intelligence.

Talent may be the main bottleneck limiting the impact of AI on life sciences. There are 100Ks of AI engineers in the world, but the competition for them spans almost every industry. For a life sciences company, there are 3 main challenges with AI talent: (1) getting the expertise in the door, (2) helping engineers learn enough biology to be dangerous, and (3) retaining the trained engineering talent given the fierce competition from the Googles and Facebooks of the world along with every quant hedge fund. Given this, what are some strategies to build talented engineering teams?

  • Get them early - a co-founder or early employee has to be a world-class AI engineer to improve the odds of attracting more. It’s pretty hard, but not impossible, for a founding team of biologists to recruit, let alone lead, great engineers. Importantly, finding ML talent who do not need to be the star expert at everything is essential for a life sciences company. The same goes for biologists, who need to be humble at the intersection of AI and bio as well.

  • Get an advisor - the second best option for attracting talent early is to get an advisor who has expertise in AI. In short, there is a need for more Jeff Deans of biology.

  • Poach entire teams - engineers at large technology companies might be open to moving into life sciences to work on more “important” problems depending on their salary/equity packages along with the potential to continue staying with co-workers. Hexagon Bio did the best job here recruiting a lot of great engineers from Palantir.

  • Train biologists to become data scientists and SWEs - the last resort is to train biologists to become engineers. This might take 3-5 years though so be patient. Within AI-driven companies, biologists also need data and programming fluency to be productive.

Implementing biologically relevant models

The last challenge for an AI-focused life sciences company is implementing biologically relevant models. Data generation and talent are prerequisites for AI models, but understanding when to deploy them is another important challenge. Biology is still the limiting factor, and its complexity is what makes it beautiful but hard to engineer. Even with the best design and tools, biological validation cycles are still a major consideration for all companies. Given this bottleneck, having a deep understanding of deployment phases and other features before translation is incredibly important. So what are some of the main problems for implementing models in biology?

  • Unstructured and noisy data - collecting data from sequencing to spatial transcriptomics to even health records can create too much complexity for a given model and team to handle. Moreover, cross-validation studies are just as important as any given data source.

  • Training data coverage - making sure the initial data set and training period are robust enough to deploy into the lab. Depending on the problem addressed, this process can take weeks or even months.

  • When to deploy? - figuring out when a model is accurate enough to trust its recommendations. For example, two key metrics for classification models are the ROC curve and AUC, which capture the model’s performance at every classification threshold and its aggregate performance across all thresholds, respectively. In diagnostics, deciding when an AUC is high enough to reliably distinguish patients with a disease from those without it, and therefore to deploy, is tricky without validation data.

  • Time to deploy? - decreasing the time it takes from developing a model to deploying one. It’s not obvious this gets easier over time given the increasing number of edge cases with a larger data set. However, this is where in-house data generation capabilities become an advantage to manage the cost of wet lab work and hopefully make deployment less expensive.
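
As a rough illustration of the ROC and AUC metrics mentioned in the list above, here is a minimal pure-Python sketch. The labels and scores are made-up toy data (think of a hypothetical diagnostic model scoring six patients), and the function names are my own rather than any library’s API. The ROC curve is built by sweeping the decision threshold from high to low, and AUC is the trapezoidal area under that curve.

```python
def roc_points(labels, scores):
    """Return ROC points as (false positive rate, true positive rate) pairs.

    labels: 1 for disease, 0 for no disease. scores: higher means more likely disease.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Sweep the threshold downward: samples are admitted in order of descending score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples tied at the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / n_neg, true_pos / n_pos))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data, invented for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

example_auc = auc(roc_points(labels, scores))
print(f"AUC on toy data: {example_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The number alone does not say which threshold to deploy at, which is why held-out validation data matters so much in diagnostics.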

During the process of model design, early positive signals can give a false sense of confidence. Initial hits may not translate well at a certain stage of development. An early discovery may not account for a unique edge case. For example, a MoA that works in models may not translate into the clinic. Or in biomanufacturing, a model that is accurate at one scale may not work at another due to differences in oxygen circulation, among other external factors. The long tail is where the majority of the work to implement AI models is done, and most of this long tail happens in the wet lab. One trick is training models with “unphysical” biological experiments (Jacob, thank you for the concept) to help models learn the limits of the biological parameter space. This creates an entirely new way to do biology, where experiments aren’t necessarily going to provide meaningful results but are done to train models more accurately. Biology is a large search space and can easily multiply the number of edge cases, thereby increasing the costs of development. AI models that take advantage of parallelization to test models, manage data inputs, and work to eliminate steps in the product development process have a shot at reducing this search space. These edge cases may never disappear given the complexity of biology and will need to be validated at the bench or in the clinic.

The rules of building an AI-first life sciences business are still being written by companies like Insitro, Recursion Pharmaceuticals, and more. Three important moats are talent, scale, and product development: can a company recruit and retain the best AI talent, validate its models more efficiently, and ask the most important questions? But AI and data themselves likely do not create moats in the long run. Both become commodities. AI models are becoming commoditized, especially with pre-trained versions and various open source libraries. Data can be put in the public domain, and proprietary datasets can soon become commoditized as their generation becomes cheaper (i.e. $10M spent on sequencing today might cost only $1M in a few years). But models are more easily made a commodity in life sciences, where focusing on the right datasets carries most of the freight for a company’s success.

There are countless opportunities to apply AI to life sciences. A useful way to categorize these is by data and models (image below; thanks Lucas). The same segmentation can also be done by talent and team building strategies. We will do a follow up analysis diving into case studies for each of these categories:

  1. Automate data generation - Insitro and Recursion focus on building out infrastructure to generate data that a group of lab techs can’t

  2. Specialize on modality - Dyno and Serotiny focus on AAVs and CARs, respectively and build focused data sets

  3. Low data methods - ProteinQure builds models that can operate on smaller scale datasets, which has advantages to optimize for many more features versus large-scale data generation approaches

  4. Unlabeled data - Deep Genomics and EQRx invest resources to structure unlabeled data to discover new targets among other things

  5. New molecular representations - Unnatural Products and Genesis Therapeutics focus on developing models and data sets to more accurately predict features for small molecules like PK/PD and ADMET

  6. Noisy data - Asimov and Hexagon Bio work with complex cellular data to design new cell lines and natural products, respectively

  7. Partners generate data - AbCellera works with partners to generate a large amount of data that can be fed back into their own AI models. Partnerships can be a good way for a business to bootstrap the initial data generation requirement.

This new batch of companies can learn at least 5 key lessons in building AI-first life sciences companies from the last generation:

Focus on specific problems

Narrowing focus to a specific task in life sciences is pretty important given how complex the whole field is. Ideally, low-hanging-fruit tasks are pursued first to validate the models and build momentum on the product development side. Examples include BigHat and LabGenius working on antibodies (even AbCellera has built out an AI team), Unnatural Products and Anagenex on certain types of small molecules, ProteinQure on peptides, Serotiny on CARs, Dyno Therapeutics on AAVs, and Asimov on certain mammalian cells. All of these companies focus on a specific problem, whether a modality or a particular data set.

This enables reducing complexity of models and data

By narrowing down focus, a company can reduce the complexity of its models and data generation capabilities. Dyno can get really good at capsid engineering rather than having to build models for everything ranging from transgene expression to engulfment. Serotiny can build product moats around chimeric antigen receptor constructs without having to necessarily create models for target engagement. This reduction in complexity has a direct impact on COGS.

Recognize the high variable costs of building a life sciences business centered around AI

An AI-focused life sciences company not only has costs associated with wet lab work but costs from model implementation and compute. This is what makes these business models unique: they make an upfront investment in software and AI with the premise that more products or partnerships will come. More often than not platforms are overbuilt to prepare for new applications. Hopefully these variable costs decrease as more CRO infrastructure is put into place. However, AI will have an impact on gross margins initially driven by model maintenance and deployment; it might be useful for companies to split these variable costs from wet lab work.

Commoditize or be commoditized

We might get to the point where, for example, a “drug discovered by AI” is as unremarkable a phrase as a “drug discovered by phage display”. New AI tools are still being built out. We are still pretty early in this wave of AI progress, with the ImageNet breakthrough coming in 2012 and TensorFlow and AlphaGo in 2015. Beyond fundamental breakthroughs, new workflows for AI models are emerging along with new models and automation of training tasks. Given all of this activity, an AI-focused life sciences company can easily fall behind on technology. It’s important to be forward thinking and constantly update your toolkit. Even the talent might get commoditized, similar to how genomics/bioinformatics went from a rare skill to a common one over the last 2 decades.

Use AI to build moats around products

The model itself is not a moat nor is the data. AI often allows problems in biology to be solved in a new way. Examples are old problems like drug repurposing or AAV and CAR design that needed scale to unlock new therapeutic variants. AI can help design unique products where a moat can be constructed.

AI companies in life sciences are more bio than software. Some of the advantages of AI don’t come cheaply. Companies face challenges in data generation, building interdisciplinary teams, and the variable costs of developing AI models. These upfront costs create barriers to new entrants and provide some defensibility. Hybrid business models that merge life sciences and AI have the potential to invent faster, create new markets, and make product development more efficient. AI may lead to non-obvious side effects; for example, in drug development, more data might make it easier to find more ways to argue for an approval. AI is also causing a culture shift where companies pick the top 10 predictions from a model, which may be very different from one another. Companies in life sciences could also use AI to generate a pipeline of uncorrelated assets with the goal of securitization. There are opportunities to bring AI to metabolic engineering, biomanufacturing, enzymes, and ASOs as well. AI has made a significant impact on tasks like PK/PD studies, formulations, and the initial stages of drug development, but can companies like Unlearn and Tilda bring these tools downstream to clinical trials? Beyond drug development, companies like Ginkgo and Zymergen have taken the lead on using AI in synthetic biology, Inflammatix and Endpoint Health in diagnostics, Levels and Whoop in consumer health products, EQRx in fast follower drug development, and Abridge and PatientPing in healthcare.

Artificial intelligence has the potential to create a template for new products in all of life sciences. But at the very least, AI allows a business to generate new IP and execute more partnerships. For any AI-focused life sciences company, talent may be the most important moat; however, hypothesis generation and scale are valuable as well. The enduring businesses will focus on problems that are easy for AI models, avoiding edge cases and combining their platform with other business strategies. AI is a tool to build moats around biological products. By using the tool to iterate and improve, these companies have the potential to move faster than competitors and expand their product lines at an increasing rate.

Special thanks to Rishi Bedi, Lucas Siow, Brandon White, and Jacob Oppenheim, as well as others kind enough to review and provide feedback on this piece.

