By Rich Soll, Senior Advisor, Strategic Initiatives, WuXi AppTec (@richsollwx)

insitro’s CEO Daphne Koller is no stranger to the world of experimental biology. Formerly the Rajeev Motwani Professor of Computer Science at Stanford, where she spent 18 years, and most recently Chief Computing Officer at Calico, Koller is applying those experiences to key issues plaguing drug discovery today, where machine learning (ML) may help in choosing druggable, pertinent targets implicated in disease and in increasing the chances of success. Koller brings a unique perspective as one of the most influential and important people in the world, as recognized by Newsweek, Time Magazine, and Fast Company for importance, influence, and creativity, respectively.

insitro’s formation was announced in 2018 with Series A financing, the size of which was only recently disclosed to be more than $100 million, according to Fierce Biotech. Koller is backed by some of the boldest visionaries in the life science investment community, including Bob Nelson of ARCH, Jim Tananbaum of Foresite Capital, Vijay Pande of A16Z, Krishna Yeshwant of GV, and Alexis Borisy of Third Rock Ventures, to answer a very simple question: Can we make biology more predictable? Koller firmly believes that ML can address key problems in drug discovery and development, but that high-quality data is needed to do so.

For example, take non-alcoholic steatohepatitis (NASH), a disease arising from liver inflammation and fatty liver that can lead to cirrhosis, liver cancer, liver failure, or cardiovascular disease. The incidence of NASH has risen dramatically over the last two decades because of the growing prevalence of obesity and of insulin and lipid disorders. It is regarded as a “silent” disease because symptoms do not appear in the early stages, during which the disease can progress to fibrosis and cirrhosis. An estimated 16 million Americans suffer from NASH. Liver transplantation is the only option for NASH cirrhosis, and by 2020 NASH is expected to overtake hepatitis C as the leading cause of liver transplants in the U.S. According to GlobalData, the NASH market in the 7 major markets (US, France, Germany, Italy, Spain, UK, and Japan) is estimated to rise from $618M in 2016 to $25.3B, a staggering annual growth rate of 45%.

NASH is rapidly becoming a highly crowded field, with several companies in heavy competition to reach approval: Gilead (NASDAQ: GILD), with a number of experimental therapies in clinical trials; 89Bio; Viking Therapeutics; Madrigal (NASDAQ: MDGL); and Intercept Pharmaceuticals (NASDAQ: ICPT), which is seeking approval this year from US and European regulators for its oral, once-a-day version of obeticholic acid.

Gilead felt the need for more predictable biology clearly in February 2019, when its lead molecule, the ASK1 inhibitor selonsertib, failed in the late-stage STELLAR-4 trial to meet its primary endpoint of at least a 1-stage histologic improvement in fibrosis.

Then, in April 2019, Gilead announced the kickoff of a three-year research collaboration with insitro ($15M upfront plus $35M in short-term milestones) to create disease models for NASH in search of therapeutics that could reverse or at least slow the progression of the disease. If certain goals are achieved, insitro could collect up to an additional $1 billion.

The approach being developed by insitro is to take human data that combine genetics, molecular phenotypes, and clinical phenotypes, and align those with the in vitro assays being developed at insitro. The human data can be obtained from publicly available sources like the UK Biobank or, in this particular case, from clinical trials.

“We are excited about this deal for multiple reasons,” noted Koller. “First, the Gilead team are top-quality scientists and they care deeply about helping patients. Second, they bring extensive data into the collaboration, as well as outstanding capabilities in chemistry. These assets are very complementary to our ability to generate large amounts of in vitro data that align with the clinical data, and use machine learning to identify targets. Finally, this deal will provide us with short term capital to build the platform and to bring it to where we want it to be. But longer term, we hope to realize revenues associated with drugs that are beneficial to patients.”

Koller acknowledges the increasing number of companies in the ML space, but notes that “the overwhelming majority of companies look to extract insights from data that’s already collected. Our perspective is that ML is primarily only as good as the data you feed it. A lot of data sets out there are not great and furthermore, even if they were fine for the purpose they were generated for, they are not designed to readily accommodate application of advanced machine learning methods.”

insitro’s approach is not to ask what data exist, but rather to define which problems are key obstacles in drug development, and then to identify where ML could make a difference if the right data set existed at scale. “When we’re able to define key problems, which I think we have in this particular context, we can then move to generate high quality data at scale for ML analysis.”

The insitro team is starting in biology, since they believe it is the highest-leverage opportunity. “Most drugs fail because they’re targeting the wrong thing; we need better disease modeling and target ID. We’ll move into hit-to-lead and lead optimization to support mechanistic studies and help de-risk the chemistry to a certain extent. Over the course of time, we foresee applications of our technology to the design of chemical matter, identification of biomarkers, design of clinical trials, and improvements of manufacturing processes.”

Koller believes there are many places in the drug development process where one wishes for a crystal ball that could predict, at least with some level of accuracy, the outcome of an experiment without the need to actually run it. “In some cases, the experiment is too difficult or expensive. In other cases, it’s impossible to do. In a sense, our disease modeling and target identification effort is an attempt to predict the effect on human phenotype if a particular gene was perturbed. Better PK models and toxicity models are examples where heuristic approaches are currently used but where ML could be more impactful.”

Over the next couple of years, insitro intends to ingest the Gilead data and use ML to pull out signatures that will guide the design of the assays it needs to build. That work will consume most of 2019; in 2020, the team plans to scale the platform so that it can identify the disease signatures of NASH. By 2021, they hope to run screens that identify targets which, when altered, take cells from a disease state to a normal state.

Koller comments, “I’ve been in the life science space now for nearly 20 years.  Despite the progress we’ve made over that time, I felt there were two key ingredients missing: quality data for ML applications and availability of people who are genuinely bilingual in terms of ML and life science; the jargon is very different, the way of thinking is different and the communication is very challenging. To do meaningful work at this intersection, we need teams who understand both the real problems that need to be solved and the capabilities of the technology. These teams need to work in equal partnership, hand-in-hand, at every step of the process. Specific technologies come and go, but the culture of the company and the ability of people of such diverse skill sets to communicate in this way will be here for a long haul.”

Koller closes graciously: “We’re in a very early stage, there’s a lot that we need to improve upon. I prefer to avoid hype and big promises, because ultimately, the proof of this technology will be the extent to which we are able to help patients. Did we choose the right target? Did we develop a drug faster and cheaper? What is the value to the patient? I hope in the future we’ll be able to answer these questions in a positive way.”