Molekula and the Google Life Sciences Ecosystem

A Project That Almost Disappeared

A few years ago, a team inside Google was working on a problem that sits at the heart of drug discovery: teaching machines to recognize and reason about chemical structures. The project was led by Stephen Boyer, and it was serious work. Then, like many internal research initiatives, it was shut down for budgetary reasons before it could reach its potential.

The gap it left behind didn't close. The problem was still there.

I had been working on the same problem from the outside, first with AlfaChem and then with Molekula, building AI systems that could understand chemistry the way a trained chemist does: not by retrieving keyword matches but by reasoning about molecular structure, reaction mechanisms, and synthesis strategy. When the opportunity came to step into the space that Boyer's project had vacated, and to do it within Google's own ecosystem, I took it.

What the Partnership Actually Is

The relationship between Molekula and Google is worth describing accurately, because the word "partnership" covers a lot of ground in the startup world.

We are part of Google Cloud's life sciences developer community: a network of startups and researchers co-building with Google's infrastructure, working alongside Google's engineers, and accessing Google's tools and compute credits as active participants in the ecosystem rather than outside users. We are not in Google's incubator and have not yet approached GV (formerly Google Ventures), though both are on our radar. What we have is something more immediate: a direct technical relationship, shared infrastructure, and a seat at the table where Google's life sciences AI strategy is being shaped.

The model we are training is TxGemma, Google DeepMind's suite of open language models designed specifically for therapeutic development. TxGemma is built on Gemma 2 and fine-tuned on the Therapeutics Data Commons, a curated collection of drug discovery datasets spanning small molecules, proteins, and clinical outcomes. According to Google's published evaluations, it already matches or outperforms earlier generalist models on most of the therapeutic tasks tested. Our work is extending it into organic chemistry synthesis reasoning, the area where Boyer's original project pointed and where Molekula's domain expertise is deepest.

The People in the Room

One of the quieter benefits of being inside this community is the quality of the conversations it makes possible. At a recent conference at Google's San Francisco office, I had the chance to speak with Nora and with Kavita, a former leader of life sciences at Google DeepMind who has since moved into a role at a biology company working on some of the most interesting questions in AI-assisted scientific reasoning. These are not superficial networking conversations. They are the kind of exchanges where people who have been thinking about the same hard problems for years can skip the preamble and get to what actually matters.

Stephen Boyer himself remains a connection I value. The work he started inside Google and the work we are doing now are not unrelated. They are, in some sense, a continuation.

Why This Matters for Chemistry

TxGemma represents something genuinely new in AI for drug discovery: a model family trained across the full breadth of the therapeutic domain rather than a general model exposed to a few chemistry datasets. The distinction matters because drug discovery is not a general-purpose problem. It requires understanding the specific language of molecular interaction, the constraints of synthetic accessibility, and the unpredictability of biological systems. General models learn to sound fluent in chemistry. Domain-specific models learn to think in it.

What Molekula brings to this collaboration is exactly what a language model alone cannot provide: deep knowledge of how expert chemists actually reason, developed over years of building AlfaChem and, before that, two decades of working in pharmaceutical and materials chemistry at the bench. Training TxGemma on synthesis planning and chemical reasoning is not a data problem. It is a knowledge representation problem. You need people who understand both the chemistry and the AI to make the data mean what it should mean.

That is the work we are doing inside Google's ecosystem right now.

What Comes Next

Google Cloud's life sciences community is expanding. The resources available to participants, from compute credits to research partnerships to engineering collaboration, are significant. And the problems being worked on (drug target identification, synthesis route planning, toxicity prediction, clinical outcome modeling) are among the most consequential in applied science.

Molekula's position inside this ecosystem is not incidental to our mission. It is a direct expression of it. We started with the insight that chemistry knowledge has always been limited not by the quality of what chemists know but by the mechanisms available for sharing and applying that knowledge. Building on TxGemma, within Google's infrastructure, alongside some of the best minds working at the intersection of AI and life sciences, is what that mission looks like at scale.

We are not done with the work Boyer started. We are, in a meaningful sense, continuing it.

Anatoly Chlenov, PhD, is the founder of Molekula.ai. Beta access is available at molekula.ai.