Introduction
There's a moment that stays with me from my time at Roche. A senior medicinal chemist with 15 years of experience spent three days trying to find out if anyone had tested a particular kinase inhibitor against a specific inflammatory pathway. Not three days in the lab. Three days searching.
The information existed somewhere in the literature. But finding it required manually cross-referencing multiple databases, reading through hundreds of papers, and hoping you hadn't missed something critical.
This wasn't an isolated incident. It was Tuesday.
The Search Problem Nobody Talks About
Researchers spend substantial portions of their time not doing research, but searching for information about research that's already been done.
During my PhD at Princeton and a decade in the pharmaceutical industry, I watched this pattern repeat. Smart, experienced people spent a significant share of their working time on literature reviews. Not because they lacked ability, but because the tools weren't built for the way chemists think.
How Chemists Actually Think
Chemists think in structures, not words.
When you're trying to understand if a molecule might work, your mental model is visual. You see the benzene ring, the functional groups, the stereochemistry. The structure is the information.
But when you search, you have to translate that into text. Keywords. Boolean operators. You hope the papers used your terminology. You get thousands of results and start filtering.
It's like searching for images by describing them in words.
The Manual Curation Bottleneck
CAS, the organization behind SciFinder, employs approximately 800 chemists to manually read papers, extract information, and tag it. They've built an impressive resource: 165 million substances, literature back to 1907.
But human curation can't scale with literature growth. It introduces latency. And it's expensive, which is why access costs tens of thousands per year, pricing out individual researchers.
More importantly, manual curation struggles with non-obvious connections. A molecule studied for cancer in one paper, a target in autoimmune disease in another, a mechanism linking them in a third. These connections exist, but they're hard to find.
What Failure Rates Tell Us
Ninety percent of drug candidates fail during clinical development. The leading cause is lack of efficacy (40-50%), followed by toxicity (30%), poor drug-like properties (10-15%), and commercial issues (10%).
This isn't just a biology problem. It's an information problem. When making target selection decisions, you're working with incomplete information because comprehensive searches are impractical. You do the best review you can, but you know you're missing things.
Better decisions require better information access. Better information access requires better search.
The Fasudil Example
Fasudil is a Rho kinase inhibitor developed for cerebral vasospasm. Years later, researchers discovered it showed promise for spinal cord injuries through the same mechanism.
Same molecule, different disease, both connections in the literature the whole time. But connecting them required someone familiar with both neurovascular and spinal injury literature.
How many of these connections exist right now that nobody's found? The chemical literature contains 120 million described molecules. Only 60,000 have become drugs. That is one drug for every 2,000 described molecules, a ratio that suggests enormous untapped potential.
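One way to picture how a connection like this could be surfaced automatically is to treat findings from separate papers as links in a small graph and look for two-step paths. The sketch below is a toy illustration in Python, not a description of any real system. The facts are a hand-coded version of the fasudil example, and the names FACTS and indirect_links, along with the "paper A/B/C" source labels, are hypothetical placeholders.

```python
# Hypothetical extracted facts: (subject, relation, object, source).
# The source labels are placeholders, not real citations.
FACTS = [
    ("fasudil", "inhibits", "Rho kinase", "paper A"),
    ("Rho kinase", "implicated_in", "cerebral vasospasm", "paper B"),
    ("Rho kinase", "implicated_in", "spinal cord injury", "paper C"),
]


def indirect_links(facts, start):
    """Return (start, via, end, sources) for every two-step path from start."""
    # Index each fact by its subject so we can follow links outward.
    outgoing = {}
    for subj, _relation, obj, source in facts:
        outgoing.setdefault(subj, []).append((obj, source))

    links = []
    for via, first_source in outgoing.get(start, []):
        for end, second_source in outgoing.get(via, []):
            if end != start:
                links.append((start, via, end, (first_source, second_source)))
    return links


for start, via, end, sources in indirect_links(FACTS, "fasudil"):
    print(f"{start} -> {via} -> {end} (from {sources[0]} + {sources[1]})")
```

Each printed path combines two facts that no single paper states together. With three facts the answer is obvious. The hard part in practice is extracting reliable facts from millions of papers, which this sketch does not attempt.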
Why I Started Molekula
After a decade of watching this, I concluded that incremental improvements wouldn't solve the problem. The issue isn't better keyword search. It's the entire paradigm of text-based search for structural information.
What if you could draw the structure (or upload an image, or paste SMILES) and ask questions in natural language? "What synthesis routes exist for this?" "Show me similar molecules with better properties." "Has this been tested against kinases?"
What if the system understood chemistry structurally and could find connections that would take humans months to identify?
That's Molekula. An AI synthesis assistant that understands molecular structures as primary input and reasons across the entire chemistry literature.
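To make "understands chemistry structurally" a little more concrete, here is a minimal sketch of one standard idea from cheminformatics: comparing molecules by the overlap of their structural features (the Tanimoto coefficient) rather than by matching words. It is an illustration only, not Molekula's implementation. The feature sets and compound names are hand-written and hypothetical. A real system would derive fingerprints from a drawn structure or a SMILES string using a cheminformatics toolkit.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # Two empty feature sets: define similarity as zero.
    return len(a & b) / len(union)


# Hypothetical, hand-written feature sets standing in for real fingerprints.
LIBRARY = {
    "compound_1": {"benzene_ring", "sulfonamide", "secondary_amine"},
    "compound_2": {"benzene_ring", "carboxylic_acid"},
    "compound_3": {"pyridine_ring", "ester"},
}


def most_similar(query_features, library, top_n=2):
    """Rank library compounds by Tanimoto similarity to the query."""
    scored = [
        (tanimoto(query_features, features), name)
        for name, features in library.items()
    ]
    # Highest score first; break ties alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, round(score, 2)) for score, name in scored[:top_n]]


query = {"benzene_ring", "sulfonamide", "tertiary_amine"}
print(most_similar(query, LIBRARY))
# prints [('compound_1', 0.5), ('compound_2', 0.25)]
```

The ranking never looks at what a paper called the compound, only at which features the structures share. That is the shift from keyword matching to structure matching.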
Starting With Synthesis Planning
We're beginning with the most concrete problem: synthesis route design.
Every chemistry lab faces this daily. You have a target molecule. You need to know how to make it. What routes are viable? What starting materials are available? What's been tried?
This is a well-defined problem with clear success criteria. Either the route works or it doesn't. Fast feedback means rapid AI improvement.
But synthesis planning is just the foundation. The same technology that finds synthesis routes can find molecule-target relationships, disease-target associations, and unclaimed therapeutic opportunities.
The goal isn't replacing chemists. It's eliminating the data archaeology so they can focus on what requires human expertise: designing experiments, interpreting results, making strategic decisions.
The Path Forward
We're launching beta in February. The first 100-200 graduate students get lifetime free access.
We're starting with students because they're doing real research with real constraints. They don't have pharma budgets. They need answers today. And importantly, they become tomorrow's pharmaceutical researchers.
Pricing will be accessible: closer to a ChatGPT subscription than to an institutional database license, but for specialized chemistry knowledge. AI economics are different. Marginal cost per user is essentially zero.
Beyond synthesis planning, we're building toward comprehensive chemistry knowledge search, automated property prediction, IP-free molecular design, and eventually full automated knowledge inference for drug discovery.
What Changes
Chemistry research starts to look different.
You don't spend Tuesday searching for whether someone tested a compound. You ask and get an answer in seconds with citations.
You don't manually design routes hoping you haven't missed better approaches. You see multiple routes with reasoning, backed by extracted knowledge.
You don't miss non-obvious connections spanning different fields. The system finds them systematically.
The time spent on literature searches gets redirected to actual research. Clinical failure rates improve because decisions use more complete information. Drug discovery accelerates because the bottleneck shifts from information access back to experimental validation.
That's what we're building. Not incremental improvements to 1990s architecture, but rethinking chemistry search from first principles with AI that understands structures.
Anatoly Cherkinsky, PhD
Founder, Molekula.ai
Princeton Chemistry PhD | 10 Years Pharma
Interested in beta access? Visit molekula.ai or reach out on LinkedIn. We're looking for graduate students and early-career researchers in chemistry.