May 07, 2026 ChainGPT

Synthegy: LLM "Oracle" Ranks Chemical Synthesis Paths — Composable, Open, and Fast

Designing a molecule from scratch is one of chemistry’s toughest engineering problems: it’s not just which atoms to join, but the sequence of reactions, when to shield fragile groups, and how to avoid dead ends that can waste months. A team at EPFL wants to outsource that strategic know-how to a language model. This week in Matter, Philippe Schwaller and colleagues unveiled Synthegy, a framework that uses large language models (LLMs) as reasoning engines for synthesis planning. The twist: instead of asking AI to invent molecules, Synthegy asks AI to judge and rank the candidate synthesis routes that conventional retrosynthesis software already proposes.

How it works

- A chemist types a plain-English constraint (for example, “form the pyrimidine ring in the early stages”).
- Retrosynthesis tools generate dozens or hundreds of possible routes by breaking the target molecule into simpler building blocks.
- Synthegy translates each route into text and hands those descriptions to an LLM, which scores every route against the chemist’s instruction and produces human-readable explanations.

The result: the most strategically sensible routes float to the top, plus a rationale for each choice.

“When making tools for chemists, the user interface matters a lot, and previous tools relied on cumbersome filters and rules,” says lead author Andres M. Bran.

Validation and benchmarks

Synthegy was validated in a double-blind study with 36 independent chemists who reviewed 368 route pairs. The chemists chose the same routes Synthegy ranked highest 71.2% of the time, roughly on par with inter-expert agreement. Senior researchers matched the system more often than PhD students, suggesting the model captures high-level strategic instincts that come with experience.

The team trialed multiple models, including GPT-4o, Claude, and the open-source DeepSeek-r1. Gemini-2.5-pro topped the benchmark, while DeepSeek-r1 emerged as a strong local, open alternative.
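
To make the ranking loop described under "How it works" concrete, here is a minimal Python sketch. It is not Synthegy's actual code or API: the names `Route`, `route_to_text`, `rank_routes`, and `toy_scorer` are hypothetical, and the scorer is a keyword-based stand-in for the LLM call, which a real setup would replace with a request to whichever model is in use.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Route:
    """A candidate synthesis route (hypothetical structure, for illustration)."""
    name: str
    steps: List[str]  # short step descriptions, first step first


def route_to_text(route: Route) -> str:
    """Serialize a route into a numbered plain-text description."""
    return "\n".join(
        f"Step {i}: {step}" for i, step in enumerate(route.steps, start=1)
    )


# A scorer maps (constraint, route_text) to (score, rationale).
# In a real pipeline this would wrap an LLM call; here it is an assumption.
Scorer = Callable[[str, str], Tuple[float, str]]


def rank_routes(
    constraint: str, routes: List[Route], scorer: Scorer
) -> List[Tuple[float, str, str]]:
    """Score every route against the constraint and sort best-first."""
    scored = []
    for route in routes:
        score, rationale = scorer(constraint, route_to_text(route))
        scored.append((score, route.name, rationale))
    return sorted(scored, key=lambda item: item[0], reverse=True)


def toy_scorer(constraint: str, route_text: str) -> Tuple[float, str]:
    """Stand-in for the LLM: ignores the constraint text and simply
    rewards routes whose pyrimidine-forming step appears early."""
    lines = route_text.splitlines()
    for position, line in enumerate(lines):
        if "pyrimidine" in line.lower():
            score = 10.0 * (1 - position / len(lines))
            return score, f"Pyrimidine ring formed at step {position + 1} of {len(lines)}."
    return 0.0, "No pyrimidine-forming step found."


# Toy data, invented for this example only.
routes = [
    Route("late-ring", ["amide coupling", "Boc deprotection", "pyrimidine ring formation"]),
    Route("early-ring", ["pyrimidine ring formation", "Suzuki coupling", "Boc deprotection"]),
]
ranking = rank_routes("form the pyrimidine ring in the early stages", routes, toy_scorer)
for score, name, why in ranking:
    print(f"{score:5.1f}  {name}: {why}")
```

Because the scorer is passed in as a plain callable, swapping one model for another changes only that one argument, and the route generator and the ranking logic stay untouched.
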
Synthegy is deliberately modular: any retrosynthesis engine can sit on the backend, and any capable LLM can power the reasoning layer.

Beyond route ranking: mechanism checking

Synthegy also tackles reaction mechanism elucidation. It breaks reactions into elementary electron-movement steps and asks the LLM to assess each step’s chemical plausibility. On straightforward reaction types like nucleophilic substitutions, top models achieved near-perfect accuracy.

Practical numbers and caveats

- Speed/cost: evaluating ~60 candidate routes takes about 12 minutes and costs roughly $2–3 in API fees.
- Limits: LLMs sometimes misread reaction direction from text, leading to wrong feasibility calls; small models perform like random guessing; and routes longer than ~20 steps become hard to track coherently.

Open science and implications

The code and benchmarks are publicly available at github.com/schwallergroup/steer. With its composable design and explainable outputs, Synthegy could accelerate drug discovery, materials design, and industrial process optimization, essentially serving as an “oracle” that ranks feasible chemical paths and explains why, reducing trial-and-error in the lab. For a field where intuition often lives only in senior scientists’ heads, that’s a practical power-up.