OpenDFM/RetroDFM-R-8B

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Jan 12, 2026 | License: gpl-3.0 | Architecture: Transformer | Open Weights

OpenDFM/RetroDFM-R-8B is an 8-billion-parameter, reasoning-driven large language model developed by OpenDFM for chemical retrosynthesis prediction. Unlike traditional graph-based or sequence models, it is trained with large-scale reinforcement learning on chemically verifiable rewards, which improves generalization, reliability, and interpretability. The model reconstructs multistep chemical routes, gives explicit, human-interpretable reasoning for its predictions, and outperforms existing state-of-the-art approaches on standard benchmarks.

RetroDFM-R: Reasoning-Driven Retrosynthesis Prediction

RetroDFM-R-8B is an 8-billion-parameter large language model developed by OpenDFM for chemical retrosynthesis. It differs from traditional graph-based or sequence models by taking a reasoning-driven approach: it produces explicit reasoning alongside its predictions, which improves both predictive accuracy and interpretability.

Key Capabilities & Features

  • Reinforcement Learning Integration: Uses large-scale reinforcement learning with chemically verifiable rewards, which yields stronger generalization and more reliable predictions.
  • Enhanced Interpretability: Provides explicit reasoning processes, offering clear, human-interpretable insights into retrosynthesis planning.
  • Superior Performance: Outperforms existing state-of-the-art approaches on standard benchmarks, confirmed by comprehensive evaluations.
  • Chemical Plausibility: Double-blind human assessments validate the chemical plausibility and practical usefulness of its predictions.
  • Multistep Route Reconstruction: Successfully reconstructs complex multistep routes for real drug molecules and materials reported in scientific literature.
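One common way to turn a single-step retrosynthesis predictor into a multistep planner is to expand the target recursively until every leaf is a purchasable building block. The sketch below illustrates that general idea only; it is not the procedure used by RetroDFM-R. The names `plan_route`, `toy_predict`, `TOY_TABLE`, and `PURCHASABLE` are hypothetical, and the lookup table stands in for a real model call.

```python
from typing import Callable, Dict, List, Optional, Set

# Hypothetical stand-in for a single-step model: product SMILES -> reactant SMILES.
# Reagents (e.g. the chlorinating agent) are omitted for brevity.
TOY_TABLE: Dict[str, List[str]] = {
    "CCOC(C)=O": ["CCO", "CC(=O)Cl"],  # ethyl acetate <- ethanol + acetyl chloride
    "CC(=O)Cl": ["CC(=O)O"],           # acetyl chloride <- acetic acid
}
PURCHASABLE: Set[str] = {"CCO", "CC(=O)O"}  # assumed building-block stock


def toy_predict(product: str) -> List[str]:
    """Return the reactants for a product, or an empty list if none are known."""
    return TOY_TABLE.get(product, [])


def plan_route(
    target: str,
    predict: Callable[[str], List[str]],
    purchasable: Set[str],
    max_depth: int = 5,
) -> Optional[dict]:
    """Depth-first expansion; returns a nested route tree, or None if no route is found."""
    if target in purchasable:
        return {"molecule": target, "reactants": []}
    if max_depth == 0:
        return None  # depth budget exhausted before reaching the stock
    reactants = predict(target)
    if not reactants:
        return None  # the predictor has no suggestion for this molecule
    children = []
    for reactant in reactants:
        subtree = plan_route(reactant, predict, purchasable, max_depth - 1)
        if subtree is None:
            return None  # one unsolved reactant invalidates this step
        children.append(subtree)
    return {"molecule": target, "reactants": children}


def leaves(route: dict) -> List[str]:
    """Collect the starting materials (leaf molecules) of a route tree."""
    if not route["reactants"]:
        return [route["molecule"]]
    found: List[str] = []
    for child in route["reactants"]:
        found.extend(leaves(child))
    return found


route = plan_route("CCOC(C)=O", toy_predict, PURCHASABLE)
```

A practical planner would consider several candidate disconnections per molecule and rank them; this sketch follows only the first suggestion to keep the control flow visible.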

Training Methodology

RetroDFM-R is trained via a three-stage pipeline:

  1. Continual Pretraining: Continues pretraining on retrosynthesis-specific chemical data.
  2. Supervised Fine-tuning: Fine-tunes on distilled chain-of-thought reasoning samples.
  3. Reinforcement Learning: Further refines step-by-step reasoning and prediction quality using chemically verifiable rewards.
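To make the idea of a verifiable reward in the reinforcement learning stage concrete, here is a minimal sketch of a rule-based reward that can be checked automatically. It is an illustration under stated assumptions, not the reward used to train RetroDFM-R: the `<answer>` tag, the function names, and the all-or-nothing scoring are hypothetical. It also compares SMILES as plain strings, whereas a real pipeline would first canonicalize them with a cheminformatics toolkit.

```python
import re
from typing import Tuple

# Hypothetical output convention: the final reactants appear inside <answer>...</answer>.
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def normalize_reactants(smiles: str) -> Tuple[str, ...]:
    """Split dot-separated SMILES into a sorted tuple so reactant order does not matter."""
    return tuple(sorted(part.strip() for part in smiles.split(".") if part.strip()))


def retrosynthesis_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the predicted reactant set matches the reference, else 0.0."""
    match = ANSWER_RE.search(completion)
    if match is None:
        return 0.0  # no parseable answer, so nothing can be verified
    predicted = normalize_reactants(match.group(1))
    if not predicted:
        return 0.0  # empty answer
    # String-level comparison only (assumption: both sides use the same SMILES form).
    return 1.0 if predicted == normalize_reactants(reference) else 0.0
```

Because the check is deterministic, the same completion always receives the same score, which is what makes a reward of this kind verifiable rather than learned.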

Use Cases

This model is intended for researchers and chemists who need reliable, interpretable retrosynthesis predictions, particularly when designing synthetic routes for drug molecules and complex materials. Because it shows its step-by-step reasoning, users can inspect and validate each proposed synthetic pathway.
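For readers who want to try the model locally, the following is a minimal sketch of querying it through the Hugging Face `transformers` library. It assumes the weights are available under the repository name shown in the title, that `transformers`, `torch`, and `accelerate` are installed, and that enough GPU memory is available. The prompt wording in `build_prompt` is hypothetical; the prompt format the model actually expects should be taken from its official documentation.

```python
MODEL_ID = "OpenDFM/RetroDFM-R-8B"


def build_prompt(product_smiles: str) -> str:
    """Build a plain-text request. The wording is a placeholder, not the official format."""
    return (
        "Predict the reactants needed to synthesize the following product "
        "and explain your reasoning step by step.\n"
        f"Product SMILES: {product_smiles}\n"
    )


def predict_reactants(product_smiles: str, max_new_tokens: int = 1024) -> str:
    """Generate the model's reasoning and answer for a single product molecule."""
    # Imported lazily so the prompt helper can be used without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(build_prompt(product_smiles), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so that only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `predict_reactants("CCOC(C)=O")` would download the weights on first use, so it is best run once and the loaded model reused for subsequent queries.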