SciCore-Mol: LLM Augmented for Molecular Cognition

SciCore-Mol is an 8B-parameter model developed by Tsinghua University and Peking University, designed to overcome the limitations of general-purpose LLMs in processing complex scientific data, particularly molecular structures. Standard LLMs struggle with the topological and geometric nature of molecules when these are represented as linear text; SciCore-Mol instead integrates specialized external cognitive modules.

Key Capabilities & Architecture

Pluggable Modules: Augments the base LLM with a GVP encoder for molecular structure encoding, a diffusion generator for molecular generation, and a numerically sensitive Reaction Transformer for tasks such as yield prediction.
Two-Stage Alignment: Uses a two-stage alignment mechanism in which external modules are invoked via special tokens and their outputs are fused at the hidden-state level, preserving the LLM's general reasoning while adding specialized molecular perception.
Three-Stage Training Pipeline: Involves independent component pre-training, cross-modal alignment training (SFT), and task-specific fine-tuning.
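
The special-token invocation and hidden-state fusion described above can be illustrated with a toy sketch. This is not the SciCore-Mol implementation: the token name `<mol>`, the function `fuse_hidden_states`, and the stand-in `toy_structure_encoder` are all hypothetical, and the real system would use a learned encoder (such as the GVP module) plus a learned projection into the LLM's hidden space. The sketch only shows the general idea of overwriting the hidden state at a placeholder position with an external module's output.

```python
from typing import Callable, List

Vector = List[float]

# Hypothetical placeholder token; the card does not name the actual special tokens.
MOL_TOKEN = "<mol>"


def toy_structure_encoder(smiles: str, dim: int) -> Vector:
    """Stand-in for an external module (NOT the real GVP encoder).

    Folds scaled character codes of the SMILES string into `dim` slots so the
    example is deterministic and dependency-free.
    """
    vec = [0.0] * dim
    for i, ch in enumerate(smiles):
        vec[i % dim] += ord(ch) / 1000.0
    return vec


def fuse_hidden_states(
    tokens: List[str],
    hidden: List[Vector],
    molecules: List[str],
    encoder: Callable[[str, int], Vector],
) -> List[Vector]:
    """Return a copy of `hidden` where each MOL_TOKEN position holds encoder output.

    Molecules are consumed in order, one per MOL_TOKEN occurrence.
    """
    fused = [row[:] for row in hidden]  # copy so the LLM's own states stay untouched
    mol_iter = iter(molecules)
    for pos, tok in enumerate(tokens):
        if tok == MOL_TOKEN:
            smiles = next(mol_iter)
            fused[pos] = encoder(smiles, len(hidden[pos]))
    return fused


tokens = ["Predict", "the", "yield", "of", MOL_TOKEN]
hidden = [[0.0] * 4 for _ in tokens]  # pretend these came from the LLM embedding layer
fused = fuse_hidden_states(tokens, hidden, ["CCO"], toy_structure_encoder)
```

Only the position holding the placeholder token changes; every other hidden state passes through unmodified, which is one way to read the card's claim that general reasoning is preserved while molecular perception is added.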

Specialized Use Cases

Molecular Reasoning: Excels in tasks requiring deep understanding of molecular information, such as product prediction, retrosynthesis, and yield prediction.
Drug Discovery: Applicable to drug optimization tasks, including ADMET scoring.
Scientific Research: Evaluated on chemistry benchmarks including ChemBench4K, MMLU Chemistry subsets, ORD Reaction Prediction, and SMolInstruct molecular tasks.
