G1-7B: Graph Reasoning LLM
G1-7B is a 7.62-billion-parameter large language model developed by PKU-ML, built upon the Qwen2.5-Instruct architecture. It is designed and optimized for complex graph reasoning tasks, trained in two stages: an initial supervised fine-tuning (SFT) step followed by reinforcement learning with Group Relative Policy Optimization (GRPO).
Key Capabilities
- Exceptional Graph Reasoning: Achieves up to 46% improvement over baselines on the Erdős benchmark, with the 7B variant matching OpenAI's o3-mini and the 3B variant surpassing Qwen2.5-72B-Instruct on these tasks.
- Strong Generalization: Demonstrates zero-shot generalization to novel graph tasks, improving performance on other graph reasoning benchmarks (GraphWiz, GraphArena) and real-world graphs (Cora, PubMed).
- Preserves General Reasoning: Crucially, G1-7B retains its general reasoning ability on standard benchmarks such as GSM8K, MATH, and MMLU-Pro, so its graph specialization does not come at the cost of versatility.
- Architecture: Qwen2.5-Instruct backbone, trained with the two-stage SFT and RL (GRPO) pipeline described above.
- Context Length: Supports a 32,768-token context window, with up to 8,192 tokens of generation.
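As a sketch of how the model might be used for a graph reasoning task, the snippet below serializes an edge list into a plain-text prompt and shows a standard `transformers` chat-style inference call. Note the assumptions: the edge-list prompt format here is illustrative, not the model's official one, and the Hugging Face repository id `PKU-ML/G1-7B` is assumed rather than confirmed by this card.

```python
# Sketch: preparing a graph-reasoning prompt for G1-7B.
# Assumptions (not from the model card): the prompt format below and
# the Hugging Face repo id "PKU-ML/G1-7B".

def format_graph_prompt(edges, question):
    """Serialize an undirected edge list into a plain-text question."""
    edge_text = ", ".join(f"({u}, {v})" for u, v in edges)
    return (
        "You are given an undirected graph with the following edges: "
        f"{edge_text}. {question}"
    )

prompt = format_graph_prompt(
    edges=[(0, 1), (1, 2), (2, 0), (2, 3)],
    question="Is the graph connected? Answer yes or no.",
)

# Inference with the (assumed) checkpoint, using the standard
# transformers chat-template API:
#
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("PKU-ML/G1-7B")
# model = AutoModelForCausalLM.from_pretrained("PKU-ML/G1-7B")
# messages = [{"role": "user", "content": prompt}]
# inputs = tokenizer.apply_chat_template(
#     messages, add_generation_prompt=True, return_tensors="pt"
# )
# # max_new_tokens capped at the model's 8,192-token generation limit
# outputs = model.generate(inputs, max_new_tokens=8192)
# print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Keeping the graph serialization in a small helper makes it easy to swap in whatever edge-list format the model was actually trained on.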
Good for
- Applications requiring advanced graph analysis and reasoning.
- Research and development in graph neural networks and LLM integration for graph problems.
- Tasks involving complex relational data and structural inference.