PKU-ML/G1-7B
PKU-ML/G1-7B is a 7.62-billion-parameter causal language model based on the Qwen2.5-Instruct architecture and fine-tuned with Group Relative Policy Optimization (GRPO) for graph reasoning tasks. It shows significant improvements on graph reasoning benchmarks such as Erdős while generalizing to unseen graph tasks and preserving general reasoning ability, making it well suited to complex graph-related problem solving and analysis.
G1-7B: Graph Reasoning LLM
G1-7B is a 7.62-billion-parameter large language model developed by PKU-ML on the Qwen2.5-Instruct architecture. It is optimized for complex graph reasoning tasks through a two-stage training process: an initial supervised fine-tuning (SFT) step followed by reinforcement learning with Group Relative Policy Optimization (GRPO).
Key Capabilities
- Exceptional Graph Reasoning: Achieves up to a 46% improvement over baselines on the Erdős benchmark, with the 7B variant matching OpenAI's o3-mini and the 3B variant surpassing Qwen2.5-72B-Instruct on these tasks.
- Strong Generalization: Demonstrates zero-shot generalization to novel graph tasks, improving performance on other graph reasoning benchmarks (GraphWiz, GraphArena) and real-world graphs (Cora, PubMed).
- Preserves General Reasoning: Crucially, G1-7B maintains its general reasoning capabilities across standard benchmarks such as GSM8K, MATH, and MMLU-Pro, ensuring versatility beyond its specialized graph focus.
- Architecture: Uses the Qwen2.5-Instruct architecture, trained with an SFT stage followed by GRPO-based RL (see the loading sketch after this list).
- Context Length: Supports a context window of 32,768 tokens for input and up to 8,192 tokens of generation.
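
The snippet below is a minimal loading sketch, assuming G1-7B exposes the standard transformers causal-LM interface like other Qwen2.5-based checkpoints and that its tokenizer ships a chat template; the graph question is an illustrative prompt, not an official evaluation format.

```python
# Minimal loading sketch using Hugging Face transformers.
# Assumption: the tokenizer ships a Qwen2.5-style chat template;
# check the repository files for the recommended prompt format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PKU-ML/G1-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 weights on a single GPU
    device_map="auto",
)

# Illustrative graph reasoning prompt: a plain edge-list description.
messages = [{
    "role": "user",
    "content": "In an undirected graph, the edges are: "
               "(0,1), (1,2), (2,3), (0,3). Is there a cycle in the graph?",
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The card states up to 8,192 generated tokens are supported.
outputs = model.generate(inputs, max_new_tokens=8192)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

In bfloat16, the 7.62B parameters take roughly 15 GB of GPU memory; lower torch_dtype precision or quantization may be needed on smaller devices.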
Good for
- Applications requiring advanced graph analysis and reasoning.
- Research and development on integrating LLMs with graph problems and graph neural networks (a serialization sketch follows this list).
- Tasks involving complex relational data and structural inference.
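
As a concrete integration example, here is a hypothetical helper that serializes a NetworkX graph into the plain edge-list phrasing used in the loading sketch above; the function name and prompt wording are assumptions for illustration, not the official G1 prompt template.

```python
# Hypothetical helper: turn a NetworkX graph into a text prompt so that
# real graphs (e.g. citation networks) can be posed to the model.
# The phrasing is illustrative; the official G1 templates may differ.
import networkx as nx

def graph_to_prompt(graph: nx.Graph, question: str) -> str:
    edges = ", ".join(f"({u},{v})" for u, v in graph.edges())
    return (
        f"In an undirected graph, the nodes are 0 to "
        f"{graph.number_of_nodes() - 1} and the edges are: {edges}. {question}"
    )

# Example: ask about shortest paths on a 5-node cycle graph.
g = nx.cycle_graph(5)
print(graph_to_prompt(g, "What is the shortest path length from node 0 to node 2?"))
```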