PKU-ML/G1-3B
Text generation · Concurrency cost: 1 · Model size: 3.1B · Quantization: BF16 · Context length: 32k · Published: May 31, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights

PKU-ML/G1-3B is a 3.09 billion parameter causal language model developed by PKU-ML, based on the Qwen2.5-Instruct architecture. It is fine-tuned with reinforcement learning via Group Relative Policy Optimization (GRPO) to excel at graph reasoning tasks. The model shows significant improvements on graph reasoning benchmarks such as Erdős, generalizes well to unseen graph tasks, and preserves general reasoning abilities.
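Since the model is based on Qwen2.5-Instruct, it can presumably be queried through the standard Hugging Face transformers chat interface. The sketch below shows one way to do this; the example prompt and the `answer` helper are illustrative assumptions, not part of the official model card, and `torch_dtype="bfloat16"` matches the BF16 quantization listed above.

```python
MODEL_ID = "PKU-ML/G1-3B"


def answer(prompt: str, max_new_tokens: int = 512) -> str:
    """Run one chat turn against G1-3B.

    Imports are deferred into the function so this sketch stays
    importable even where transformers/torch are not installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="bfloat16")

    # Build a single-turn chat prompt using the model's own chat template.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)


# Hypothetical graph-reasoning query (example task, not from the model card):
# answer("Edges: (0,1), (1,2), (2,3). Is there a path from node 0 to node 3?")
```

With the 32k context window, larger graphs can be passed as edge lists in the same single-turn format.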
