thomas-yanxin/XinYuan-Qwen2.5-7B-0917
XinYuan-Qwen2.5-7B-0917: Data Quality Validation Model
XinYuan-Qwen2.5-7B-0917 is a 7.6 billion parameter model built on the Qwen2.5 architecture, developed by thomas-yanxin. Its core purpose is to demonstrate the impact of high-quality, meticulously governed SFT (supervised fine-tuning) data on model performance: the developers emphasize that superior data quality alone, even without complex training methodologies, can yield substantial improvements in model results.
Key Capabilities & Performance
This model exhibits robust performance across a diverse set of benchmarks, indicating strong general-purpose capabilities:
- English Language Understanding: Achieves 73.72 on MMLU, 33.04 on GPQA, 67.55 on BBH, and 91.19 on ARC-C.
- Chinese Language Understanding: Scores 81.02 on C-EVAL and 80.06 on CMMLU.
- Mathematical Reasoning: Demonstrates proficiency with 82.94 on GSM8K and 41.06 on MATH.
- Code Generation: Performs well on coding tasks, scoring 50.6 on MBPP and 83.99 on HumanEval.
- Instruction Following: Achieves 40.48 on IFEval (Prompt Strict-Acc.).
Good For
- Research into Data Governance: Ideal for researchers and developers interested in the impact of data quality on LLM performance.
- General-Purpose Applications: Suitable for tasks requiring strong performance in English and Chinese language understanding, mathematical problem-solving, and code generation.
- Benchmarking: Can serve as a strong baseline model for evaluating the effectiveness of different SFT datasets and methodologies.
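As a baseline for the use cases above, the model can be loaded for inference in the usual way. The sketch below assumes the model follows the standard Qwen2.5 chat template and the Hugging Face `transformers` API (the model card does not spell this out); the `build_messages` helper and the system prompt are illustrative, not part of the release.

```python
MODEL_ID = "thomas-yanxin/XinYuan-Qwen2.5-7B-0917"


def build_messages(user_prompt: str) -> list[dict]:
    """Assemble a chat-format message list (illustrative helper)."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_prompt},
    ]


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    # Deferred import: transformers is only needed when actually running inference.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # First run downloads ~15 GB of weights; a GPU is needed for practical speed.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype="auto"
    )
    text = tokenizer.apply_chat_template(
        build_messages(prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens from the generated sequence before decoding.
    new_tokens = output_ids[0][inputs.input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Solve: 12 * 7 + 5 = ?"))
```

For SFT-dataset comparisons, the same `generate` loop can be pointed at each fine-tuned checkpoint by swapping `MODEL_ID`, keeping the prompting setup identical across runs.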