dotvignesh/perry-7b
Perry-7B: A Reasoning-Focused LLM
Perry-7B is a 7 billion parameter language model developed by dotvignesh, based on the LLaMA 2 architecture. It began as a research project in September 2023, focused on improving reasoning capabilities through a novel training approach.
Key Capabilities & Training
The core innovation behind Perry-7B is its training methodology: it is fine-tuned on synthetic Chain-of-Thought (CoT) traces generated from STEM (Science, Technology, Engineering, and Mathematics) problems. Training on these traces teaches the model to reason step by step, which leads to stronger generalization across reasoning benchmarks. The fine-tuning itself was performed with compute-efficient methods.
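The card does not specify the exact trace format, but the idea of supervised CoT fine-tuning can be sketched as packaging each problem, its intermediate reasoning, and the final answer into one training string. In the minimal sketch below, the field layout and the function name are hypothetical, not Perry-7B's documented format:

```python
# Illustrative sketch: assembling a synthetic chain-of-thought (CoT)
# training example. The record layout and function name are assumptions;
# the model card does not document the actual trace format.

def format_cot_example(question: str, steps: list[str], answer: str) -> str:
    """Join a question, its reasoning steps, and the final answer into one
    training string, so the model learns to emit intermediate reasoning
    before committing to an answer."""
    reasoning = "\n".join(f"Step {i + 1}: {s}" for i, s in enumerate(steps))
    return f"Question: {question}\n{reasoning}\nAnswer: {answer}"

example = format_cot_example(
    question="A lab has 3 racks of 12 test tubes each. How many test tubes?",
    steps=["Each rack holds 12 test tubes.", "3 racks x 12 tubes = 36 tubes."],
    answer="36",
)
```

Fine-tuning on many such strings is what pushes the model toward producing its own step-by-step traces at inference time.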
Performance Highlights
As of September 2023, Perry-7B demonstrated significant improvements over the base LLaMA 2 7B model on key reasoning benchmarks:
- MMLU (5-shot): +2.38 points (46.18 vs 43.80)
- TruthfulQA (0-shot): +1.10 points (40.08 vs 38.98)
- GSM8K (5-shot): +4.93 points (10.31 vs 5.38)
These results indicate Perry-7B's enhanced ability in complex problem-solving and factual accuracy, particularly in quantitative and scientific reasoning tasks.
Ideal Use Cases
Perry-7B is particularly well-suited for applications requiring:
- Mathematical problem-solving: Its strong performance on GSM8K suggests proficiency in arithmetic and logical reasoning.
- Scientific and technical question answering: The STEM-focused training data makes it effective for understanding and generating responses in these domains.
- Tasks benefiting from step-by-step reasoning: any application where a clear, logical chain of thought is valuable can leverage this model's strengths.
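Since the model was tuned on step-by-step traces, prompts should explicitly request that kind of reasoning. Below is a minimal prompt-construction sketch; the instruction wording is an assumption, not an official prompt template, and loading the weights themselves (e.g. via Hugging Face Transformers from the dotvignesh/perry-7b repository) is left out to keep the sketch self-contained:

```python
# Minimal sketch: build a prompt that elicits step-by-step reasoning
# from a CoT-tuned model. The instruction wording is an assumption;
# Perry-7B's card does not publish an official prompt format.

def build_reasoning_prompt(question: str) -> str:
    """Wrap a problem statement in an instruction asking for
    intermediate reasoning followed by a final answer."""
    return (
        "Solve the following problem. Think step by step, "
        "then state the final answer on its own line.\n\n"
        f"Problem: {question}\n"
    )

prompt = build_reasoning_prompt(
    "If a train travels 60 km in 45 minutes, what is its average speed in km/h?"
)
# This string would then be tokenized and passed to the model for
# generation (model loading not shown here).
```

The same wrapper works for GSM8K-style arithmetic questions and for scientific QA, the two areas where the benchmark results above suggest the model is strongest.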