newsbang/Homer-v1.0-Qwen2.5-7B
Homer-v1.0-Qwen2.5-7B: Instruction-Tuned for Enhanced Reasoning
Homer-v1.0-Qwen2.5-7B is a 7.6 billion parameter language model, fine-tuned by newsbang from the Qwen2.5-7B base model. This iteration leverages a significant volume of instruction-based data to enhance its capabilities, particularly in complex reasoning tasks. A key aspect of its development includes a focus on mathematical problem-solving, with newsbang releasing a dedicated math subset of their training dataset and an analysis of data leakage in existing open-source math benchmarks.
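The snippet below is a minimal loading-and-generation sketch using Hugging Face transformers. It assumes the checkpoint follows the standard Qwen2.5 chat template and loads via AutoModelForCausalLM; check the repository's config files to confirm before relying on it.

```python
# Minimal sketch: load the model and run one chat turn.
# Assumes a standard Qwen2.5-style checkpoint; verify against the repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "newsbang/Homer-v1.0-Qwen2.5-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~15 GB of weights for 7.6B params in bf16
    device_map="auto",
)

messages = [{"role": "user", "content": "Summarize the binomial theorem in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```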
Key Capabilities & Performance
This model demonstrates competitive performance across several benchmarks, as evaluated on the Open LLM Leaderboard (a reproduction sketch follows the list):
- IFEval (0-Shot): Achieves 63.93% strict accuracy, indicating strong instruction following.
- BBH (3-Shot): Scores 37.81% normalized accuracy on the Big-Bench Hard suite, reflecting its reasoning abilities.
- MATH Lvl 5 (4-Shot): Attains 30.36% exact match accuracy, highlighting its specialized performance in advanced mathematics.
- MMLU-PRO (5-Shot): Reaches 39.27% accuracy, showcasing broad general knowledge and reasoning.
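These scores can, in principle, be sanity-checked locally with EleutherAI's lm-evaluation-harness, the backend of the Open LLM Leaderboard. The sketch below is hedged: the `leaderboard_*` task identifiers follow the leaderboard-v2 convention but vary across harness versions, so treat them as assumptions and confirm with `lm_eval --tasks list`.

```python
# Reproduction sketch with lm-evaluation-harness (pip install lm-eval).
# Task identifiers below are assumptions -- verify with `lm_eval --tasks list`.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=newsbang/Homer-v1.0-Qwen2.5-7B,dtype=bfloat16",
    tasks=["leaderboard_ifeval", "leaderboard_bbh",
           "leaderboard_math_hard", "leaderboard_mmlu_pro"],
    batch_size=4,
)
for task, metrics in results["results"].items():
    print(task, metrics)
```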
With a substantial context window of 131,072 tokens, Homer-v1.0-Qwen2.5-7B is well-suited for applications requiring deep contextual understanding and the processing of lengthy inputs.
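When feeding inputs anywhere near that limit, it is worth counting tokens up front and reserving room for the reply. A small sketch, reusing `tokenizer` and `model` from the loading example above (`report.txt` is a hypothetical input file):

```python
# Check that a long document fits inside the 131,072-token window
# before generating; `report.txt` is a hypothetical input file.
long_document = open("report.txt").read()
token_ids = tokenizer(long_document, return_tensors="pt").input_ids
print(f"Document length: {token_ids.shape[-1]} tokens")

CONTEXT_WINDOW = 131072
budget = CONTEXT_WINDOW - 512          # reserve space for the answer
if token_ids.shape[-1] > budget:
    token_ids = token_ids[:, :budget]  # naive truncation; chunking may serve better

output = model.generate(token_ids.to(model.device), max_new_tokens=512)
print(tokenizer.decode(output[0][token_ids.shape[-1]:], skip_special_tokens=True))
```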
When to Use This Model
- Mathematical Reasoning: Its fine-tuning and MATH Lvl 5 performance suggest strong capabilities for solving complex, multi-step mathematical problems (see the worked example after this list).
- Instruction Following: The high IFEval score indicates proficiency in adhering to detailed instructions.
- Long Context Tasks: The extensive 131,072-token context length makes it well suited for processing and generating content based on very long documents or conversations.
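As a concrete illustration of the mathematical-reasoning use case, the sketch below (reusing `tokenizer` and `model` from the loading example) sends a MATH-style problem through the chat template; the problem text is purely illustrative and not drawn from the model's training data.

```python
# Illustrative math prompt; asking for step-by-step work tends to
# help instruction-tuned models on multi-step problems.
problem = (
    "Let f(x) = x^2 - 4x + 7. Find the minimum value of f(x) and the "
    "value of x at which it occurs. Show your reasoning step by step."
)
messages = [{"role": "user", "content": problem}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=1024, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```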