Eurus-7B-KTO: An LLM Optimized for Reasoning
Eurus-7B-KTO is a 7-billion-parameter language model from OpenBMB, fine-tuned with Kahneman-Tversky Optimization (KTO) on the UltraInteract and UltraFeedback datasets. It builds on Eurus-7B-SFT and is designed to strengthen reasoning, particularly mathematical problem-solving and code generation.
Key Capabilities & Performance
- Superior Reasoning: Achieves best-in-class performance among open-source models of similar size, often outperforming specialized models in their respective domains.
- Efficiency: Notably, Eurus-7B-KTO demonstrates performance comparable to or better than models up to 5 times larger, and even surpasses GPT-3.5 Turbo in some evaluations.
- Enhanced Multi-turn Interaction: Preference learning with UltraInteract significantly improves its ability to handle complex, multi-turn conversations and tasks.
- Specialized Prompting: Uses tailored prompt formats for coding and math, Chain-of-Thought (CoT) and Program-of-Thought (PoT), to maximize performance in these areas.
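The tailored prompting can be sketched as a small helper that wraps a question in a task-specific instruction. This is a minimal sketch: the `[INST] ... [/INST]` wrapper assumes the Mistral-style chat format of the base model, and the instruction wording is illustrative rather than the official template; check the model card for the exact format.

```python
# Sketch of task-specific prompt construction for Eurus-7B-KTO.
# Assumption: the [INST] ... [/INST] wrapper follows the Mistral-style
# chat format of the base model; the instruction text is illustrative.

def build_prompt(question: str, mode: str = "cot") -> str:
    """Wrap a question in a reasoning instruction.

    mode="cot": ask for step-by-step natural-language reasoning.
    mode="pot": ask for a Python program whose execution yields the answer.
    """
    if mode == "cot":
        instruction = f"Solve the following problem step by step:\n{question}"
    elif mode == "pot":
        instruction = (
            "Write a Python program that solves the following problem "
            f"and prints the answer:\n{question}"
        )
    else:
        raise ValueError(f"unknown mode: {mode}")
    return f"[INST] {instruction} [/INST]"

prompt = build_prompt("What is 17 * 24?", mode="pot")
```

Switching `mode` selects between the two reasoning styles without changing any downstream generation code.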
Use Cases
- Mathematical Problem Solving: Excels at step-by-step math problems, supporting both Chain-of-Thought (CoT) and Program-of-Thought (PoT) approaches with Python interpreter integration.
- Code Generation: Highly effective for generating Python code based on given instructions.
- Complex Reasoning Tasks: Suitable for applications requiring robust logical deduction and multi-step thinking.
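The PoT workflow above can be sketched as a small execute-and-capture loop: the model emits Python source, a local interpreter runs it, and the printed result becomes the final answer. This is a simplified illustration, not the project's actual harness; `generated_code` stands in for real model output.

```python
# Minimal sketch of the Program-of-Thought loop: execute model-generated
# Python and capture what it prints as the answer.
import io
import contextlib

def run_pot_answer(generated_code: str) -> str:
    """Execute a Python snippet and return its captured stdout."""
    buffer = io.StringIO()
    # NOTE: exec() runs arbitrary code; untrusted model output should be
    # sandboxed in production rather than executed directly like this.
    with contextlib.redirect_stdout(buffer):
        exec(generated_code, {})
    return buffer.getvalue().strip()

# Stand-in for code the model might generate for "What is 17 * 24?"
generated_code = "print(17 * 24)"
answer = run_pot_answer(generated_code)  # "408"
```

A real pipeline would also handle runtime errors in the generated code and feed them back to the model in a multi-turn loop, which is exactly the interaction pattern UltraInteract trains for.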
For more details, refer to the original paper and the Eurus Collection.