dfurman/Qwen2-72B-Orpo-v0.1
dfurman/Qwen2-72B-Orpo-v0.1 is a 72.7 billion parameter language model, fine-tuned from Qwen/Qwen2-72B-Instruct on the mlabonne/orpo-dpo-mix-40k dataset. It is designed as a generalist for diverse text generation tasks, including agentic workflows, roleplaying, reasoning, and multi-turn conversation, with a 32K-token context length. It aims for long-context coherence and broad applicability across generative AI use cases.
Model Overview
dfurman/Qwen2-72B-Orpo-v0.1 is a 72.7 billion parameter language model, fine-tuned from the Qwen/Qwen2-72B-Instruct base model. The fine-tuning used 1.5k rows from the mlabonne/orpo-dpo-mix-40k dataset, strengthening its capabilities as a generalist language model.
Key Capabilities
- Generalist Text Generation: Designed to handle a wide array of text generation tasks.
- Agentic Support: Suited to agent-style workflows and tool-driven interactions.
- Roleplaying: Capable of engaging in roleplaying scenarios.
- Reasoning: Demonstrates reasoning abilities for complex problem-solving.
- Multi-turn Conversations: Maintains coherence and context across extended dialogues.
- Long Context Coherence: Optimized to track and use information across its 32K-token context window.
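Multi-turn use typically follows the standard chat-messages convention (a list of `{"role": ..., "content": ...}` dicts). Below is a minimal sketch of how dialogue history accumulates and can be kept inside the 32K-token window; the helper names and the characters-per-token heuristic are illustrative assumptions, not part of the model's API.

```python
# Sketch of multi-turn history bookkeeping in the chat-messages format.
# The ~4-characters-per-token heuristic is an assumption for illustration;
# real budgeting should use the model's tokenizer.

def add_turn(history: list[dict], role: str, content: str) -> list[dict]:
    """Append one conversation turn; returns the same list for chaining."""
    history.append({"role": role, "content": content})
    return history

def approx_tokens(history: list[dict]) -> int:
    """Rough token estimate: ~4 characters per token (heuristic)."""
    return sum(len(m["content"]) for m in history) // 4

history: list[dict] = []
add_turn(history, "user", "What is ORPO?")
add_turn(history, "assistant", "ORPO is a preference-optimization method ...")
add_turn(history, "user", "How does it differ from DPO?")

# Keep the running dialogue within the 32K-token context window.
assert approx_tokens(history) < 32_000
```

In practice, older turns would be dropped or summarized once the estimate approaches the window size.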
Performance Highlights
Evaluations on the Open LLM Leaderboard show an average score of 43.32. Individual metric scores include:
- IFEval (0-shot): 78.80
- BBH (3-shot): 57.41
- MATH Lvl 5 (4-shot): 35.42
- MMLU-PRO (5-shot): 49.50
When to Use This Model
This model suits developers who need a versatile large language model for applications demanding strong general text generation, conversational AI, and reasoning. Its support for agentic and roleplaying use cases makes it particularly well suited to interactive, dynamic AI systems.
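Inference can follow the standard Hugging Face `transformers` chat workflow; the sketch below assumes that API, and the generation parameters (sampling, temperature, token budget) are illustrative choices, not values from the card. Note the 72.7B parameters require substantial GPU memory even in bfloat16.

```python
# Minimal inference sketch for dfurman/Qwen2-72B-Orpo-v0.1 using the
# standard transformers chat-template API. System prompt and generation
# settings are assumptions for illustration.

def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat-messages format used by apply_chat_template."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_prompt},
    ]

if __name__ == "__main__":
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "dfurman/Qwen2-72B-Orpo-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    # Render the chat messages into the model's prompt format.
    prompt = tokenizer.apply_chat_template(
        build_messages("Explain ORPO fine-tuning in two sentences."),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs, max_new_tokens=256, do_sample=True, temperature=0.7
    )
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                           skip_special_tokens=True))
```

Multi-GPU sharding via `device_map="auto"` (or a quantized loading path) is typically needed at this parameter count.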