Dolphin 2.9.2 Qwen2 72B Overview
Dolphin 2.9.2 Qwen2 72B is a 72.7-billion-parameter language model developed by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations. Built on the Qwen2-72B base, it was fine-tuned full-weight on parameters identified by Laser Scanner and uses the ChatML prompt template. The base model supports a 128k-token context; fine-tuning was performed at an 8k sequence length.
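The ChatML template wraps each turn in `<|im_start|>` / `<|im_end|>` markers. A minimal prompt follows the standard layout below; the system message is illustrative:

```
<|im_start|>system
You are Dolphin, a helpful AI assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```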
Key Capabilities
- Instruction Following: Excels in a wide range of instruction-based tasks.
- Conversational AI: Designed for engaging and coherent dialogue.
- Coding Skills: Strong coding ability, trained on datasets covering code generation, translation, and feedback.
- Agentic Abilities & Function Calling: Includes initial capabilities for agentic workflows and supports function calling.
- Uncensored & Compliant: Training datasets were filtered to remove alignment and bias, so the model is uncensored and highly compliant with user requests, which makes it adaptable to a wide range of applications. Users are advised to implement their own alignment layer (for example, a system prompt) before exposing it as a service; see the inference sketch after this list.
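As a concrete illustration, here is a minimal inference sketch using Hugging Face `transformers`. The repository id `cognitivecomputations/dolphin-2.9.2-qwen2-72b` and the system prompt are assumptions, not part of this card; the system message doubles as a simple alignment layer:

```python
# Minimal inference sketch (assumes the repo id below and enough
# GPU memory to load a 72B model in bfloat16).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cognitivecomputations/dolphin-2.9.2-qwen2-72b"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard across available GPUs
)

# The system message acts as a lightweight alignment layer,
# since the model itself is uncensored.
messages = [
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant. Refuse harmful requests."},
    {"role": "user", "content": "Write a Python function that reverses a linked list."},
]

# apply_chat_template renders the ChatML prompt shown earlier.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```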
Training Details
The model was trained on a diverse set of ShareGPT-formatted datasets, including those focused on general conversations, coding (codegen, translation, feedback), mathematical reasoning (Orca-Math), and tool use (Toolbench instruction, negative, react, tflan). This comprehensive training regimen contributes to its broad skill set.
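For reference, ShareGPT-formatted data stores each conversation as a list of turns tagged with their speaker. A representative record, with contents invented purely for illustration, looks like this:

```json
{
  "conversations": [
    {"from": "system", "value": "You are a helpful coding assistant."},
    {"from": "human", "value": "Translate this function from Python to Rust."},
    {"from": "gpt", "value": "Here is the equivalent Rust implementation: ..."}
  ]
}
```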
Performance Metrics
Evaluations on the Open LLM Leaderboard show an average score of 32.00, with specific results including:
- IFEval (0-Shot): 40.38
- BBH (3-Shot): 47.70
- MATH Lvl 5 (4-Shot): 21.37
- MMLU-PRO (5-Shot): 49.52
Licensing
Dolphin 2.9.2 is released under the tongyi-qianwen license, which permits commercial use in accordance with its terms. Note that the model was trained on data generated by GPT-4 and other models.