Dolphin 2.9.1 Llama 3 70b Overview
Dolphin 2.9.1 is a 70-billion-parameter instruction-tuned model based on Meta's Llama 3 architecture, developed by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations. This iteration corrects behavioral issues identified in version 2.9, such as over-reliance on system prompts and insufficient generation length, by curating the training data to exclude the 'Systemchat' and 'Ultrachat' datasets.
Key Capabilities
- Instruction Following: Excels at understanding and executing complex instructions.
- Conversational Skills: Designed for engaging and coherent dialogue.
- Coding Abilities: Possesses strong capabilities in code generation and understanding.
- Agentic Abilities: Includes initial support for agent-like behaviors.
- Function Calling: Supports function calling for integration with external tools.
- Uncensored Nature: The model is uncensored; alignment and bias have been filtered from the training dataset to make it more compliant with user requests. Users are advised to implement their own alignment layer before deploying the model as a service.
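As a rough illustration of how the function-calling capability might be wired into an application, the sketch below parses a JSON tool call from model output and dispatches it to a local function. The tool name, the registry, and the `{"name": ..., "arguments": {...}}` shape are assumptions for illustration, not a format documented by the model card.

```python
import json

# Hypothetical tool registry; get_weather and the JSON call shape
# are placeholders, not part of the Dolphin model card.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def dispatch_tool_call(model_output: str) -> str:
    """Parse a JSON tool call emitted by the model and execute it."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Example: suppose the model replied with this JSON tool call.
result = dispatch_tool_call(
    '{"name": "get_weather", "arguments": {"city": "Berlin"}}'
)
print(result)  # Sunny in Berlin
```

A real integration would also handle unknown tool names and malformed JSON, and feed the tool result back to the model in a follow-up turn.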
Training Details
The model was fine-tuned full-weight on an 8x H100 node over 3 days with a 4k sequence length, training only the parameters selected by Laser Scanner, and it uses the ChatML prompt template format. The training data comprised ShareGPT-formatted datasets covering general conversation, coding, and agentic tasks, with the problematic datasets removed to improve model behavior.
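Since the model expects ChatML, a minimal sketch of building a single-turn prompt by hand looks like this (the system and user text are illustrative; in practice the tokenizer's chat template can produce the same string):

```python
def chatml_prompt(system: str, user: str) -> str:
    """Format a single-turn prompt in ChatML, the template Dolphin 2.9.1 uses."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = chatml_prompt(
    "You are Dolphin, a helpful AI assistant.",
    "Write a haiku about the sea.",
)
print(prompt)
```

The trailing `<|im_start|>assistant\n` leaves the prompt open for the model to complete the assistant turn.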
Good for
- Developers seeking a highly compliant and uncensored Llama 3-70b based model.
- Applications requiring advanced instruction following, conversation, and coding.
- Use cases where initial agentic abilities and function calling are beneficial.
- Environments where custom alignment layers can be implemented to manage the model's uncensored nature.
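One minimal sketch of such a custom alignment layer is a wrapper that screens prompts and completions before returning them. The blocklist and refusal message below are placeholders; a production deployment would use a proper moderation model or policy engine rather than keyword matching.

```python
# Placeholder policy; replace with a real moderation system.
BLOCKED_TOPICS = {"malware", "weapons"}
REFUSAL = "I can't help with that request."

def aligned_generate(prompt: str, generate) -> str:
    """Screen the prompt, call the underlying model, then screen the output."""
    if any(topic in prompt.lower() for topic in BLOCKED_TOPICS):
        return REFUSAL
    response = generate(prompt)
    if any(topic in response.lower() for topic in BLOCKED_TOPICS):
        return REFUSAL
    return response

# Example with a stand-in for the real model call:
print(aligned_generate("How do I write malware?", lambda p: "..."))
print(aligned_generate("Tell me a joke", lambda p: "A model walks into a bar."))
```

Because the model itself is uncensored, all such policy decisions live in this outer layer, which can be tightened or swapped without retraining.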