dphn/dolphin-2.2-yi-34b-200k
Dolphin-2.2-Yi-34b-200k is a 34 billion parameter language model developed by dphn, based on the Yi architecture with a 200k context window. Fine-tuned with a 16k context, it emphasizes conversation and empathy, incorporating curated Samantha and WizardLM data for enhanced multi-turn dialogue. This uncensored model is highly compliant to user requests, making it suitable for applications requiring flexible and responsive AI interactions.
Loading preview...
Dolphin-2.2-Yi-34b-200k Overview
Dolphin-2.2-Yi-34b-200k is a 34 billion parameter language model, a fine-tuned iteration of the Yi base model, developed by dphn with sponsorship from Convai. While the base model supports a 200k context, this version was fine-tuned using a 16k context window. A key focus of this 2.2 release is enhancing conversation and empathy, achieved through an infusion of curated Samantha and WizardLM datasets, specifically targeting long, multi-turn interactions and personal advice.
Key Characteristics
- Enhanced Conversational Ability: Improved at multi-turn dialogues and providing empathetic responses.
- Uncensored and Compliant: The model is uncensored, with its dataset filtered to remove alignment and bias, making it highly compliant to user requests, including potentially unethical ones. Users are advised to implement their own alignment layers.
- Dataset Composition: Built upon an open-source implementation of Microsoft's Orca, with modifications for uncensoring, deduping, and quality. It also integrates Jon Durbin's Airoboros dataset for increased creativity.
- Training: Trained for 3 epochs over 3 days on 4x A100s using qLoRA and Axolotl.
- Prompt Format: Utilizes the ChatML prompt format.
Performance Highlights
Evaluations on the Open LLM Leaderboard show an average score of 46.67, with notable scores in HellaSwag (68.18) and Winogrande (64.56). MMLU (5-Shot) achieved 55.47, while GSM8k (5-Shot) scored 3.71.
Ideal Use Cases
This model is well-suited for applications requiring highly compliant, flexible AI assistants capable of engaging in extended, empathetic, and creative multi-turn conversations, particularly where custom alignment layers can be implemented by the developer.