dphn/dolphin-2.9.3-Yi-1.5-34B-32k

TEXT GENERATIONConcurrency Cost:2Model Size:34BQuant:FP8Ctx Length:32kPublished:Jun 23, 2024License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

Dolphin 2.9.3 Yi 1.5 34B 32k is a 34 billion parameter instruction-tuned language model developed by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations. Based on the Yi-1.5-34B-32k architecture, it features a 32k context length and is fine-tuned for instruction following, conversational tasks, and coding skills. This model also incorporates initial agentic abilities and supports function calling, making it suitable for compliant, uncensored applications requiring diverse capabilities.

Loading preview...

Dolphin 2.9.3 Yi 1.5 34B 32k Overview

Dolphin 2.9.3 is a 34 billion parameter instruction-tuned language model, a collaborative effort by Eric Hartford, Lucas Atkins, Fernando Fernandes, and Cognitive Computations. It is built upon the 01-ai/Yi-1.5-34B-32k base model, inheriting its 32k context length, though fine-tuning was conducted with an 8192 sequence length. The model utilizes the ChatML prompt template format.

Key Capabilities

  • Instruction Following: Excels in understanding and executing diverse instructions.
  • Conversational AI: Designed for engaging and coherent dialogue.
  • Coding Skills: Possesses capabilities for code generation and understanding.
  • Agentic Abilities: Includes initial support for agent-like behaviors.
  • Function Calling: Supports integration with external tools via function calls.
  • Uncensored Output: The model is uncensored, with its dataset filtered to remove alignment and bias, offering high compliance to user requests. Users are advised to implement their own alignment layers for responsible deployment.

Training Details

The model was fine-tuned using QLoRA with axolotl on a diverse set of datasets, including ShareGPT variations focused on system chat, multilingual content, and coding (coder-translate, coder-codegen, Code-Feedback). Other datasets like Orca-Math and various agent/tool-related datasets (agent_instruct_react, toolbench) were also used, indicating a broad training scope for its varied capabilities.