jondurbin/bagel-34b-v0.2

TEXT GENERATIONConcurrency Cost:2Model Size:34BQuant:FP8Ctx Length:32kPublished:Dec 31, 2023License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

jondurbin/bagel-34b-v0.2 is a 34 billion parameter experimental fine-tune of the Yi-34B-200K model, developed by jondurbin. This version represents the Supervised Fine-Tuning (SFT) phase, prior to DPO application, and is specifically optimized for creative writing and roleplay tasks. It features a 32,768 token context length and was trained using a unique multi-prompt format approach to enhance generalization across various instruction types.

Loading preview...

Overview of jondurbin/bagel-34b-v0.2

jondurbin/bagel-34b-v0.2 is an experimental 34 billion parameter language model, fine-tuned from the Yi-34B-200K base model. This particular release is the result of the Supervised Fine-Tuning (SFT) phase, intentionally released before DPO (Direct Preference Optimization) was applied. The developer notes that this SFT-only version is likely better suited for creative writing and roleplay scenarios, distinguishing it from models optimized purely for benchmark performance.

Key Capabilities & Training Insights

  • Creative Generation Focus: Optimized for tasks like creative writing and roleplay due to its SFT-only training approach.
  • Diverse Data Sources: Trained on a wide array of datasets including ai2_arc, airoboros, apps, belebele, bluemoon (roleplay data), cinematika (RP-style data from movie scripts), lmsys_chat_1m, mathinstruct, mmlu, natural_instructions, python_alpaca, rosetta_code, slimorca, spider, and synthia. Only training splits were used, with decontamination via approximate nearest neighbor search.
  • Multi-Prompt Formatting: Employs a unique training methodology where each instruction is converted into four different prompt formats (Alpaca, Vicuna, ChatML-ish, and Llama-2 chat). This strategy aims to improve the model's generalization across various instruction styles and reduce reliance on a single format.
  • Context Length: Supports a substantial context window of 32,768 tokens.

Good For

  • Creative Writing: Generating imaginative text, stories, and descriptive content.
  • Roleplay Scenarios: Engaging in character-driven conversations and interactive narratives.
  • Exploration of SFT-only Models: Users interested in the characteristics and performance of models before DPO application, particularly for less benchmark-driven applications.