jondurbin/bagel-34b-v0.2
jondurbin/bagel-34b-v0.2 is a 34 billion parameter experimental fine-tune of the Yi-34B-200K model, developed by jondurbin. This version represents the Supervised Fine-Tuning (SFT) phase, prior to DPO application, and is specifically optimized for creative writing and roleplay tasks. It features a 32,768 token context length and was trained using a unique multi-prompt format approach to enhance generalization across various instruction types.
Loading preview...
Overview of jondurbin/bagel-34b-v0.2
jondurbin/bagel-34b-v0.2 is an experimental 34 billion parameter language model, fine-tuned from the Yi-34B-200K base model. This particular release is the result of the Supervised Fine-Tuning (SFT) phase, intentionally released before DPO (Direct Preference Optimization) was applied. The developer notes that this SFT-only version is likely better suited for creative writing and roleplay scenarios, distinguishing it from models optimized purely for benchmark performance.
Key Capabilities & Training Insights
- Creative Generation Focus: Optimized for tasks like creative writing and roleplay due to its SFT-only training approach.
- Diverse Data Sources: Trained on a wide array of datasets including
ai2_arc,airoboros,apps,belebele,bluemoon(roleplay data),cinematika(RP-style data from movie scripts),lmsys_chat_1m,mathinstruct,mmlu,natural_instructions,python_alpaca,rosetta_code,slimorca,spider, andsynthia. Only training splits were used, with decontamination via approximate nearest neighbor search. - Multi-Prompt Formatting: Employs a unique training methodology where each instruction is converted into four different prompt formats (Alpaca, Vicuna, ChatML-ish, and Llama-2 chat). This strategy aims to improve the model's generalization across various instruction styles and reduce reliance on a single format.
- Context Length: Supports a substantial context window of 32,768 tokens.
Good For
- Creative Writing: Generating imaginative text, stories, and descriptive content.
- Roleplay Scenarios: Engaging in character-driven conversations and interactive narratives.
- Exploration of SFT-only Models: Users interested in the characteristics and performance of models before DPO application, particularly for less benchmark-driven applications.