jondurbin/bagel-dpo-1.1b-v0.3
jondurbin/bagel-dpo-1.1b-v0.3 is an experimental 1.1-billion-parameter language model developed by jondurbin, fine-tuned from TinyLlama with a 2048-token context length. This model explores the impact of diverse, multi-format instruction tuning using the 'bagel' framework, incorporating a wide array of datasets ranging from reasoning to roleplay. It is primarily an experiment in instruction tuning on a small base, with the developer noting it is "basically unusable" due to the base model's limitations.
Overview of jondurbin/bagel-dpo-1.1b-v0.3
This model, developed by jondurbin, is an experimental 1.1-billion-parameter language model fine-tuned from TinyLlama. It leverages the 'bagel' framework for instruction tuning, aiming to explore the effects of training on a highly diverse set of data sources and prompt formats. The developer explicitly states that the model is "basically unusable" due to the limitations of its TinyLlama base.
Key Characteristics & Training
- Diverse Data Sources: Trained on a wide array of datasets, including ai2_arc (reasoning), airoboros (synthetic instructions), apps (Python coding), belebele (multilingual reading comprehension), cinematika (RP-style data), lmsys_chat_1m (GPT-4 chats), mathinstruct, mmlu, and slimorca (GPT-4-verified chats). Only train splits were used, with decontamination via cosine similarity.
- Multi-Format Prompting: Each instruction was converted into four different prompt formats (Alpaca, Vicuna, ChatML-ish, Llama-2 chat) and used during training. This approach aimed to improve generalization across various instruction types.
- Context Length: Supports a context length of 2048 tokens.
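The multi-format prompting above can be sketched as follows. This is a minimal illustration, not the exact templates used in bagel's training: the template strings below are common community approximations of the Alpaca, Vicuna, ChatML, and Llama-2 chat formats, and the function names are hypothetical.

```python
# Hypothetical sketch: render one instruction in the four prompt formats
# named in the model card. Exact training templates may differ.

def alpaca(instruction: str) -> str:
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )

def vicuna(instruction: str) -> str:
    return (
        "A chat between a curious user and an artificial intelligence assistant.\n"
        f"USER: {instruction}\nASSISTANT: "
    )

def chatml_ish(instruction: str) -> str:
    return f"<|im_start|>user\n{instruction}<|im_end|>\n<|im_start|>assistant\n"

def llama2_chat(instruction: str) -> str:
    return f"[INST] {instruction} [/INST] "

def all_formats(instruction: str) -> list[str]:
    # Each instruction yields four training examples, one per prompt style,
    # so the model sees the same task under different surface formats.
    return [f(instruction) for f in (alpaca, vicuna, chatml_ish, llama2_chat)]

prompts = all_formats("Summarize the plot of Hamlet in two sentences.")
```

Duplicating each example across formats is what lets a single fine-tune respond sensibly regardless of which chat template the downstream user applies.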
Limitations and Usage Considerations
- Experimental Nature: The model is primarily for experimental purposes, with the developer noting its limited practical usability.
- Licensing: While the base TinyLlama model is Apache-2.0, the fine-tuning data includes content generated by OpenAI's GPT-4. Users should exercise caution and seek legal advice regarding commercial viability, as the implications of OpenAI's ToS on derivative models are complex and not definitively settled.