jondurbin/bagel-dpo-1.1b-v0.3

Text Generation · Concurrency cost: 1 · Model size: 1.1B · Quant: BF16 · Context length: 2K · License: apache-2.0 · Architecture: Transformer · Open weights

jondurbin/bagel-dpo-1.1b-v0.3 is an experimental 1.1 billion parameter language model developed by jondurbin, fine-tuned from TinyLlama with a 2048 token context length. This model explores the impact of diverse, multi-format instruction tuning using the 'bagel' framework, incorporating a wide array of datasets from reasoning to roleplay. It is primarily an experimental model for understanding instruction tuning on a small base, with the developer noting it is "basically unusable" due to the base model's limitations.


Overview of jondurbin/bagel-dpo-1.1b-v0.3

This model, developed by jondurbin, is an experimental 1.1 billion parameter language model fine-tuned from TinyLlama. It leverages the 'bagel' framework for instruction tuning, aiming to explore the effects of training on a highly diverse set of data sources and prompt formats. The developer explicitly states that the model is "basically unusable" due to the limitations of its TinyLlama base.

Key Characteristics & Training

  • Diverse Data Sources: Trained on a wide array of datasets including ai2_arc (reasoning), airoboros (synthetic instructions), apps (Python coding), belebele (multilingual reading comprehension), cinematika (RP-style data), lmsys_chat_1m (GPT-4 chats), mathinstruct, mmlu, and slimorca (GPT-4 verified chats). Only train splits were used, with decontamination via cosine similarity.
  • Multi-Format Prompting: Each instruction was converted into four different prompt formats (Alpaca, Vicuna, ChatML-ish, Llama-2 chat), all of which were used during training. This approach aimed to improve generalization across various instruction styles.
  • Context Length: Supports a context length of 2048 tokens.
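
The decontamination step mentioned above can be sketched with a simple bag-of-words cosine similarity. The actual bagel pipeline's vectorization and cutoff are not documented here, so the tokenization and the `0.95` threshold below are illustrative assumptions, not the model's actual settings:

```python
from collections import Counter
import math

def cosine_sim(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two texts (illustrative)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def decontaminate(train: list[str], test: list[str], threshold: float = 0.95) -> list[str]:
    """Drop training examples too similar to any held-out benchmark example."""
    return [t for t in train
            if all(cosine_sim(t, x) < threshold for x in test)]

train = [
    "What is the capital of France? Paris.",
    "Explain photosynthesis in one sentence.",
]
benchmark = ["What is the capital of France? Paris."]
# The exact duplicate of the benchmark item is filtered out of the train set.
clean = decontaminate(train, benchmark)
```

In practice, embedding-based vectors would catch paraphrased overlap that raw bag-of-words misses, which is why the cutoff and representation matter.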

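The four prompt conversions might look roughly as follows. The exact template strings used in training are defined by the bagel framework, so these functions are illustrative approximations rather than the canonical templates:

```python
# Illustrative approximations of the four prompt formats used in training;
# consult the bagel repository for the canonical template strings.

def alpaca(instruction: str) -> str:
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:\n"
    )

def vicuna(instruction: str) -> str:
    return f"USER: {instruction}\nASSISTANT: "

def chatml_ish(instruction: str) -> str:
    return f"<|im_start|>user\n{instruction}<|im_end|>\n<|im_start|>assistant\n"

def llama2_chat(instruction: str) -> str:
    return f"[INST] {instruction} [/INST]"

FORMATS = [alpaca, vicuna, chatml_ish, llama2_chat]

def expand(instruction: str) -> list[str]:
    """One source instruction yields four training prompts, one per format."""
    return [fmt(instruction) for fmt in FORMATS]

prompts = expand("Summarize the plot of Hamlet.")
```

Training on all four renderings of each instruction is what lets the model respond sensibly regardless of which prompt style a user picks at inference time.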
Limitations and Usage Considerations

  • Experimental Nature: The model is primarily for experimental purposes, with the developer noting its limited practical usability.
  • Licensing: While the base TinyLlama model is Apache-2.0, the fine-tuning data includes content generated by OpenAI's GPT-4. Users should exercise caution and seek legal advice regarding commercial viability, as the implications of OpenAI's ToS on derivative models are complex and not definitively settled.