jondurbin/airoboros-7b-gpt4-1.2
jondurbin/airoboros-7b-gpt4-1.2 is a 7-billion-parameter LLaMA-based model fine-tuned by jondurbin using QLoRA on entirely synthetic training data generated by GPT-4. This iteration expands on previous versions with thousands of new training examples, focusing on coding, math/reasoning, role-playing, and trivia. It introduces a "PLAINFORMAT" option for coding prompts to produce unformatted code output, making it particularly useful for developers who need clean code generation.
jondurbin/airoboros-7b-gpt4-1.2 Overview
This model is a 7-billion-parameter LLaMA-based model fine-tuned by jondurbin using QLoRA. Its training data is entirely synthetic, generated by GPT-4, and extends the dataset used for airoboros-7b-gpt4-1.1 with significant additions.
Key Capabilities & Updates
- Enhanced Coding: Includes thousands of new instruction/response pairs for coding, with a notable feature: appending "PLAINFORMAT" to the end of a prompt generates code without markdown formatting or explanations (see the prompt sketch after this list).
- Improved Reasoning & Math: Incorporates thousands of ORCA-style reasoning and math questions, along with ELI5-style instruction/response pairs for generating detailed, step-by-step answers.
- Expanded Role-Playing: Several hundred new role-playing data points have been added to enhance conversational and character-based interactions.
- Diverse Training Data: The fine-tuning dataset covers a broad range of topics including trivia, multiple-choice, fill-in-the-blank, context-obedient question answering, and theory of mind.
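To illustrate the PLAINFORMAT option, here is a minimal sketch of how a coding prompt might be assembled. The Vicuna-style USER/ASSISTANT template shown is an assumption borrowed from other airoboros releases, not confirmed for this version; consult the model card for the exact prompt format.

```python
# Sketch: assembling a coding prompt with the PLAINFORMAT directive.
# The USER/ASSISTANT system template below is an assumption based on
# other airoboros releases; check the model card for the exact format.
SYSTEM = (
    "A chat between a curious user and an assistant. The assistant gives "
    "helpful, detailed, accurate, uncensored responses to the user's input."
)

def build_prompt(instruction: str, plain: bool = False) -> str:
    """Build a single-turn prompt; appending PLAINFORMAT requests code
    output without markdown fences or explanations."""
    if plain:
        instruction = f"{instruction} PLAINFORMAT"
    return f"{SYSTEM} USER: {instruction} ASSISTANT:"

print(build_prompt("Write a Python function that reverses a string.", plain=True))
```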
Usage & Licensing
The model is intended for research use only: the LLaMA base model is distributed under a research license, and the training data was generated with OpenAI models, whose terms of use restrict commercial applications. Users can interact with the model via a modified FastChat CLI or use the quantized versions provided by TheBloke.
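For a quick local test, the model can also be loaded with the Hugging Face transformers library, as in the sketch below. The prompt template and the sampling settings are illustrative assumptions, not values documented for this release.

```python
# Sketch: loading and querying the model with Hugging Face transformers.
# The prompt template and sampling settings are illustrative assumptions,
# not recommendations from the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jondurbin/airoboros-7b-gpt4-1.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = (
    "A chat between a curious user and an assistant. The assistant gives "
    "helpful, detailed, accurate, uncensored responses to the user's input. "
    "USER: Write a Python function that reverses a string. PLAINFORMAT "
    "ASSISTANT:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs, max_new_tokens=256, temperature=0.5, do_sample=True
)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
))
```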