TheBloke/airoboros-13B-HF

Text Generation · Model Size: 13B · Quant: FP8 · Context Length: 4k · Published: May 23, 2023 · License: other · Architecture: Transformer

TheBloke/airoboros-13B-HF is a 13 billion parameter LLaMA-based language model fine-tuned by Jon Durbin using completely synthetic training data. This model is optimized for general instruction following and demonstrates strong performance in various tasks, including math, coding, and question-answering, as evaluated by GPT-4 judging. It is provided in a 16-bit floating-point (fp16) format for efficient storage and usage.


Overview

The TheBloke/airoboros-13B-HF model is a 13 billion parameter LLaMA-based language model, fine-tuned by Jon Durbin. Its distinguishing feature is the use of entirely synthetic training data, generated using a 'jailbreak' prompt with ChatGPT to create a diverse dataset, including content that might typically be censored. This approach aimed to test the capabilities of ChatGPT when unfiltered.

Key Capabilities

  • Strong Instruction Following: The model is fine-tuned for general instruction adherence.
  • Competitive Performance: Evaluated against other models using GPT-4 judging, airoboros-13B achieved a GPT-3.5 adjusted score of 98.087, performing comparably to GPT-3.5 and outperforming several other 13B and 30B models in the evaluation set.
  • Synthetic Data Training: The training data was generated synthetically, with additional passes to improve performance in areas like math, extrapolation, closed question-answering (addressing hallucination), and coding.

Training Details

The model was fine-tuned using the FastChat module on 8x NVIDIA A100 GPUs over approximately 40 hours. The training process involved an initial set of synthetic instructions, followed by a second fine-tuning pass specifically targeting improvements in math, coding, and question-answering. The prompt format is compatible with FastChat/Vicuna, using a `USER: [prompt] </s> ASSISTANT:` structure.
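As a minimal sketch of the FastChat/Vicuna-compatible prompt structure described above, the helper below assembles a prompt string, terminating completed assistant turns with the `</s>` end-of-sequence token. The function name and multi-turn handling are illustrative assumptions, not part of the model card.

```python
def build_prompt(user_message: str, history=None) -> str:
    """Format a conversation into a FastChat/Vicuna-style prompt.

    `history` is an optional list of (user, assistant) turns; each
    completed assistant reply is terminated with the </s> token, as in
    the USER: [prompt] </s> ASSISTANT: structure from the model card.
    """
    parts = []
    for user_turn, assistant_turn in (history or []):
        parts.append(f"USER: {user_turn} ASSISTANT: {assistant_turn}</s>")
    # The final turn ends at "ASSISTANT:" so the model generates the reply.
    parts.append(f"USER: {user_message} ASSISTANT:")
    return " ".join(parts)

prompt = build_prompt("What is 2 + 2?")
```

The resulting string can then be passed to any inference backend (e.g. a `transformers` text-generation pipeline) that serves the fp16 weights.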