jondurbin/nontoxic-bagel-34b-v0.2

TEXT GENERATIONConcurrency Cost:2Model Size:34BQuant:FP8Ctx Length:32kPublished:Dec 31, 2023License:yi-licenseArchitecture:Transformer0.0K Cold

jondurbin/nontoxic-bagel-34b-v0.2 is an experimental 34 billion parameter language model fine-tuned from Yi-34B-200K using the 'bagel' framework. This version incorporates a subset of DPO and is designed to be more censored compared to its less censored counterpart, bagel-dpo-34b-v0.2. It leverages a diverse range of supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) datasets, including those for reasoning, coding, reading comprehension, and chat, making it suitable for general-purpose conversational AI with a focus on controlled output.

Loading preview...

Overview

jondurbin/nontoxic-bagel-34b-v0.2 is a 34 billion parameter experimental language model, fine-tuned from the Yi-34B-200K base model. It utilizes the 'bagel' framework for its fine-tuning process, incorporating a diverse set of supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) datasets. This particular version is noted for being more censored compared to its counterpart, bagel-dpo-34b-v0.2.

Key Capabilities

  • Broad Task Proficiency: Trained on a wide array of datasets covering reasoning (ai2_arc, mmlu), coding (apps, python_alpaca, rosetta_code), reading comprehension (belebele, drop, squad_v2), and general instruction following (airoboros, natural_instructions).
  • Diverse Data Integration: Incorporates data from various sources like lmsys_chat_1m (GPT-4 items), mathinstruct, and roleplay-oriented datasets (bluemoon, cinematika, pippa) to enhance versatility.
  • Preference Alignment: Utilizes DPO with datasets such as helpsteer (human-annotated correctness), orca_dpo_pairs, and ultrafeedback to align model responses with desired preferences, specifically focusing on a more censored output profile.
  • Flexible Prompt Formatting: Supports multiple prompt formats including Alpaca, Vicuna, ChatML (modified), and Llama-2 chat, with each instruction converted into every format during training to improve generalization.

Good For

  • Developers seeking a 34B parameter model for general conversational AI tasks.
  • Applications requiring a model with a more controlled and censored output behavior.
  • Experimentation with models fine-tuned using a comprehensive and varied dataset approach.