adamo1139/yi-34b-200k-rawrr-dpo-2

Text Generation | Concurrency Cost: 2 | Model Size: 34B | Quant: FP8 | Ctx Length: 32k | Published: Jan 25, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

adamo1139/yi-34b-200k-rawrr-dpo-2 is a 34-billion-parameter Yi-34B-200K model fine-tuned with DPO on the rawrr_v1 dataset, with a 32,768-token context length. It is designed to refuse far less often and to produce completion-style output, which makes it less instruction-centric than its predecessor. It is meant as a base model for further fine-tuning, aiming for uncensored output with less "GPT-slop".


Model Overview

adamo1139/yi-34b-200k-rawrr-dpo-2 is a 34-billion-parameter language model based on Yi-34B-200K. It was fine-tuned with DPO (Direct Preference Optimization) on the rawrr_v1 dataset, using QLoRA with a training context length of 500, lora_r = 16, and lora_alpha = 16. The resulting adapter was then applied to the base model.
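As a rough illustration of what applying a LoRA adapter to a base model means, the sketch below folds a low-rank update into a weight matrix using the standard LoRA formula W' = W + (lora_alpha / lora_r) * B A. It is a toy in pure Python, not the author's actual merge procedure; the helper names (lora_scale, matmul, merge_lora) and the tiny matrices are hypothetical. With lora_r = 16 and lora_alpha = 16 as stated above, the scaling factor works out to 1.0.

```python
def lora_scale(lora_r, lora_alpha):
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return lora_alpha / lora_r


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, lora_r, lora_alpha):
    """Return W + (lora_alpha / lora_r) * (B @ A).

    weight: out x in matrix, lora_a: r x in matrix, lora_b: out x r matrix.
    """
    scale = lora_scale(lora_r, lora_alpha)
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy rank-1 example (hypothetical values); alpha / r = 1.0, the same ratio
# as lora_r = 16, lora_alpha = 16.
base_weight = [[1.0, 0.0], [0.0, 1.0]]
adapter_a = [[0.5, -0.5]]      # 1 x 2
adapter_b = [[2.0], [4.0]]     # 2 x 1
merged = merge_lora(base_weight, adapter_a, adapter_b, lora_r=1, lora_alpha=1)
print(lora_scale(16, 16))
print(merged)
```

Because lora_alpha equals lora_r here, the low-rank update is added to the base weights unscaled; a different ratio would shrink or amplify the adapter's contribution.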

Key Differentiators

  • Reduced Refusal: Compared to yi-34b-200k-rawrr-dpo-1, this model resists refusals and instruct-style responses much more strongly, especially on benign topics.
  • Completion-Focused: Unlike many instruction-tuned models, this version is completion-focused rather than instruction-focused. The aim is to mitigate the contamination from instruct and refusal datasets that affected the base Yi-34B-200K.
  • "Raw" Output: Fine-tuning on the rawrr_v1 dataset is intended to make the model more "raw," giving output with less "GPT-slop" and uncensored responses even with no prior context.
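The preference tuning behind these properties can be sketched with the standard DPO loss for a single (chosen, rejected) pair. This is a minimal illustration, not the training code used for this model: the function name dpo_loss, the value beta = 0.1, and the log-probabilities in the example are all assumptions. The layout of rawrr_v1 is not described here, so treating "chosen" as a raw completion and "rejected" as a refusal is also only an assumption.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair.

    Each argument is the summed log-probability of a full response under
    the policy (model being trained) or the frozen reference model.
    beta = 0.1 is an assumed value, not taken from the model card.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin).
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Hypothetical log-probabilities. Before training the policy matches the
# reference, so the margin is 0 and the loss is log(2).
loss_before = dpo_loss(-12.0, -12.0, -12.0, -12.0)
# After training the policy favors the chosen response and disfavors the
# rejected one, so the loss drops.
loss_after = dpo_loss(-10.0, -20.0, -12.0, -12.0)
print(loss_before, loss_after)
```

Minimizing this loss raises the likelihood of the chosen response relative to the rejected one, measured against the reference model, which is how a preference dataset can steer a model away from one style of answer and toward another.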

Intended Use

This model is primarily intended as a base model for further fine-tuning. Developers who want custom instruction-tuned models with fewer refusals and a more direct, uncensored output style can use it as a starting point. It is likened to a "raw" LLaMA 65B in its design philosophy: a foundation to build on rather than a ready-to-use instruction follower.