adamo1139/yi-34b-200k-rawrr-dpo-1

Text Generation | Concurrency Cost: 2 | Model Size: 34B | Quant: FP8 | Ctx Length: 32k | Published: Jan 15, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights

adamo1139/yi-34b-200k-rawrr-dpo-1 is a 34 billion parameter language model based on Yi-34B-200K and fine-tuned with DPO on the rawrr_v1 dataset. It is completion-focused with reduced refusals, and is meant as a raw base model for further fine-tuning rather than as an instruction-following agent. It aims to mitigate the contamination of the base Yi-34B-200K model by instruct and refusal datasets.


Model Overview

adamo1139/yi-34b-200k-rawrr-dpo-1 is a 34 billion parameter model derived from Yi-34B-200K. It was fine-tuned with DPO (Direct Preference Optimization) on the rawrr_v1 dataset. Training used QLoRA (ctx 200, lora_r 4, lora_alpha 8), after which the adapter was merged into the base model.
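To give a feel for what lora_r 4 and lora_alpha 8 imply, here is a minimal arithmetic sketch. It assumes the standard LoRA formulation, in which the low-rank update is scaled by lora_alpha / lora_r and each adapted weight matrix gains r * (d_in + d_out) trainable parameters. The 4096 x 4096 layer shape is purely illustrative and is not taken from the model card.

```python
def lora_scaling(lora_alpha: float, lora_r: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA: alpha / r."""
    return lora_alpha / lora_r


def lora_param_count(d_in: int, d_out: int, lora_r: int) -> int:
    """Trainable parameters added to one d_out x d_in weight matrix.

    LoRA factors the update as B @ A, with A of shape (r, d_in)
    and B of shape (d_out, r).
    """
    return lora_r * (d_in + d_out)


# Values stated in the model card.
LORA_R = 4
LORA_ALPHA = 8

scaling = lora_scaling(LORA_ALPHA, LORA_R)  # 2.0

# Hypothetical layer shape, chosen only for illustration.
D_IN, D_OUT = 4096, 4096
adapter_params = lora_param_count(D_IN, D_OUT, LORA_R)  # 32768
full_params = D_IN * D_OUT                              # 16777216

print(f"scaling = {scaling}")
print(f"adapter params = {adapter_params} "
      f"({adapter_params / full_params:.4%} of the full matrix)")
```

A rank this low keeps the adapter very small relative to the frozen weights, which is consistent with a light-touch fine-tune intended to shift behavior without retraining the model.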

Key Characteristics

  • Completion-Focused: Unlike many instruction-tuned models, this variant is primarily designed for completion tasks, aiming to generate text rather than strictly follow instructions.
  • Reduced Refusals: The rawrr_v1 dataset training specifically targets reducing model refusals, particularly for benign topics, making it more amenable to generating content without unnecessary rejections.
  • Base Model for Further Fine-tuning: This model is intended to serve as a "raw" base for subsequent fine-tuning, in the same role a raw LLaMA 65B model would play.
  • Mitigates Contamination: The DPO training on the rawrr dataset addresses contamination issues observed in the base Yi-34B-200K model concerning instruct and refusal datasets, aiming for a cleaner, more raw output.
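The way DPO can push a model away from refusals is easiest to see in the loss for a single preference pair. The sketch below implements the standard DPO objective in plain Python. It assumes a pair in which the "chosen" response is a plain continuation and the "rejected" response is a refusal; the exact layout of rawrr_v1, the log-probability values, and the beta of 0.1 are assumptions made for illustration, not details from the model card.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; the card does not state beta
) -> float:
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response
    under either the trained policy or the frozen reference model.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    return -log_sigmoid(beta * (chosen_margin - rejected_margin))


# Hypothetical log-probabilities for one pair.
# Before training the policy equals the reference, so the loss is log(2).
initial = dpo_loss(-10.0, -10.0, -10.0, -10.0)

# After training the policy favors the continuation over the refusal.
improved = dpo_loss(-8.0, -12.0, -10.0, -10.0)

print(f"initial loss  = {initial:.4f}")
print(f"improved loss = {improved:.4f}")
```

The loss falls only when the policy raises the continuation's likelihood relative to the refusal's, measured against the reference model, so training shifts preference between the two responses rather than teaching new instruction-following behavior.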

Intended Use Cases

  • Foundation for Custom Fine-tuning: Ideal for developers looking for a powerful base model to fine-tune for specific, niche applications where a raw, completion-oriented model is preferred.
  • Content Generation: Suitable for tasks requiring creative or extensive text generation where instruction following is secondary to fluent and less restrictive output.

License

This model operates under the Yi-license and is restricted to non-commercial use only.