adamo1139/yi-34b-200k-rawrr-dpo-1
The adamo1139/yi-34b-200k-rawrr-dpo-1 is a 34 billion parameter language model based on the Yi-34B-200K architecture, fine-tuned using DPO on the rawrr_v1 dataset. This model is designed to be completion-focused with reduced refusals, serving as a raw base model for further fine-tuning rather than an instruction-following agent. It aims to mitigate contamination issues present in the base Yi-34B-200K model regarding instruct and refusal datasets.
Model Overview
adamo1139/yi-34b-200k-rawrr-dpo-1 is a 34 billion parameter model derived from Yi-34B-200K. It was fine-tuned with DPO (Direct Preference Optimization) on the rawrr_v1 dataset using QLoRA (ctx 200, lora_r 4, lora_alpha 8), and the resulting adapter was then merged into the base model.
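To make the training setup more concrete, here is a minimal sketch of the two quantities involved: the standard DPO loss for a single preference pair, and the LoRA scaling factor implied by lora_r 4 and lora_alpha 8 under the usual alpha / r convention. This is not the author's training code. The function names and the beta value of 0.1 are assumptions chosen for illustration; the model card does not state which beta was used.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed token log-probabilities of each response under the
    policy being trained and under the frozen reference model.
    beta=0.1 is an assumed value, not taken from the model card.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) written as a numerically stable softplus(-margin)
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def lora_scaling(lora_r, lora_alpha):
    """Multiplier applied to the low-rank update in standard LoRA."""
    return lora_alpha / lora_r


if __name__ == "__main__":
    # Policy identical to the reference: the loss equals log(2).
    print(dpo_loss(-10.0, -12.0, -10.0, -12.0))
    # Settings quoted in the card: lora_r 4, lora_alpha 8.
    print(lora_scaling(4, 8))
```

The loss falls below log(2) once the policy assigns relatively more probability to the chosen response than the reference model does, which is the signal DPO optimizes.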
Key Characteristics
- Completion-Focused: Unlike many instruction-tuned models, this variant is primarily designed for completion tasks, aiming to generate text rather than strictly follow instructions.
- Reduced Refusals: Training on the rawrr_v1 dataset specifically targets reducing model refusals, particularly for benign topics, making the model more amenable to generating content without unnecessary rejections.
- Base Model for Further Fine-tuning: This model is intended to serve as a robust, "raw" base for subsequent fine-tuning efforts, similar to the role of a raw LLaMA 65B model.
- Mitigates Contamination: The DPO training on the rawrr dataset addresses contamination issues observed in the base Yi-34B-200K model concerning instruct and refusal datasets, aiming for a cleaner, more raw output.
Intended Use Cases
- Foundation for Custom Fine-tuning: Ideal for developers looking for a powerful base model to fine-tune for specific, niche applications where a raw, completion-oriented model is preferred.
- Content Generation: Suitable for tasks requiring creative or extensive text generation where instruction following is secondary to fluent and less restrictive output.
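Because the model is completion-focused, a prompt is best written as the beginning of a text to be continued rather than as an instruction. The following is a minimal sketch of that usage, assuming the weights are available on the Hugging Face Hub under the repository ID shown and that `torch`, `transformers`, and `accelerate` are installed. The helper name `complete_text` and the sampling values are illustrative choices, not recommendations from the model author.

```python
MODEL_ID = "adamo1139/yi-34b-200k-rawrr-dpo-1"  # assumed Hub repository ID


def complete_text(prompt, max_new_tokens=128):
    """Continue `prompt` with sampled text (hypothetical helper)."""
    # Imported lazily so defining this function needs no heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # about 2 bytes per parameter
        device_map="auto",           # spreads layers over available devices
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.8,  # illustrative sampling settings
        top_p=0.95,
    )
    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call such as `complete_text("The lighthouse keeper opened the logbook and wrote:")` would return the sampled continuation. At 34 billion parameters, the bfloat16 weights alone take roughly 68 GB, so quantized loading may be needed on smaller hardware.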
License
This model is released under the Yi license and is restricted to non-commercial use.