adamo1139/Yi-34B-200K-AEZAKMI-RAW-2901
The adamo1139/Yi-34B-200K-AEZAKMI-RAW-2901 is an experimental 34 billion parameter Yi-34B base model, fine-tuned by adamo1139, designed for chat applications. It leverages a 200K context window and is specifically optimized to reduce typical RLHF-induced refusals and generic language, aiming for a more uncensored and 'cozy' chatbot experience. This model is distinct for its DPO and SFT training on custom datasets (RAWrr v1 and AEZAKMI v2) to achieve a refusal-free conversational style.
Loading preview...
Model Overview
The adamo1139/Yi-34B-200K-AEZAKMI-RAW-2901 is an experimental 34 billion parameter chat model built upon the Yi-34B 200K base model. It underwent a two-stage fine-tuning process: initial DPO (Direct Preference Optimization) on the RAWrr v1 dataset, followed by SFT (Supervised Fine-Tuning) on the AEZAKMI v2 dataset. This training methodology, utilizing unsloth, aims to produce a chatbot with significantly reduced refusal behavior and less of the generic, 'RLHFed' language often found in other models.
Key Characteristics
- Refusal-Free Chat: Specifically trained to minimize refusals and avoid phrases like "It's important to remember!", offering a more direct and uncensored conversational experience.
- Custom Fine-tuning: Utilizes unique RAWrr v1 and AEZAKMI v2 datasets for DPO and SFT, respectively, differentiating its conversational style.
- ChatML Format: Optimized for the ChatML prompt format, with flexibility for different system messages.
- Experimental Nature: The model is noted as experimental and not final, with ongoing development to address known issues.
Intended Use Cases
- Uncensored Chatbots: Ideal for applications requiring a conversational agent that avoids typical AI safety guardrails and generic responses.
- Cozy Chat Experiences: Aims to provide a more natural and less constrained interaction, suitable for personal or informal chatbot projects.
Limitations and Recommendations
- Not for Math or Riddles: The model is not designed for complex mathematical reasoning or riddle-solving.
- Repetition Penalty: Users are advised to set a repetition penalty around 1.05 to mitigate potential repetition.
- Contextual Refusal Bias: The strongest anti-refusal bias is observed at the beginning of a conversation (0 context), though it persists throughout.
- Licensing: Use is subject to the Apache-2.0 license, with a note regarding potential commercial use limitations due to the
no-robotsdataset used in RAWrr_v1.