FinaPolat/RAISED_Mistral-Nemo_DPO
FinaPolat/RAISED_Mistral-Nemo_DPO is a 12-billion-parameter Mistral-based language model developed by FinaPolat and fine-tuned with Direct Preference Optimization (DPO), starting from the FinaPolat/RAISED_Mistral-Nemo_SFT model. Fine-tuning used Unsloth and Hugging Face's TRL library to speed up training. The model targets general language generation tasks.
Model Overview
FinaPolat/RAISED_Mistral-Nemo_DPO is a 12-billion-parameter language model developed by FinaPolat. It uses the Mistral architecture and was fine-tuned from the FinaPolat/RAISED_Mistral-Nemo_SFT model.
Key Characteristics
- Architecture: Based on the Mistral model family.
- Parameter Count: 12 billion parameters.
- Training Method: Fine-tuned with Direct Preference Optimization (DPO).
- Efficiency: Fine-tuning used Unsloth and Hugging Face's TRL library, reported to make training about 2x faster.
- Context Length: Supports a context length of 32768 tokens.
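For readers unfamiliar with the training method named above, the following is a minimal sketch of the standard DPO objective for a single preference pair. It is not the training code used for this model: the function name `dpo_loss`, the argument names, and the `beta=0.1` value are illustrative assumptions, and the inputs are assumed to be summed log-probabilities of each response under the trained policy and under the frozen reference model (here, the SFT model would play that role).

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    All inputs are summed log-probabilities of a full response.
    beta=0.1 is an illustrative value, not one taken from the model card.
    """
    # How much more the policy prefers the chosen response than the reference does.
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    logits = beta * (policy_margin - ref_margin)

    # loss = -log(sigmoid(logits)), computed in a numerically stable way.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# The loss falls below log(2) once the policy favors the chosen response
# more strongly than the reference model does.
neutral = dpo_loss(-2.0, -2.0, -2.0, -2.0)
improved = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

When the policy and the reference agree, the loss equals log(2); training pushes it lower by widening the policy's margin for the chosen response relative to the reference.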
Intended Use
This model is suited to general language generation and understanding tasks. DPO fine-tuning aims to align its outputs with preferred responses, and its Mistral base and 12-billion-parameter size suggest it can handle complex reasoning and produce coherent text.
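As a rough sketch of how the model might be used for generation, the example below loads it with the Hugging Face `transformers` library and keeps the prompt plus output within the 32768-token context length. It assumes the repository hosts full weights loadable through `AutoModelForCausalLM` (not confirmed by this card), that `torch`, `transformers`, and `accelerate` are installed, and that enough GPU memory is available for a 12-billion-parameter model in bfloat16. The helper names `max_new_tokens_for` and `generate` are hypothetical.

```python
MODEL_ID = "FinaPolat/RAISED_Mistral-Nemo_DPO"
CONTEXT_LENGTH = 32768  # context length stated on the card


def max_new_tokens_for(prompt_tokens, requested, context_length=CONTEXT_LENGTH):
    """Clamp the requested output length so prompt + output fits the context window."""
    if prompt_tokens >= context_length:
        raise ValueError("prompt already fills the context window")
    return min(requested, context_length - prompt_tokens)


def generate(prompt, requested_new_tokens=256):
    """Hypothetical helper: load the model and generate a continuation."""
    # Imported lazily so the budget helper above works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: bf16 is acceptable for inference
        device_map="auto",           # requires the accelerate package
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    budget = max_new_tokens_for(inputs["input_ids"].shape[1], requested_new_tokens)
    output_ids = model.generate(**inputs, max_new_tokens=budget)
    # The decoded text includes the prompt followed by the continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Explain preference optimization in one paragraph.")` would download the weights on first use, so the loading step is best done once and reused in longer-running applications.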