Schreiber-mistral-nemo-12B Overview
Schreiber-mistral-nemo-12B is a 12-billion-parameter language model developed by nbeerbower, built on the nbeerbower/mistral-nemo-kartoffel-12B base. It was fine-tuned with ORPO (Odds Ratio Preference Optimization) for three epochs on a single RTX A6000 GPU, using a collection of DPO-style (Direct Preference Optimization) preference datasets drawn primarily from literary works and synthetic fiction.
Key Capabilities
- Creative Text Generation: Optimized for producing high-quality, coherent, and engaging fictional narratives.
- Literary Style Adaptation: Benefits from fine-tuning on diverse literary datasets, including various Gutenberg collections and synthetic fiction, enabling it to generate text with nuanced stylistic elements.
- Preference-Based Learning: Leverages ORPO tuning to align model outputs with preferred stylistic and content characteristics, enhancing the quality of generated text.
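For intuition about what ORPO tuning optimizes, the odds-ratio penalty from the published ORPO objective can be sketched in pure Python. This is a minimal illustration, not the model's actual training code; the function and variable names are ours, and `beta` matches the 0.1 value reported below.

```python
import math

def orpo_odds_ratio_loss(logp_chosen: float, logp_rejected: float,
                         beta: float = 0.1) -> float:
    """Odds-ratio penalty term of the ORPO objective (illustrative sketch).

    logp_chosen / logp_rejected are the (average token) log-probabilities
    the policy model assigns to the preferred and rejected completions.
    """
    def log_odds(logp: float) -> float:
        # odds(p) = p / (1 - p); computed in log space for stability
        p = math.exp(logp)
        return logp - math.log(1.0 - p)

    log_odds_ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
    # -log sigmoid(x) = log(1 + exp(-x)); beta scales the penalty's weight
    return beta * math.log1p(math.exp(-log_odds_ratio))
```

The penalty shrinks as the model assigns higher odds to the chosen completion than to the rejected one, which is how ORPO steers outputs toward the preferred style without a separate reference model.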
Training Details
Fine-tuning used a QLoRA configuration for memory-efficient training, with an ORPO setup using a learning rate of 8e-6, a cosine learning-rate scheduler, and a beta of 0.1. Training was run with a max_length of 4096 tokens and a max_prompt_length of 1024 tokens, allowing substantial context handling during preference optimization.
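The cosine schedule mentioned above can be sketched as follows, seeded with the reported 8e-6 peak learning rate. This is a generic illustration of cosine decay as commonly implemented by Hugging Face trainers; the warmup behavior is an assumption, since the card does not state a warmup value.

```python
import math

def cosine_lr(step: int, total_steps: int, peak_lr: float = 8e-6,
              warmup_steps: int = 0) -> float:
    """Cosine learning-rate decay from peak_lr down to 0 (illustrative sketch).

    peak_lr matches the 8e-6 reported above; warmup_steps defaults to 0
    because no warmup value is stated in the card.
    """
    if warmup_steps and step < warmup_steps:
        return peak_lr * step / warmup_steps  # linear warmup (assumed form)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under this schedule the learning rate starts at 8e-6, halves at the midpoint of training, and decays smoothly to zero by the final step.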