Overview
lars1234/Mistral-Small-24B-Instruct-2501-writer is a 24-billion-parameter language model fine-tuned from mistralai/Mistral-Small-24B-Instruct-2501 specifically for creative writing. It uses Direct Preference Optimization (DPO) to improve its narrative generation capabilities.
Key Capabilities & Performance
This model significantly improves upon its base model in various creative writing aspects, as evidenced by evaluations on the lars1234/story_writing_benchmark dataset. It shows notable gains in:
- Character Motivation: 49.8% vs. 44.6% (base model)
- Sentence Variety: 64.4% vs. 57.7%
- Avoiding Clichés: 33.3% vs. 24.6%
- Natural Dialogue: 51.9% vs. 42.9%
- Reader Interest: 63.1% vs. 54.1%
Overall, it averages 56.5% on the benchmark versus 49.3% for the base Mistral model, improving on every metric. The fine-tuning process involved building a DPO dataset from the benchmark, with preference pairs based on language correctness and quality criteria (grammar, avoiding tropes, character depth, reader interest).
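The pair-construction idea described above can be sketched as follows. This is an illustrative outline, not the author's actual pipeline: the field names (`grammar_ok`, `quality_score`), the hard grammar filter, and the best-vs-worst pairing rule are all assumptions.

```python
# Sketch: build DPO preference pairs from scored story completions.
# Field names, the grammar filter, and the pairing rule are illustrative
# assumptions, not the exact pipeline used for this model.

def build_dpo_pairs(samples):
    """Group completions by prompt and pair the highest- and lowest-scoring
    grammatically correct responses as (chosen, rejected)."""
    by_prompt = {}
    for s in samples:
        by_prompt.setdefault(s["prompt"], []).append(s)

    pairs = []
    for prompt, completions in by_prompt.items():
        # Language correctness acts as a hard filter before quality ranking.
        valid = [c for c in completions if c["grammar_ok"]]
        if len(valid) < 2:
            continue
        ranked = sorted(valid, key=lambda c: c["quality_score"])
        pairs.append({
            "prompt": prompt,
            "chosen": ranked[-1]["text"],   # best-rated completion
            "rejected": ranked[0]["text"],  # worst-rated completion
        })
    return pairs
```

Each resulting record has the `prompt`/`chosen`/`rejected` shape that DPO trainers typically expect.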
Training Methodology
The model was fine-tuned using Axolotl with LoRA (r=16, alpha=32), a DPO beta of 0.1, and a learning rate of 1e-4. Training ran for one epoch with 4-bit quantization and a sequence length of 2048. Inference parameters were also tuned; a temperature of 0.75 yielded the largest quality improvement.
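The stated hyperparameters could be expressed in an Axolotl config along these lines. Only the values named above (LoRA r/alpha, beta, learning rate, epochs, 4-bit, sequence length) come from this card; the remaining keys, and the exact key name for the DPO beta, are assumptions to be checked against the Axolotl documentation for your version.

```yaml
# Sketch of an Axolotl DPO fine-tuning config (not the author's file).
# Values marked "from card" are stated above; everything else is illustrative.
base_model: mistralai/Mistral-Small-24B-Instruct-2501
rl: dpo                 # DPO training mode; beta of 0.1 from card
adapter: lora
lora_r: 16              # from card
lora_alpha: 32          # from card
load_in_4bit: true      # from card
sequence_len: 2048      # from card
num_epochs: 1           # from card
learning_rate: 1e-4     # from card
```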
Good for
- Generating creative stories and narratives.
- Applications requiring improved character development and dialogue.
- Tasks benefiting from enhanced reader engagement and reduced clichés in generated text.