allura-org/TQ2.5-14B-Sugarquill-v1
allura-org/TQ2.5-14B-Sugarquill-v1 is a 14-billion-parameter language model by Auri, created as a continued pretrain of SuperNova-Medius and fine-tuned for creative writing and story generation. With a 32768-token context length, it is well suited to longer narrative content and works for both roleplay and storywriting in chat or raw-completion modes. Training aimed to diversify its prose while preserving strong instruction following, making it suitable for a range of narrative tasks.
Model Overview
allura-org/TQ2.5-14B-Sugarquill-v1 is a 14-billion-parameter model developed by Auri, built on the SuperNova-Medius base. It has undergone continued pretraining on a diverse dataset of short stories, aiming to enhance its prose generation and narrative versatility. A 32768-token context length lets it handle creative writing tasks longer than typical short stories.
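Assuming the model follows the standard Hugging Face transformers loading path (the card itself shows no code), a minimal loading sketch looks like this; the dtype and device settings are illustrative choices, not recommendations from the card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allura-org/TQ2.5-14B-Sugarquill-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; use float16 on GPUs without bf16 support
    device_map="auto",           # shard across available GPUs automatically
)
```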
Key Capabilities
- Advanced Story Generation: Fine-tuned on assorted short story data, it produces nuanced and diversified prose.
- Extended Context Handling: Supports a 32768 token context, ideal for generating longer narratives and maintaining coherence over extended interactions.
- Dual-Mode Functionality: Effective for both roleplay (RP) and general storywriting, working well in chat-based co-writing and raw completion scenarios (see the prompt sketch after this list).
- Robust Instruction Following: Despite its creative focus, the model retains strong instruction adherence, making it adaptable to specific user prompts.
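To make the dual-mode point concrete, here is a hedged sketch of both usage modes, continuing from the loading snippet above. Chat mode assumes the tokenizer ships a ChatML chat template (typical for Qwen2.5-derived models such as SuperNova-Medius); raw completion simply continues a prose prefix. The messages and prefix are invented examples:

```python
# Chat-based co-writing: structured messages rendered through the chat template.
messages = [
    {"role": "system", "content": "You are a co-writer helping draft a short story."},
    {"role": "user", "content": "Continue the scene: the lighthouse keeper hears a knock."},
]
chat_inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
chat_out = model.generate(chat_inputs, max_new_tokens=512)
print(tokenizer.decode(chat_out[0][chat_inputs.shape[-1]:], skip_special_tokens=True))

# Raw completion: feed unformatted prose and let the model continue it.
prefix = "The keeper set down his mug. The knock came again, slower this time."
raw_inputs = tokenizer(prefix, return_tensors="pt").to(model.device)
raw_out = model.generate(**raw_inputs, max_new_tokens=512)
print(tokenizer.decode(raw_out[0], skip_special_tokens=True))
```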
Training Details
The model was trained for 2 epochs on 10,000 rows (approximately 18.7 million tokens) from the Erebus-87k and r_shortstories_24k datasets. The training data was preprocessed by normalizing punctuation to ASCII and standardizing whitespace to improve text quality. Training used rsLoRA with an effective batch size of 40 and the paged_ademamix_8bit optimizer, running on a 5x3090Ti workstation.
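The card does not publish the preprocessing code, so the following is only a plausible reconstruction of "normalizing punctuation to ASCII and standardizing whitespace"; the replacement table and rules are assumptions, not the actual training pipeline:

```python
import re

# Hypothetical ASCII punctuation map; the real table used for training is not published.
ASCII_PUNCT = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes -> straight apostrophe
    "\u201c": '"', "\u201d": '"',   # curly double quotes -> straight quote
    "\u2013": "-", "\u2014": "-",   # en/em dashes -> hyphen
    "\u2026": "...",                # ellipsis character -> three dots
}

def normalize(text: str) -> str:
    for src, dst in ASCII_PUNCT.items():
        text = text.replace(src, dst)
    text = re.sub(r"[ \t]+", " ", text)      # collapse runs of spaces and tabs
    text = re.sub(r"\n{3,}", "\n\n", text)   # cap consecutive blank lines
    return text.strip()
```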
Recommended Usage
Users should employ ChatML instruct formatting, consistent with its base model. Recommended sampling parameters include a Temperature of 0.8, Min-P of 0.05, Top-A of 0.3, and a Repetition Penalty of 1.03 for stable and creative outputs.
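As a sketch, the recommended samplers map onto a transformers generate() call as shown below, reusing raw_inputs from the earlier snippet. Temperature, Min-P, and Repetition Penalty are native generate() parameters (Min-P requires a recent transformers release); Top-A is not implemented in vanilla transformers and must be configured in a backend that supports it, so it appears only as a comment:

```python
output = model.generate(
    **raw_inputs,
    do_sample=True,
    temperature=0.8,
    min_p=0.05,
    repetition_penalty=1.03,
    # top_a=0.3  # not a transformers parameter; set it in a backend that supports Top-A
    max_new_tokens=512,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```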