Zhihu-ai/Zhi-writing-dsr1-14b is a 14-billion-parameter language model developed by Zhihu-ai, fine-tuned from DeepSeek-R1-Distill-Qwen-14B. Optimized for creative writing, it shows improved performance on benchmarks such as the LLM Creative Story-Writing Benchmark and WritingBench. With a 32,768-token context length, it also posts modest gains on knowledge, reasoning, and mathematical tasks, making it suitable for general-purpose applications that benefit from enhanced creative text generation.
Model Overview
Zhihu-ai/Zhi-writing-dsr1-14b is a 14-billion-parameter model, fine-tuned from DeepSeek-R1-Distill-Qwen-14B and specifically engineered to enhance creative writing capabilities. Its training draws on rigorously filtered open-source datasets, chain-of-thought reasoning corpora, and high-quality content from Zhihu, all passed through a Reward Model (RM) filtering pipeline. Training used a curriculum learning strategy for Supervised Fine-tuning (SFT), followed by Direct Preference Optimization (DPO) using Step-DPO combined with DPOP.
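The DPOP objective mentioned above (DPO-Positive, from the literature) extends the standard DPO loss with a penalty that stops the policy from losing probability mass on the preferred completion. As a minimal numeric sketch of a DPOP-style per-pair loss (the β and λ values here are illustrative, not the values used to train this model):

```python
import math

def dpop_loss(logp_w: float, logp_l: float,
              ref_logp_w: float, ref_logp_l: float,
              beta: float = 0.1, lam: float = 5.0) -> float:
    """Per-pair DPOP-style loss over summed log-probs of the chosen (w)
    and rejected (l) completions under the policy and reference models."""
    # Standard DPO margin: implicit reward of chosen minus rejected.
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    # DPOP penalty: active only when the policy assigns the preferred
    # answer *less* log-prob than the reference model does.
    penalty = max(0.0, ref_logp_w - logp_w)
    logit = beta * (margin - lam * penalty)
    # Negative log-sigmoid of the penalized margin.
    return -math.log(1.0 / (1.0 + math.exp(-logit)))
```

When the policy keeps or improves the preferred completion's likelihood, the penalty is zero and the loss reduces to ordinary DPO; otherwise the extra term pushes the loss up, discouraging the degenerate solution where both completions merely become less likely.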
Key Capabilities & Performance
- Enhanced Creative Writing: Scores 8.33 on the LLM Creative Story-Writing Benchmark (up from the base model's 7.87) and 8.46 on WritingBench (up from 7.93).
- Balanced General Performance: Shows modest improvements of 2%–5% on knowledge and reasoning benchmarks (CMMLU, MMLU-Pro), with encouraging progress in mathematical reasoning (AIME-2024, AIME-2025, GSM8K).
- Instruction Following: Improved instruction-following capability, with the IFEval benchmark score rising from 71.43 to 74.71.
- Context Length: Supports a context window of 32,768 tokens.
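Because the 32,768-token window must hold both the prompt and the generated continuation, long-form creative prompts need a token budget. A small helper (hypothetical, not part of the model's tooling) sketches that arithmetic:

```python
CONTEXT_LENGTH = 32_768  # Zhi-writing-dsr1-14b context window

def max_new_tokens(prompt_tokens: int, reserve: int = 0,
                   context: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens generation may use after the prompt
    (and an optional safety reserve) within the context window."""
    remaining = context - prompt_tokens - reserve
    if remaining <= 0:
        raise ValueError(
            f"Prompt ({prompt_tokens} tokens) leaves no room "
            f"in the {context}-token window"
        )
    return remaining
```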
Recommended Use Cases
- Creative Content Generation: Ideal for tasks requiring imaginative text, story writing, and diverse stylistic outputs.
- General-Purpose Applications: Suitable for a range of applications benefiting from improved reasoning, knowledge, and mathematical abilities alongside creative text generation.
- Research & Development: Provides a strong foundation for further fine-tuning or experimentation in creative AI and general LLM performance.
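For the use cases above, a minimal loading sketch with Hugging Face `transformers` follows. The sampling values are illustrative assumptions for creative writing, not published defaults, and the heavy imports are deferred so the helper stays importable without the library installed:

```python
MODEL_ID = "Zhihu-ai/Zhi-writing-dsr1-14b"

def creative_sampling_kwargs(max_new_tokens: int = 1024) -> dict:
    # Illustrative creative-writing sampling settings (assumptions).
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": 0.7,
        "top_p": 0.95,
    }

def generate_story(prompt: str) -> str:
    # Imports kept local so the module loads without transformers/torch.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, **creative_sampling_kwargs())
    # Decode only the newly generated tokens.
    return tokenizer.decode(
        output[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
```

Calling `generate_story("Write a short story about a lighthouse keeper.")` downloads the 14B checkpoint on first use, so a GPU with sufficient memory (or quantized weights) is advisable.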