Zhihu-ai/Zhi-Create-DSR1-14B
Text Generation · Concurrency Cost: 1 · Model Size: 14B · Quant: FP8 · Ctx Length: 32k · Published: Apr 19, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

Zhihu-ai/Zhi-writing-dsr1-14b is a 14-billion-parameter language model developed by Zhihu-ai and fine-tuned from DeepSeek-R1-Distill-Qwen-14B. Optimized for creative writing, it shows improved performance on benchmarks such as the LLM Creative Story-Writing Benchmark and WritingBench. With a 32,768-token context length, it also posts modest gains on knowledge, reasoning, and mathematical tasks, making it suitable for general-purpose applications that call for enhanced creative text generation.


Model Overview

Zhihu-ai/Zhi-writing-dsr1-14b is a 14-billion-parameter model, fine-tuned from DeepSeek-R1-Distill-Qwen-14B and specifically engineered to enhance creative writing capabilities. It was trained on rigorously filtered open-source datasets, chain-of-thought reasoning corpora, and high-quality content from Zhihu, all screened through a Reward Model (RM) filtering pipeline. Training used a curriculum-learning strategy for Supervised Fine-Tuning (SFT) and Step-DPO with DPOP for Direct Preference Optimization (DPO).

Key Capabilities & Performance

  • Enhanced Creative Writing: Achieved a score of 8.33 on the LLM Creative Story-Writing Benchmark (up from 7.87) and 8.46 on WritingBench (up from 7.93), demonstrating significant improvements over its base model.
  • Balanced General Performance: Shows modest improvements of 2%–5% in knowledge and reasoning tasks (CMMLU, MMLU-Pro) and encouraging progress in mathematical reasoning (AIME-2024, AIME-2025, GSM8K).
  • Instruction Following: Improved instruction-following capabilities, with the IFEval benchmark score rising from 71.43 to 74.71.
  • Context Length: Supports a substantial context length of 32,768 tokens.

Recommended Use Cases

  • Creative Content Generation: Ideal for tasks requiring imaginative text, story writing, and diverse stylistic outputs.
  • General-Purpose Applications: Suitable for a range of applications benefiting from improved reasoning, knowledge, and mathematical abilities alongside creative text generation.
  • Research & Development: Provides a strong foundation for further fine-tuning or experimentation in creative AI and general LLM performance.
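
For the use cases above, a request to an OpenAI-compatible chat-completions endpoint might look like the sketch below. The model identifier is taken from this page; the prompt, sampler values, and the assumption of an OpenAI-compatible API are illustrative and depend on your provider's actual interface and defaults.

```json
{
  "model": "Zhihu-ai/Zhi-writing-dsr1-14b",
  "messages": [
    {
      "role": "user",
      "content": "Write a short story about a lighthouse keeper who collects storms."
    }
  ],
  "max_tokens": 1024,
  "temperature": 0.7,
  "top_p": 0.95
}
```
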
Popular Sampler Settings

Featherless surfaces the three parameter combinations its users most commonly apply to this model, drawn from the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
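
For readers unfamiliar with these knobs, the filtering applied by `top_k` and `top_p` can be sketched in plain Python over a toy logit vector. This is an illustrative sketch of the standard technique, not the provider's actual sampling code; the function name and toy values are made up for the example.

```python
import math

def top_k_top_p_filter(logits, top_k=0, top_p=1.0):
    """Return (token_index, probability) pairs that survive
    top-k and top-p (nucleus) filtering of a logit vector."""
    # Convert logits to probabilities with a softmax.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # top-k: keep only the k most likely tokens (0 disables the filter).
    if top_k > 0:
        order = order[:top_k]

    # top-p: keep the smallest prefix of the ranking whose cumulative
    # probability reaches top_p; at least one token is always kept.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return [(i, probs[i]) for i in kept]

# Toy vocabulary of four tokens with uneven logits: the least likely
# token is cut by top_k, and top_p keeps the rest.
logits = [2.0, 1.0, 0.5, -1.0]
print(top_k_top_p_filter(logits, top_k=3, top_p=0.9))
```

Lower `top_p` (or smaller `top_k`) narrows the candidate set and makes output more deterministic, while `temperature` would rescale the logits before this filtering step.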