C10X/LongWriter-Qwen2.5-7B-Instruct

  • Parameters: 7.6B
  • Quantization: FP8
  • Context length: 131,072 tokens
  • Source: Hugging Face

C10X/LongWriter-Qwen2.5-7B-Instruct Overview

This model is a specialized 7.6-billion-parameter instruction-tuned variant of Qwen2.5-7B-Instruct, developed by C10X. Its core distinction is significantly extended long-form generation: it can produce coherent outputs of over 10,000 words in a single generation.
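As a rough illustration of how such a model is typically invoked, here is a minimal inference sketch using the standard Hugging Face transformers chat-template flow; the prompt, sampling settings, and token budget are illustrative assumptions, not official recommendations:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "C10X/LongWriter-Qwen2.5-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Write a 10,000-word guide to urban beekeeping."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# A large max_new_tokens budget is what makes 10k-word outputs possible;
# 32768 is an assumption -- tune it to your memory and latency limits.
output = model.generate(
    input_ids, max_new_tokens=32768, do_sample=True, temperature=0.7
)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))
```

Note that 10,000 English words correspond very roughly to 13,000–15,000 tokens, so the generation budget must comfortably exceed that figure.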

Key Capabilities & Training

  • Long-form Text Generation: Optimized for producing extensive textual content, addressing the output-length ceiling common to many LLMs.
  • Base Model: Built upon the robust Qwen2.5-7B-Instruct architecture.
  • Data Distillation: Achieves its long-writing proficiency through fine-tuning on LongWriter-6k-filtered, a curated subset of 666 high-quality samples distilled from LongWriter-6k.
  • Additional Training Data: Further fine-tuned using samples from Magpie-Qwen2-Pro-200K-Chinese and Magpie-Qwen2-Pro-200K-English datasets.
  • Fine-tuning Method: Fine-tuned with ms-swift, including an annealing strategy to improve post-training performance (see the sketch after this list).
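The card does not publish the actual ms-swift recipe, so the following is only a hedged stand-in: a supervised fine-tuning sketch using Hugging Face TRL with a cosine learning-rate schedule as a simple annealing analogue. The dataset path and every hyperparameter are assumptions for illustration, not the model's real training configuration.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder path: the card names "LongWriter-6k-filtered" (666 samples)
# but does not give its hub location, so load from a local file here.
dataset = load_dataset(
    "json", data_files="longwriter_6k_filtered.jsonl", split="train"
)

config = SFTConfig(
    output_dir="longwriter-qwen2.5-7b-sft",
    num_train_epochs=2,                # assumption
    per_device_train_batch_size=1,     # long samples force tiny batches
    gradient_accumulation_steps=16,    # assumption
    learning_rate=1e-5,                # assumption
    lr_scheduler_type="cosine",        # cosine decay as an annealing stand-in
    warmup_ratio=0.03,                 # assumption
    bf16=True,
)
# The trainer's maximum sequence length must also be raised well beyond the
# default to fit 10k-word targets (the flag name varies across TRL versions).

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",  # base model named on this card
    train_dataset=dataset,
    args=config,
)
trainer.train()
```

In this context, annealing usually means finishing training on the highest-quality data at a decaying learning rate; the cosine schedule above is only the simplest approximation of that idea.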

Ideal Use Cases

  • Content Creation: Generating articles, reports, stories, or any document requiring substantial length.
  • Summarization: Processing and summarizing very long documents.
  • Creative Writing: Assisting with novels, scripts, or other long-form creative projects.
  • Technical Documentation: Producing detailed manuals or specifications.