C10X/LongWriter-Qwen2.5-7B-Instruct Overview
This model is a specialized 7.6-billion-parameter instruction-tuned variant of Qwen2.5-7B-Instruct, developed by C10X. Its core distinction is its significantly extended long-form generation capability, enabling it to produce outputs of over 10,000 words in a single generation.
Key Capabilities & Training
- Long-form Text Generation: Optimized for generating extensive textual content, addressing a common limitation in many LLMs.
- Base Model: Built upon the robust Qwen2.5-7B-Instruct architecture.
- Data Distillation: Achieves its long-writing proficiency through fine-tuning on a highly curated and filtered dataset, `LongWriter-6k-filtered`, which contains 666 high-quality samples distilled from `LongWriter-6k`.
- Additional Training Data: Further fine-tuned on samples from the `Magpie-Qwen2-Pro-200K-Chinese` and `Magpie-Qwen2-Pro-200K-English` datasets.
- Fine-tuning Method: Uses `ms-swift` for the fine-tuning process, including an annealing strategy to enhance post-training performance.
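The card mentions an annealing strategy but does not disclose the actual schedule or hyperparameters used with `ms-swift`. As an illustration only, a common choice is cosine annealing of the learning rate; the sketch below uses placeholder values that are assumptions, not the model's real training settings.

```python
import math

def cosine_annealed_lr(step, total_steps, lr_max=1e-5, lr_min=1e-6):
    """Cosine annealing from lr_max down to lr_min over total_steps.

    Illustrative only: lr_max, lr_min, and the schedule shape are
    placeholder assumptions, not values from the model card.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))
```

At step 0 this returns `lr_max`, and by the final step it has decayed smoothly to `lr_min`, which is the typical effect an annealing phase aims for after the main training run.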
Ideal Use Cases
- Content Creation: Generating articles, reports, stories, or any document requiring substantial length.
- Summarization: Processing and summarizing very long documents.
- Creative Writing: Assisting with novels, scripts, or other long-form creative projects.
- Technical Documentation: Producing detailed manuals or specifications.
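For the use cases above, a minimal inference sketch with Hugging Face `transformers` might look like the following. The repo id comes from this card; the sampling settings and `max_new_tokens` are assumptions sized for multi-thousand-word outputs, and the manual ChatML renderer is shown only for illustration (in practice, prefer `tokenizer.apply_chat_template`).

```python
def build_chatml_prompt(messages):
    """Render a message list in the ChatML format used by Qwen2.5-style
    models (an assumption; prefer tokenizer.apply_chat_template)."""
    rendered = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    )
    return rendered + "<|im_start|>assistant\n"

if __name__ == "__main__":
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "C10X/LongWriter-Qwen2.5-7B-Instruct"
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(
        repo, torch_dtype=torch.bfloat16, device_map="auto"
    )
    messages = [{"role": "user",
                 "content": "Write a 10,000-word report on renewable energy."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    # max_new_tokens is an assumed budget for very long outputs.
    out = model.generate(inputs, max_new_tokens=32768, do_sample=True,
                         temperature=0.7, top_p=0.9)
    print(tokenizer.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
```

Running the guarded block requires a GPU with enough memory for a 7.6B model in bfloat16; the prompt-building helper works standalone.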