xing720310/qwen3-14b-thinking-1

Text Generation · Concurrency Cost: 1 · Model Size: 14B · Quant: FP8 · Ctx Length: 32k · Published: Feb 12, 2026 · Architecture: Transformer

xing720310/qwen3-14b-thinking-1 is a 14-billion-parameter Qwen3-based language model fine-tuned on reasoning datasets derived from DeepSeek v3.2 Speciale. The model is optimized for complex reasoning tasks, including coding and mathematics, supports a 32,768-token context window, and is designed for applications requiring deep research capabilities and robust chat functionality.


Overview

xing720310/qwen3-14b-thinking-1 is a 14-billion-parameter model built on the Qwen3 architecture. It was fine-tuned on reasoning datasets derived from DeepSeek v3.2 Speciale: TeichAI/deepseek-v3.2-speciale-OpenCodeReasoning-3k, TeichAI/deepseek-v3.2-speciale-1000x, and TeichAI/deepseek-v3.2-speciale-openr1-math-3k.
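
Because the checkpoint keeps the standard Qwen3 layout, it can be loaded with Hugging Face transformers. The snippet below is a minimal inference sketch, not an official quick start: it assumes the repository ID resolves on the Hugging Face Hub and that the fine-tune inherits Qwen3's chat template, including its `enable_thinking` switch; the prompt is a made-up example.

```python
# Minimal inference sketch. Assumes the model ID resolves on the Hugging Face
# Hub and that the fine-tune inherits Qwen3's chat template unchanged.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "xing720310/qwen3-14b-thinking-1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the stored precision
    device_map="auto",    # place layers across available devices
)

messages = [
    {"role": "user", "content": "Prove that the sum of two even integers is even."}
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    enable_thinking=True,  # Qwen3 template switch for <think>...</think> traces
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

With `enable_thinking=True`, Qwen3-style models emit their reasoning inside `<think>...</think>` tags before the final answer, which is the behavior this "thinking" fine-tune targets.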

Key Capabilities

  • Enhanced Reasoning: Specialized training on DeepSeek v3.2 Speciale reasoning datasets.
  • Efficient Training: Developed with Unsloth and Hugging Face's TRL library, enabling 2x faster training (see the sketch after this list).
  • Context Length: Supports a 32,768-token context window.
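
The training setup above can be sketched concretely. The following is a minimal Unsloth + TRL supervised fine-tuning loop of the kind the card describes, not the author's actual script: the base checkpoint name, the LoRA settings, the dataset text field, and every hyperparameter are illustrative assumptions; only the dataset ID comes from this card.

```python
# Illustrative Unsloth + TRL fine-tuning sketch. All names and hyperparameters
# below are assumptions, except the TeichAI dataset ID taken from this card.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumed base checkpoint; the card does not state which Qwen3 weights were used.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-14B",
    max_seq_length=32768,
    load_in_4bit=True,   # memory-friendly training; the released quant is FP8
)

# Attach LoRA adapters so only a small fraction of weights is updated.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset(
    "TeichAI/deepseek-v3.2-speciale-OpenCodeReasoning-3k", split="train"
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,  # newer TRL versions name this processing_class
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",  # assumed column name in the dataset
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        output_dir="outputs",
    ),
)
trainer.train()
```

A real run would presumably cycle through all three TeichAI datasets and tune these values; the sketch only shows the shape of the pipeline.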

Good For

  • Coding: Excels in programming-related tasks.
  • Mathematics: Optimized for mathematical problem-solving.
  • Deep Research: Suitable for applications requiring in-depth analysis and information retrieval.
  • Chat: Handles multi-turn conversational interactions (a serving sketch follows this list).
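
Given the FP8 quantization and 32k context listed in the header, a high-throughput engine such as vLLM is a natural fit for the chat and deep-research use cases. The snippet below is a serving sketch under that assumption; the sampling values are placeholders, and nothing here comes from the model card except the repository ID and context length.

```python
# Serving sketch with vLLM. Sampling values are illustrative placeholders.
from vllm import LLM, SamplingParams

llm = LLM(model="xing720310/qwen3-14b-thinking-1", max_model_len=32768)
params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=2048)

messages = [
    {"role": "user", "content": "Explain binary search, then implement it in Python."}
]
outputs = llm.chat(messages, params)  # chat() applies the model's chat template
print(outputs[0].outputs[0].text)
```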