qianyuuu/qwen3-1.7B-sft-instruct-ckpt350

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kPublished:May 16, 2026Architecture:Transformer Warm

The qianyuuu/qwen3-1.7B-sft-instruct-ckpt350 is a 1.7 billion parameter instruction-tuned language model based on the Qwen3-1.7B-Base architecture. This model is specifically fine-tuned using data distilled from qwen3-4B-instruct-2507 on the DeepMath 20k dataset, indicating a specialization in mathematical reasoning and problem-solving. With a context length of 32768 tokens, it is designed for tasks requiring deep understanding and generation of mathematical content.

Loading preview...

Model Overview

The qianyuuu/qwen3-1.7B-sft-instruct-ckpt350 is a 1.7 billion parameter instruction-tuned model built upon the Qwen3-1.7B-Base architecture. It features a substantial context length of 32768 tokens, enabling it to process and generate extensive text sequences.

Key Differentiator

This model's primary distinction lies in its specialized training. It has been fine-tuned using data distilled from the qwen3-4B-instruct-2507 model, specifically leveraging the DeepMath 20k dataset. This targeted training suggests an optimization for:

  • Mathematical Reasoning: Excelling in tasks that require logical deduction and problem-solving within mathematical domains.
  • Instruction Following: Designed to accurately interpret and execute complex instructions, particularly those related to mathematical queries.

Use Cases

Given its specialized training, this model is particularly well-suited for applications requiring strong mathematical capabilities, such as:

  • Solving mathematical problems.
  • Generating explanations for mathematical concepts.
  • Assisting in educational tools focused on mathematics.
  • Developing agents for tasks involving quantitative analysis.