zcyzcyzcy/qwen3-1.7b-jf-v2math811-ar10

Text generation · Concurrency cost: 1 · Model size: 2B · Quant: BF16 · Context length: 32k · Published: Apr 28, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

The zcyzcyzcy/qwen3-1.7b-jf-v2math811-ar10 model is a 1.7-billion-parameter Qwen3-based language model fine-tuned with Jacobi Forcing on a mixed dataset of code and mathematical reasoning trajectories. It matches the quality of its autoregressive base model (lossless) while delivering a 1.5-1.7x wall-clock speedup through Jacobi parallel decoding. The model is aimed at tasks that combine code generation and mathematical reasoning and that benefit from faster inference.


Model Overview

The zcyzcyzcy/qwen3-1.7b-jf-v2math811-ar10 is a 1.7-billion-parameter model based on the Qwen3-1.7B architecture. Its primary innovation is fine-tuning with Jacobi Forcing, a technique that enables Jacobi parallel decoding for substantial inference speedups without compromising output quality. The model is trained on a mixed dataset comprising code (OpenCodeInstruct) and mathematical trajectories (OpenThought2).
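To make the decoding idea concrete, the following toy sketch shows the fixed-point iteration behind Jacobi parallel decoding. It is a simplified illustration, not the model's actual inference code: the `greedy_next` function is a deterministic stand-in for a language model's greedy next-token choice, and token IDs are plain integers.

```python
# Toy illustration of Jacobi parallel decoding (assumption: greedy,
# deterministic "model"; this is not the real inference kernel).

def greedy_next(prefix):
    # Stand-in for an LM's greedy next-token choice: a toy rule that
    # deterministically maps a prefix of token IDs to the next token.
    return (sum(prefix) + len(prefix)) % 7

def ar_decode(prompt, n):
    # Standard autoregressive decoding: one token per model call.
    seq = list(prompt)
    for _ in range(n):
        seq.append(greedy_next(seq))
    return seq[len(prompt):]

def jacobi_decode(prompt, n):
    # Jacobi decoding: guess all n tokens at once, then refine every
    # position in parallel until the block reaches a fixed point.
    # Each sweep is one (batched) model call, so converging in fewer
    # sweeps than n tokens is where the wall-clock speedup comes from.
    block = [0] * n
    sweeps = 0
    while True:
        sweeps += 1
        new_block = [greedy_next(list(prompt) + block[:i]) for i in range(n)]
        if new_block == block:
            return block, sweeps
        block = new_block

tokens, sweeps = jacobi_decode((1, 2, 3), 8)
assert tokens == ar_decode((1, 2, 3), 8)  # identical to AR output (lossless)
print(f"decoded {len(tokens)} tokens in {sweeps} parallel sweeps")
```

At the fixed point, every position is the greedy continuation of the positions before it, so the output is token-for-token identical to autoregressive decoding; Jacobi Forcing fine-tuning trains the model so this fixed point is reached in few sweeps.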

Key Capabilities & Features

  • Lossless Quality: Maintains the same HumanEval pass@1 and GSM8K accuracy as the base autoregressive (AR) generation.
  • Significant Speedup: Achieves approximately 1.5-1.7x wall-clock speedup compared to greedy AR inference. Benchmarks show 1.65x on HumanEval and 1.53x on GSM8K.
  • Drop-in Compatible: Functions as a standard Qwen3 model for traditional autoregressive generation using HuggingFace AutoModelForCausalLM.
  • Specialized Training: Fine-tuned with the consistency and AR losses from the Jacobi Forcing paper, using a progressive noise-window strategy.
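Because the checkpoint is drop-in compatible, it can be loaded and run for standard autoregressive generation with the usual Transformers API. A minimal sketch (the prompt and generation settings are illustrative assumptions, not tuned values):

```python
# Minimal sketch: standard AR generation via HuggingFace Transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zcyzcyzcy/qwen3-1.7b-jf-v2math811-ar10"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

# Illustrative prompt; max_new_tokens is an arbitrary choice.
prompt = "Write a Python function that returns the nth Fibonacci number."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

This path uses ordinary greedy/sampled decoding; the 1.5-1.7x speedup figures above apply to the Jacobi parallel decoding path, not to this standard generation loop.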

When to Use This Model

This model is ideal for applications where both code generation and mathematical problem-solving are critical, and where inference speed is a key performance metric. Developers can leverage its Jacobi parallel decoding for faster response times in scenarios like:

  • Automated code completion and generation.
  • Solving complex mathematical word problems.
  • Any task requiring high-quality text generation with a focus on computational efficiency.