zcyzcyzcy/qwen3-1.7b-jf-v2math811-ar10
The zcyzcyzcy/qwen3-1.7b-jf-v2math811-ar10 model is a 1.7-billion-parameter Qwen3-based language model fine-tuned with Jacobi Forcing on a mixed dataset of code and mathematical reasoning trajectories. It achieves lossless quality relative to its autoregressive base model while providing a 1.5-1.7x wall-clock speedup through Jacobi parallel decoding. The model targets tasks that require both code generation and mathematical reasoning with efficient inference.
Model Overview
The zcyzcyzcy/qwen3-1.7b-jf-v2math811-ar10 model is a 1.7-billion-parameter model based on the Qwen3-1.7B architecture. Its primary innovation is fine-tuning with Jacobi Forcing, a technique that enables Jacobi parallel decoding for substantial inference speedups without compromising output quality. The model is trained on a mixed dataset of code (OpenCodeInstruct) and mathematical reasoning trajectories (OpenThoughts2).
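To make the mechanism concrete, below is a minimal sketch of greedy Jacobi (fixed-point) parallel decoding with a HuggingFace causal LM. This illustrates the general iteration the model is trained to accelerate; it is not the repository's official decoding code, and the block size, initialization, and stopping rule are illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "zcyzcyzcy/qwen3-1.7b-jf-v2math811-ar10"
tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
model.eval()

@torch.no_grad()
def jacobi_block(prompt_ids: torch.Tensor, block_size: int = 16,
                 max_iters: int = 16) -> torch.Tensor:
    """Decode one block of `block_size` tokens by Jacobi iteration.

    Start from an arbitrary guess for the whole block, then repeatedly run
    one parallel forward pass and replace every position with the model's
    greedy prediction, until the block stops changing. The fixed point
    equals the greedy autoregressive output.
    """
    # Initial guess: a repeated filler token (any initialization converges).
    fill = tok.pad_token_id if tok.pad_token_id is not None else tok.eos_token_id
    guess = torch.full((1, block_size), fill, dtype=torch.long)
    for _ in range(max_iters):
        # One parallel forward pass over prompt + current guess.
        logits = model(torch.cat([prompt_ids, guess], dim=1)).logits
        # The token at block position i is predicted from the logits at the
        # preceding position, so slice positions [P-1 : P+B-1].
        new_guess = logits[:, prompt_ids.shape[1] - 1 : -1].argmax(dim=-1)
        if torch.equal(new_guess, guess):  # fixed point reached
            break
        guess = new_guess
    return guess

prompt_ids = tok("def fibonacci(n):", return_tensors="pt").input_ids
print(tok.decode(jacobi_block(prompt_ids)[0]))
```

A full decoder would run this block by block with KV caching; the speedup comes from each block converging in fewer iterations than it has tokens, which is precisely what the Jacobi Forcing fine-tuning encourages.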
Key Capabilities & Features
- Lossless Quality: Maintains the same HumanEval pass@1 and GSM8K accuracy as standard autoregressive (AR) generation from the base model.
- Significant Speedup: Achieves approximately 1.5-1.7x wall-clock speedup compared to greedy AR inference. Benchmarks show 1.65x on HumanEval and 1.53x on GSM8K.
- Drop-in Compatible: Functions as a standard Qwen3 model for traditional autoregressive generation via HuggingFace AutoModelForCausalLM (see the example after this list).
- Specialized Training: Fine-tuned with the consistency and AR losses from the Jacobi Forcing paper, using a progressive noise-window strategy.
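For example, standard autoregressive generation uses the usual transformers loading path; the prompt and generation settings below are illustrative, not prescribed by the model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "zcyzcyzcy/qwen3-1.7b-jf-v2math811-ar10"
tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

# Standard chat-style AR generation, exactly as with any Qwen3 model.
messages = [{"role": "user",
             "content": "A train travels 60 km in 45 minutes. "
                        "What is its average speed in km/h?"}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True,
                                 return_tensors="pt")
out = model.generate(inputs, max_new_tokens=256)
print(tok.decode(out[0][inputs.shape[1]:], skip_special_tokens=True))
```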
When to Use This Model
This model is ideal for applications where both code generation and mathematical problem-solving are critical, and where inference speed is a key performance metric. Developers can leverage its Jacobi parallel decoding for faster response times in scenarios like:
- Automated code completion and generation.
- Solving complex mathematical word problems.
- Any task requiring high-quality text generation with a focus on computational efficiency.