pymlex/qwen3-4b-gsm8k

Text Generation
Concurrency Cost: 1
Model Size: 4B
Quant: BF16
Ctx Length: 32k
Published: May 9, 2026
License: gpl-3.0
Architecture: Transformer
Open Weights, Warm

The pymlex/qwen3-4b-gsm8k model is a 4 billion parameter Qwen3 checkpoint fine-tuned by pymlex for grade-school mathematical reasoning. It is trained to emit a detailed reasoning trace within <think>...</think> tags and the final numerical answer within <answer>...</answer> tags. The model targets GSM8K-style math problems and keeps the thought process explicitly separate from the final result. It has a context length of 32,768 tokens.

Model Overview

pymlex/qwen3-4b-gsm8k is a 4 billion parameter Qwen3 model, fine-tuned by pymlex on the openai/gsm8k dataset. This specialized model is designed to solve grade-school math problems by explicitly separating the reasoning process from the final answer. It generates a detailed thought process within <think>...</think> tags and the numerical result within <answer>...</answer> tags.
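Because the reasoning and the result arrive in separate tags, downstream code can split a completion with two regular expressions. The following is a minimal sketch, assuming the model emits at most one <think> block and one <answer> block per completion; `parse_output` and `ParsedOutput` are hypothetical helper names, not part of the model or of any library, and the sample string is hand-written rather than real model output.

```python
import re
from typing import NamedTuple, Optional


class ParsedOutput(NamedTuple):
    reasoning: Optional[str]  # text inside <think>...</think>, or None if absent
    answer: Optional[str]     # text inside <answer>...</answer>, or None if absent


# DOTALL lets the reasoning trace span multiple lines;
# the non-greedy group stops at the first closing tag.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
_ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def parse_output(text: str) -> ParsedOutput:
    """Split a completion into its reasoning trace and final answer.

    Hypothetical helper for illustration only.
    """
    think = _THINK_RE.search(text)
    answer = _ANSWER_RE.search(text)
    return ParsedOutput(
        reasoning=think.group(1).strip() if think else None,
        answer=answer.group(1).strip() if answer else None,
    )


# Illustrative completion written by hand, not real model output.
sample = "<think>\n3 boxes of 4 apples is 3 * 4 = 12.\n</think>\n<answer>12</answer>"
parsed = parse_output(sample)
print(parsed.answer)
```

Returning None for a missing tag, rather than raising, lets a caller count malformed completions separately from wrong answers.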

Key Capabilities

  • Structured Mathematical Reasoning: Generates step-by-step reasoning for math problems, making the problem-solving process transparent.
  • GSM8K Optimization: Specifically trained on the GSM8K dataset, focusing on grade-school level arithmetic and word problems.
  • Output Formatting: Adheres to a specific output format, providing a clear distinction between the thought process and the final answer.

Training Details

The model was fine-tuned with LoRA-based supervised fine-tuning on a single NVIDIA GeForce RTX 5090 GPU. Training used 7,099 samples from the GSM8K dataset, with a maximum sequence length of 768 tokens. Evaluation on the test set gave a perplexity of 1.4107 and an exact match accuracy of 0.0955 (9.55%).
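To make the two reported metrics concrete, here is a rough sketch of how they are commonly computed. It is not the author's evaluation script, which the card does not show. Perplexity is the exponential of the mean per-token negative log-likelihood, so a perplexity of 1.4107 corresponds to a mean token loss of about 0.344 nats. Exact match is the fraction of predicted answers equal to the reference. The normalization in `normalize_answer` (dropping "$" and thousands separators) is an assumption, and the toy predictions are invented for illustration.

```python
import math
from typing import Sequence


def perplexity(token_nlls: Sequence[float]) -> float:
    """Perplexity = exp(mean per-token negative log-likelihood, in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


def normalize_answer(text: str) -> str:
    """Assumed normalization: trim whitespace, drop '$' and thousands separators."""
    return text.strip().replace(",", "").replace("$", "")


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    hits = sum(
        normalize_answer(p) == normalize_answer(r)
        for p, r in zip(predictions, references)
    )
    return hits / len(references)


# Mean token loss implied by the reported perplexity of 1.4107.
mean_loss = math.log(1.4107)
print(round(mean_loss, 3))

# Toy data, invented for illustration only: two of three answers match.
preds = ["1,200", "18", "7"]
refs = ["1200", "18", "9"]
score = exact_match(preds, refs)
print(score)
```

A low perplexity alongside a low exact match is possible because perplexity averages over every token of the reference text, while exact match only credits a completion whose final answer is entirely correct.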

Good For

  • Educational Tools: Developing AI assistants for math education that can show their work.
  • Reasoning Trace Generation: Applications requiring explicit, step-by-step logical deductions for numerical problems.
  • Benchmarking: Evaluating structured reasoning capabilities on mathematical tasks.