pymlex/qwen3-4b-gsm8k
The pymlex/qwen3-4b-gsm8k model is a 4 billion parameter Qwen3 checkpoint fine-tuned by pymlex specifically for grade-school mathematical reasoning. It is optimized to generate a detailed reasoning trace within <think>...</think> tags and the final numerical answer within <answer>...</answer> tags. The model targets GSM8K-style math problems and explicitly separates the thought process from the final result. It has a context length of 32768 tokens.
Model Overview
pymlex/qwen3-4b-gsm8k is a 4 billion parameter Qwen3 model, fine-tuned by pymlex on the openai/gsm8k dataset. This specialized model is designed to solve grade-school math problems by explicitly separating the reasoning process from the final answer. It generates a detailed thought process within <think>...</think> tags and the numerical result within <answer>...</answer> tags.
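Because the reasoning and the answer arrive in separate tags, downstream code can split a completion with a small parser. The following is a minimal sketch, not part of the model's own tooling: the names `ParsedOutput` and `parse_output` are hypothetical, and it assumes each completion contains at most one well-formed <think>...</think> block and one <answer>...</answer> block.

```python
import re
from typing import NamedTuple, Optional


class ParsedOutput(NamedTuple):
    """Hypothetical container for the two parts of a completion."""
    reasoning: Optional[str]
    answer: Optional[str]


# DOTALL lets the reasoning trace span multiple lines.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
_ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def parse_output(text: str) -> ParsedOutput:
    """Extract the first <think> and <answer> blocks, if present.

    A missing or unclosed tag yields None for that field, so callers
    can tell a formatting failure apart from an empty answer.
    """
    think = _THINK_RE.search(text)
    answer = _ANSWER_RE.search(text)
    return ParsedOutput(
        reasoning=think.group(1).strip() if think else None,
        answer=answer.group(1).strip() if answer else None,
    )
```

Returning None rather than raising keeps malformed completions countable, which is useful when measuring how often the model follows the format.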
Key Capabilities
- Structured Mathematical Reasoning: Generates step-by-step reasoning for math problems, making the problem-solving process transparent.
- GSM8K Optimization: Specifically trained on the GSM8K dataset, focusing on grade-school level arithmetic and word problems.
- Output Formatting: Adheres to a specific output format, providing a clear distinction between the thought process and the final answer.
Training Details
The model was fine-tuned using LoRA with supervised fine-tuning on a single NVIDIA GeForce RTX 5090 GPU. Training used 7,099 samples from the GSM8K dataset, with a maximum sequence length of 768 tokens. On the test set, the model reached a perplexity of 1.4107 and an exact-match accuracy of 0.0955 (about 9.55% of answers matched exactly).
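The two reported metrics can be illustrated with a short sketch. The model card does not describe the evaluation script, so everything here is an assumption: the helper names are hypothetical, exact match is taken to mean equality of the extracted answer and the reference after light numeric normalization, and perplexity is taken in its standard form as the exponential of the mean per-token negative log-likelihood (in nats).

```python
import math
from typing import Optional, Sequence


def normalize_number(text: Optional[str]) -> Optional[str]:
    """Canonicalize a numeric string (assumed normalization, not the author's).

    Strips whitespace, thousands separators, and dollar signs, and
    renders whole numbers without a decimal part ("72.0" -> "72").
    Returns None when the text is missing or not a finite number.
    """
    if text is None:
        return None
    cleaned = text.strip().replace(",", "").replace("$", "")
    try:
        value = float(cleaned)
    except ValueError:
        return None
    if not math.isfinite(value):
        return None
    return str(int(value)) if value == int(value) else repr(value)


def exact_match(predictions: Sequence[Optional[str]],
                references: Sequence[str]) -> float:
    """Fraction of predictions equal to their reference after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    correct = 0
    for pred, ref in zip(predictions, references):
        p = normalize_number(pred)
        # An unparseable prediction never counts as a match.
        if p is not None and p == normalize_number(ref):
            correct += 1
    return correct / len(references)


def perplexity_from_loss(mean_nll: float) -> float:
    """Perplexity as exp of the mean per-token negative log-likelihood."""
    return math.exp(mean_nll)
```

Under this definition, a perplexity of 1.4107 corresponds to a mean per-token loss of `math.log(1.4107)`. Low perplexity and low exact-match accuracy can coexist, since perplexity scores every token of the reference text while exact match depends only on the final number.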
Good For
- Educational Tools: Developing AI assistants for math education that can show their work.
- Reasoning Trace Generation: Applications requiring explicit, step-by-step logical deductions for numerical problems.
- Benchmarking: Evaluating structured reasoning capabilities on mathematical tasks.