UoM-CS-NeuroSymbolicAI/qwen3vl_ins_math_10k

VISIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Apr 19, 2026License:otherArchitecture:Transformer Cold

UoM-CS-NeuroSymbolicAI/qwen3vl_ins_math_10k is an 8 billion parameter instruction-tuned language model, fine-tuned from Qwen/Qwen3-VL-8B-Instruct. This model is specifically optimized for mathematical tasks, leveraging the math_interleave dataset during its training. It is designed to enhance performance in numerical reasoning and problem-solving contexts.

Loading preview...

Overview

This model, qwen3vl_ins_math_10k, is an 8 billion parameter instruction-tuned variant of the Qwen3-VL-8B-Instruct architecture. Developed by UoM-CS-NeuroSymbolicAI, its primary differentiation lies in its specialized fine-tuning on the math_interleave dataset.

Key Capabilities

  • Enhanced Mathematical Reasoning: Through targeted fine-tuning, the model is designed to improve its performance on mathematical tasks and numerical problem-solving.
  • Instruction Following: Retains the instruction-following capabilities of its base Qwen3-VL-8B-Instruct model.

Training Details

The model was trained using specific hyperparameters to optimize its mathematical proficiency:

  • Learning Rate: 5e-05
  • Batch Size: A total training batch size of 16 (train_batch_size: 4, gradient_accumulation_steps: 2, num_devices: 2).
  • Optimizer: ADAMW_TORCH with betas=(0.9, 0.999) and epsilon=1e-08.
  • Scheduler: Cosine learning rate scheduler with a warmup ratio of 0.1.
  • Epochs: Trained for 2.0 epochs.

Good For

  • Applications requiring strong mathematical problem-solving abilities.
  • Tasks that benefit from a language model with specialized numerical reasoning.

Limitations

As a fine-tuned model, its general-purpose capabilities might be less emphasized compared to its mathematical strengths. Further information on specific limitations and intended uses is needed for a comprehensive understanding.