UoM-CS-NeuroSymbolicAI/qwen3vl_ins_math_10k
UoM-CS-NeuroSymbolicAI/qwen3vl_ins_math_10k is an 8 billion parameter instruction-tuned language model, fine-tuned from Qwen/Qwen3-VL-8B-Instruct. This model is specifically optimized for mathematical tasks, leveraging the math_interleave dataset during its training. It is designed to enhance performance in numerical reasoning and problem-solving contexts.
Loading preview...
Overview
This model, qwen3vl_ins_math_10k, is an 8 billion parameter instruction-tuned variant of the Qwen3-VL-8B-Instruct architecture. Developed by UoM-CS-NeuroSymbolicAI, its primary differentiation lies in its specialized fine-tuning on the math_interleave dataset.
Key Capabilities
- Enhanced Mathematical Reasoning: Through targeted fine-tuning, the model is designed to improve its performance on mathematical tasks and numerical problem-solving.
- Instruction Following: Retains the instruction-following capabilities of its base Qwen3-VL-8B-Instruct model.
Training Details
The model was trained using specific hyperparameters to optimize its mathematical proficiency:
- Learning Rate: 5e-05
- Batch Size: A total training batch size of 16 (train_batch_size: 4, gradient_accumulation_steps: 2, num_devices: 2).
- Optimizer: ADAMW_TORCH with betas=(0.9, 0.999) and epsilon=1e-08.
- Scheduler: Cosine learning rate scheduler with a warmup ratio of 0.1.
- Epochs: Trained for 2.0 epochs.
Good For
- Applications requiring strong mathematical problem-solving abilities.
- Tasks that benefit from a language model with specialized numerical reasoning.
Limitations
As a fine-tuned model, its general-purpose capabilities might be less emphasized compared to its mathematical strengths. Further information on specific limitations and intended uses is needed for a comprehensive understanding.