Overview
This model, andresnowak/Qwen3-0.6B-instruction-finetuned, is a 0.8-billion-parameter instruction-tuned language model. It was fine-tuned from unsloth/Qwen3-0.6B-Base by andresnowak using the TRL (Transformer Reinforcement Learning) library.
Key Capabilities
- Instruction Following: Fine-tuned for general instruction-following tasks.
- Robustness: Training applied randomly chosen prompt templates to each example, alongside a high-quality dataset, to improve robustness to diverse question phrasings.
- Diverse Training Data: Utilizes a mixture of datasets covering code (CodeAlpaca, CodeV2), mathematics (OpenMathGsm8k, MathAlgebra, MathGrade, MathV5, TirMath), and general instruction data (NoRobots, FlanV2, IfData, Oasst1, Sciriff, TableGpt, WildChat).
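The random-template idea above can be sketched as follows. The templates here are hypothetical, since the model card does not list the exact ones used during training; the sketch only illustrates how each training example might be wrapped in a randomly selected prompt format.

```python
import random

# Hypothetical prompt templates -- illustrative only, not the actual
# templates used to train this model.
TEMPLATES = [
    "Question: {instruction}\nAnswer:",
    "### Instruction:\n{instruction}\n\n### Response:",
    "{instruction}",
    "Please answer the following.\n{instruction}",
]

def apply_random_template(instruction: str, rng: random.Random) -> str:
    """Wrap one training instruction in a randomly chosen template."""
    template = rng.choice(TEMPLATES)
    return template.format(instruction=instruction)

rng = random.Random(0)
print(apply_random_template("What is 2 + 2?", rng))
```

Because each example is rendered through a different surface form, the model cannot overfit to a single prompt layout, which is what the robustness claim refers to.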
Good For
- General Purpose Chatbots: Suitable for applications requiring a small, instruction-tuned model to respond to a variety of prompts.
- Educational Tools: Can be applied in scenarios requiring basic reasoning and problem-solving, given its training on math and science-related datasets.
- Experimentation: A good candidate for researchers and developers experimenting with instruction-tuned models in the 0.8B parameter range, particularly for studying the effect of diverse dataset mixtures and randomized prompt templates during training.
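For the chatbot and experimentation use cases above, a minimal inference sketch is shown below, assuming the standard Hugging Face transformers chat-template workflow applies to this checkpoint. The transformers import is kept inside the function so the lightweight helpers remain importable without the library installed.

```python
MODEL_ID = "andresnowak/Qwen3-0.6B-instruction-finetuned"

def build_messages(prompt: str) -> list:
    """Single-turn chat format expected by apply_chat_template."""
    return [{"role": "user", "content": prompt}]

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the checkpoint and generate a response (downloads weights on first call)."""
    # Imported lazily so the helpers above stay importable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt), add_generation_prompt=True, return_tensors="pt"
    )
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Usage: `generate("Explain photosynthesis in one sentence.")`. Generation parameters such as temperature or sampling are left at their defaults here and may need tuning for a given application.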
Performance Highlights
The model achieved an overall accuracy of 37.8% across the evaluated benchmarks. Notable individual benchmark results include:
- ARC Challenge: 46.0%
- MMLU: 47.2%
- GPQA: 29.9%
- Math QA: 24.0%