kosa-labs/kosa-4B-it-v1
kosa-labs/kosa-4B-it-v1 is a 4 billion parameter instruction-tuned causal language model developed by Kosa Labs, built upon Qwen/Qwen3-4B-Instruct-2507. This model features a 32768 token context length and demonstrates significant performance improvements across reasoning and instruction-following benchmarks, including GSM8K, IFEval, ARC-Challenge, and MMLU. It is optimized for enhanced accuracy in complex problem-solving and general instruction adherence.
Loading preview...
kosa-4B-it-v1: Enhanced Instruction-Tuned Model
kosa-4B-it-v1 is a 4 billion parameter instruction-tuned model developed by Kosa Labs, an independent UK-based lab. It is built on the Qwen/Qwen3-4B-Instruct-2507 architecture, featuring a substantial 32768 token context length.
Key Capabilities & Performance
This model demonstrates notable improvements over its base model across several critical benchmarks, indicating enhanced reasoning and instruction-following abilities:
- GSM8K (Mathematical Reasoning): Achieves 84.23% (strict) and 85.60% (flexible), significantly outperforming the base model's 73.24% and 79.15% respectively.
- IFEval (Instruction Following): Shows strong performance with 85.77% (prompt strict) and 90.29% (instruction strict).
- ARC-Challenge (Common Sense Reasoning): Improved to 52.13% (acc_norm) from 43.09%.
- MMLU (General Knowledge): Reaches 65.76%, up from 61.89%.
Overall, kosa-4B-it-v1 achieves an average benchmark score of 77.30%, a substantial increase from the base model's 71.56%. These evaluations were conducted under identical settings using lm-evaluation-harness 0.4.12, vLLM, and bfloat16, with rigorous training data verification against benchmark test sets.
Usage & Availability
The model is readily available for use with the Hugging Face transformers library. GGUF quantizations (Q4_K_M, Q5_K_M, Q8_0) are also provided for efficient local deployment.