kosa-4B-it-v1: Enhanced Instruction-Tuned Model

kosa-4B-it-v1 is a 4 billion parameter instruction-tuned model developed by Kosa Labs, an independent UK-based lab. It is built on the Qwen/Qwen3-4B-Instruct-2507 architecture, featuring a substantial 32768 token context length.

Key Capabilities & Performance

This model demonstrates notable improvements over its base model across several critical benchmarks, indicating enhanced reasoning and instruction-following abilities:

GSM8K (Mathematical Reasoning): Achieves 84.23% (strict) and 85.60% (flexible), significantly outperforming the base model's 73.24% and 79.15% respectively.
IFEval (Instruction Following): Shows strong performance with 85.77% (prompt strict) and 90.29% (instruction strict).
ARC-Challenge (Common Sense Reasoning): Improved to 52.13% (acc_norm) from 43.09%.
MMLU (General Knowledge): Reaches 65.76%, up from 61.89%.

Overall, kosa-4B-it-v1 achieves an average benchmark score of 77.30%, a substantial increase from the base model's 71.56%. These evaluations were conducted under identical settings using lm-evaluation-harness 0.4.12, vLLM, and bfloat16, with rigorous training data verification against benchmark test sets.

Usage & Availability

The model is readily available for use with the Hugging Face transformers library. GGUF quantizations (Q4_K_M, Q5_K_M, Q8_0) are also provided for efficient local deployment.

Overview

kosa-4B-it-v1: Enhanced Instruction-Tuned Model

Key Capabilities & Performance

Usage & Availability

Full Model Card (README)