Supertron1-4B: An Efficient Instruction-Tuned Language Model
Supertron1-4B, developed by Surpem, is a 4-billion-parameter instruction-tuned causal language model based on Qwen3-4B. It is engineered to be a reliable, efficient daily driver, offering robust performance across a variety of tasks, including math, coding, reasoning, and general conversation. Its lightweight design allows it to run effectively on consumer hardware.
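As a sketch of typical usage, the model can be loaded with Hugging Face `transformers` like any Qwen3-based checkpoint. The repo id `Surpem/Supertron1-4B` and the ChatML prompt format are assumptions here (Qwen3 models conventionally use ChatML); prefer the tokenizer's own `apply_chat_template` when available.

```python
def build_chat(messages):
    """Format a message list in ChatML style (assumed prompt format for
    Qwen3-derived models); tokenizer.apply_chat_template is preferred
    when the tokenizer ships a template."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "\n".join(parts)

if __name__ == "__main__":
    # Hypothetical hub path -- substitute the actual repository id.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Surpem/Supertron1-4B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    prompt = tokenizer.apply_chat_template(
        [{"role": "user", "content": "Write a Python function that reverses a string."}],
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

With `device_map="auto"` and `torch_dtype="auto"`, a 4B model in bf16 needs roughly 8 GB of memory, which is within reach of mid-range consumer GPUs.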
Key Capabilities
- Strong Benchmark Performance: Surpasses Mistral 7B across all core benchmarks despite having roughly half the parameters.
- Math and Coding Proficiency: Demonstrates strong performance on GSM8K (math) and HumanEval (coding) benchmarks, indicating focused tuning in these areas.
- Efficiency: Achieves results competitive with models like Phi-4 mini while using significantly fewer computational resources.
- Versatile Use: Capable across general conversation, reasoning, and specialized technical tasks.
Good For
- Applications requiring a capable yet efficient language model.
- Scenarios where strong math and coding abilities are crucial.
- Deployment on consumer-grade hardware, thanks to its compact size and efficient performance.
- Developers seeking a reliable daily driver for a broad range of instruction-following tasks.