Surpem/Supertron1-4B
Text generation · Open weights · Model size: 4B · Quant: BF16 · Context length: 32k · Concurrency cost: 1 · Published: Apr 13, 2026 · License: apache-2.0 · Architecture: Transformer


Supertron1-4B: An Efficient Instruction-Tuned Language Model

Supertron1-4B, developed by Surpem, is a 4 billion parameter instruction-tuned causal language model based on Qwen3-4B. It is engineered to be a reliable and efficient daily driver, offering robust performance across math, coding, reasoning, and general conversation, and delivering results competitive with larger models in the 4-8B class. Its lightweight design allows it to run effectively on consumer hardware.
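As a quick orientation, the sketch below shows one plausible way to run the model with Hugging Face `transformers`. The repo id is taken from this page, and a Qwen3-style chat template is assumed since the model is built on Qwen3-4B; verify both against the published weights before relying on this.

```python
# Hedged usage sketch: basic chat inference via Hugging Face transformers.
# Assumes the weights live under the repo id shown on this page and that
# the tokenizer ships a chat template (expected, given the Qwen3-4B base).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Surpem/Supertron1-4B"  # assumed repo id, from this page

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 precision listed above
    device_map="auto",
)

messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```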

Key Capabilities

  • Strong Benchmark Performance: Surpasses Mistral 7B across all core benchmarks with roughly half as many parameters.
  • Math and Coding Proficiency: Performs strongly on GSM8K (math) and HumanEval (coding), indicating focused tuning in these areas (see the evaluation sketch after this list).
  • Efficiency: Achieves results competitive with models like Phi-4 mini while using significantly less compute.
  • Versatile Use: Capable across general conversation, reasoning, and specialized technical tasks.
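The benchmark claims above can be checked locally. The sketch below uses EleutherAI's lm-evaluation-harness (`lm_eval`) to score the model on GSM8K; the repo id is assumed from this page, and the 5-shot setting is a common convention rather than a configuration stated here, so treat this as a starting point and expect task names and scores to vary by harness version.

```python
# Hedged sketch: scoring the model on GSM8K with lm-evaluation-harness.
# Assumes `pip install lm-eval` and a GPU; repo id and few-shot count
# are assumptions, not settings taken from an official eval recipe.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=Surpem/Supertron1-4B,dtype=bfloat16",
    tasks=["gsm8k"],
    num_fewshot=5,
    batch_size=8,
)
print(results["results"]["gsm8k"])  # accuracy metrics for the task
```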

Good For

  • Applications requiring a capable yet efficient language model.
  • Scenarios where strong math and coding abilities are crucial.
  • Deployment on consumer-grade hardware, thanks to its compact size (see the quantized-loading sketch after this list).
  • Developers seeking a reliable daily driver for a broad range of instruction-following tasks.
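For the consumer-hardware case, a common option is loading the weights in 4-bit via `bitsandbytes`. The sketch below assumes a CUDA GPU, the `bitsandbytes` package, and the same repo id as above; it is one plausible setup, not an official deployment recipe.

```python
# Hedged sketch: 4-bit quantized loading for smaller consumer GPUs.
# Assumes `pip install bitsandbytes` and a CUDA device; repo id assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Surpem/Supertron1-4B"  # assumed repo id, from this page

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
# At 4-bit, the weights of a 4B-parameter model occupy roughly 2-3 GB of
# VRAM, which fits comfortably on common entry-level consumer GPUs.
```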