Surpem/Supertron2-24B

Vision | Concurrency Cost: 2 | Model Size: 24B | Quant: FP8 | Ctx Length: 32k | Published: May 18, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights

Supertron2-24B by Surpem is a 24-billion-parameter instruction-tuned causal language model built on mistralai/Devstral-Small-2-24B-Instruct-2512, with a context length of 32,768 tokens. It is designed for practical coding assistance, structured reasoning, mathematical problem-solving, and scientific explanation. It also handles multi-step instruction following and general chat, which makes it useful to both developers and researchers.
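
The 32,768-token context length bounds the prompt and the generated output together. The following is a minimal sketch of a budget check, assuming that prompt and completion tokens share one window (the usual behavior for causal language models). The function names are hypothetical, and the token counts are assumed to come from the model's own tokenizer, which is not shown here.

```python
CONTEXT_LENGTH = 32_768  # context length stated for Supertron2-24B


def max_new_tokens_allowed(prompt_tokens: int,
                           context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt.

    Assumption: prompt and generated tokens count against the same window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Never report a negative budget when the prompt alone is too long.
    return max(context_length - prompt_tokens, 0)


def fits_in_context(prompt_tokens: int,
                    requested_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """Check whether prompt plus requested output fits in the window."""
    return prompt_tokens + requested_new_tokens <= context_length


if __name__ == "__main__":
    # A 30,000-token prompt leaves room for 2,768 generated tokens.
    print(max_new_tokens_allowed(30_000))
    print(fits_in_context(30_000, 4_096))
```

A check like this is typically run before sending a request, so that an oversized prompt is trimmed or summarized first.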


Supertron2-24B: Instruction-Tuned for Coding and Reasoning

Supertron2-24B is a 24-billion-parameter instruction-tuned language model developed by Surpem and built on mistralai/Devstral-Small-2-24B-Instruct-2512. It targets both technical and general tasks, with an emphasis on practical application and structured problem-solving.

Key Capabilities

  • Coding: Assists with writing, explaining, debugging, and reviewing code, including implementation planning and error analysis.
  • Reasoning: Handles multi-step questions, compares options, follows complex instructions, and gives concise answers.
  • Math & Science: Excels at arithmetic, algebra, word problems, and providing step-by-step mathematical explanations. It can also clarify scientific concepts and aid in STEM-related writing.
  • General Chat: Supports writing, brainstorming, summarization, planning, and answering everyday questions.

Intended Use Cases

  • Coding assistance and software engineering reasoning.
  • Mathematical and scientific problem-solving and explanations.
  • General instruction following, chat, and content generation (writing, summarization, brainstorming).
  • Research and technical assistance.

Hardware Requirements

  • bfloat16: Minimum 48 GB VRAM, recommended 80 GB+.
  • 4-bit quantized: Minimum 16 GB VRAM, recommended 24 GB+.
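
The figures above can be sanity-checked with a rough weight-memory estimate: parameter count times bits per parameter. The sketch below assumes a nominal 24 billion parameters and decimal gigabytes (1 GB = 10^9 bytes). It counts weights only, so the KV cache, activations, and framework overhead come on top. The function name is hypothetical.

```python
def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Estimate memory for model weights alone, in decimal GB.

    Assumption: every parameter is stored at the same precision; real
    quantized checkpoints often keep some tensors at higher precision.
    """
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


NOMINAL_PARAMS = 24_000_000_000  # "24B" as stated; exact count may differ

if __name__ == "__main__":
    for name, bits in [("bfloat16", 16), ("FP8", 8), ("4-bit", 4)]:
        print(f"{name}: {weight_memory_gb(NOMINAL_PARAMS, bits):.1f} GB")
```

Under these assumptions, bfloat16 weights alone take about 48 GB and 4-bit weights about 12 GB. This is consistent with the stated minimums and suggests why the recommended figures leave additional headroom.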

The model may produce errors or outdated information, so it should not be relied on as the sole source for critical decisions.