HallD/SkeptiSTEM-4B-stageR1-merged-16bit

  • Type: Text generation
  • Model size: 4B parameters
  • Quantization: BF16
  • Context length: 32k
  • Concurrency cost: 1
  • Published: Dec 17, 2025
  • License: apache-2.0
  • Architecture: Transformer (open weights)

HallD/SkeptiSTEM-4B-stageR1-merged-16bit is a 4-billion-parameter language model developed by HallD: a merged 16-bit (BF16) checkpoint from the SkeptiSTEM-4B series. This version is the result of Stage R1 STEM Supervised Fine-Tuning (SFT), which specializes it for STEM tasks. With a 40,960-token context length, it is suited to applications that require deep understanding and generation in scientific, technological, engineering, and mathematical domains.

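If the merged checkpoint follows the standard transformers repository layout (an assumption; check the repository files), it should load with the usual Auto classes. A minimal sketch:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id taken from this card; everything else is standard transformers usage.
model_id = "HallD/SkeptiSTEM-4B-stageR1-merged-16bit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 merged weights
    device_map="auto",           # requires `accelerate`; drop to load on CPU
)
```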

SkeptiSTEM-4B-stageR1-merged-16bit Overview

This model, developed by HallD, is a 4-billion-parameter language model designed for STEM (Science, Technology, Engineering, and Mathematics) applications. It is a merged 16-bit checkpoint from the SkeptiSTEM-4B series that has undergone Stage R1 Supervised Fine-Tuning (SFT) on STEM-focused data.

Key Capabilities

  • STEM Specialization: Fine-tuned (SFT) on STEM datasets, which should improve performance in scientific and technical domains.
  • Parameter Efficiency: At 4 billion parameters, it balances capability against computational cost.
  • Extended Context Window: A 40,960-token context length makes it suitable for lengthy technical documents, code, or complex problem descriptions (see the generation sketch after this list).
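
Continuing from the loading sketch above, here is a hedged generation example. It assumes the tokenizer ships a chat template, which is common for instruction-tuned checkpoints but is not confirmed by this card; the prompt and sampling settings are illustrative.

```python
# Assumes `model` and `tokenizer` from the loading sketch above.
messages = [
    {"role": "user",
     "content": "Explain why the escape velocity from Earth's surface is about 11.2 km/s."},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=512,  # illustrative; raise for long derivations
    do_sample=True,
    temperature=0.7,     # illustrative sampling settings
)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```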

Good For

  • Scientific Research: Assisting with literature review, hypothesis generation, or data interpretation in STEM fields.
  • Technical Documentation: Generating or summarizing technical reports, manuals, and specifications (see the pipeline sketch after this list).
  • Educational Tools: Developing AI tutors or content generators for STEM education.
  • Code-Related Tasks: Potentially useful for understanding or generating code snippets, given its STEM focus.
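
For quick experiments with use cases like the summarization one above, the transformers pipeline API keeps things to a few lines. The prompt format and parameters below are illustrative assumptions, not something this card prescribes:

```python
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="HallD/SkeptiSTEM-4B-stageR1-merged-16bit",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

report = "..."  # paste the technical text to summarize here
prompt = f"Summarize the following specification in three bullet points:\n\n{report}"
result = generator(prompt, max_new_tokens=256, return_full_text=False)
print(result[0]["generated_text"])
```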