MegaScience/Qwen3-4B-MegaScience

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Jul 18, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

Qwen3-4B-MegaScience is a 4 billion parameter Qwen3 series large language model developed by MegaScience, specifically fine-tuned for scientific reasoning tasks. This model leverages the MegaScience dataset, which includes 650k reasoning questions from 12k university-level scientific textbooks across 7 disciplines, to significantly enhance performance in scientific domains. It is optimized for accurate and efficient scientific problem-solving, outperforming general instruct models in average scientific performance.

Loading preview...

MegaScience/Qwen3-4B-MegaScience: Scientific Reasoning LLM

Qwen3-4B-MegaScience is a 4 billion parameter model from the Qwen3 series, specifically fine-tuned for advanced scientific reasoning. Developed by MegaScience, this model addresses the gap in open-source, high-quality scientific reasoning datasets by utilizing the proprietary MegaScience dataset. This dataset comprises 1.25 million instances, including 650,000 reasoning questions derived from 12,000 university-level scientific textbooks across seven disciplines.

Key Capabilities & Differentiators

  • Specialized Scientific Reasoning: Excels in complex scientific problem-solving, outperforming general-purpose instruct models in scientific benchmarks.
  • High-Quality Training Data: Trained on the MegaScience dataset, which features verifiable reference answers and systematic data selection methodologies for optimal subset identification.
  • Comprehensive Evaluation: Evaluated against a robust system covering diverse subjects and question types across 15 benchmarks, ensuring accurate performance metrics.
  • Improved Efficiency: Achieves superior performance and training efficiency with more concise response lengths compared to existing open-source scientific datasets.
  • Scaling Benefits: Demonstrates greater effectiveness with larger and stronger base models, indicating a scaling advantage for scientific tuning.

Ideal Use Cases

  • Scientific Research Assistance: Supporting human researchers with complex scientific inquiries and problem-solving.
  • Educational Tools: Developing AI scientists or advanced learning tools for university-level science education.
  • Domain-Specific Applications: Applications requiring deep understanding and reasoning within natural science domains.

This model is a result of extensive research into post-training datasets for science reasoning, detailed in the paper "MegaScience: Pushing the Frontiers of Post-Training Datasets for Science Reasoning".