OzdowntheR/Qwen3-0.6B-SciGen-SLERP is a 0.8-billion-parameter Qwen3-based model created by OzdowntheR via a SLERP merge of two domain-specialized Qwen3-0.6B experts, fine-tuned on physics and general chat data respectively. The merge averages a +5.9% improvement over the base model on the HF Open LLM Leaderboard v1 task suite while suppressing the base Qwen3's explicit thinking behavior, yielding faster, more direct responses suited to latency-sensitive applications where verbose chain-of-thought is not required.
Model Overview
OzdowntheR/Qwen3-0.6B-SciGen-SLERP is a 0.8-billion-parameter model based on the Qwen3 architecture. OzdowntheR created it through a SLERP (Spherical Linear Interpolation) merge of two fully fine-tuned Qwen3-0.6B expert models: one specialized in physics (trained on camel-ai/physics) and one in general chat (trained on HuggingFaceH4/ultrachat_200k). The merge used an interpolation factor of t=0.3, weighting the physics expert at roughly 70% and the chat expert at 30%.
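SLERP interpolates along the arc between two weight tensors rather than the straight line of a plain weighted average. A minimal per-tensor sketch is below; a real merge (e.g. with a tool like mergekit) applies this tensor-by-tensor across both models' state dicts and handles details this card does not describe:

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate t follows the arc on
    the hypersphere instead of the chord used by linear interpolation.
    """
    a, b = v0.ravel(), v1.ravel()
    # Angle between the flattened tensors.
    cos_omega = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    cos_omega = np.clip(cos_omega, -1.0, 1.0)
    omega = np.arccos(cos_omega)
    if omega < eps:
        # Nearly parallel tensors: fall back to linear interpolation.
        return (1 - t) * v0 + t * v1
    sin_omega = np.sin(omega)
    return (np.sin((1 - t) * omega) / sin_omega) * v0 + \
           (np.sin(t * omega) / sin_omega) * v1

# t=0.3 keeps the physics expert (v0) dominant: roughly 70/30
# toward the chat expert (v1). Toy 2-D "weights" for illustration.
physics_w = np.array([1.0, 0.0])
chat_w = np.array([0.0, 1.0])
merged = slerp(0.3, physics_w, chat_w)
```

For unit-norm inputs like the toy vectors above, the result stays on the unit circle, which is the property that distinguishes SLERP from a linear average of weights.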
Key Capabilities & Performance
- Improved v1 Leaderboard Performance: Averages +5.9% over the base Qwen3-0.6B across the HF Open LLM Leaderboard v1 task suite (ARC, HellaSwag, BoolQ, PIQA, Winogrande), including a notable +10% gain on BoolQ.
- Faster, More Direct Responses: The merge suppresses the base Qwen3's explicit thinking behavior, leading to significantly faster and more concise outputs (3-7x fewer tokens in qualitative tests).
- Mixed v2 Leaderboard Results: Shows gains on IFEval and MuSR but regresses on BBH, MMLU-Pro, and MATH Hard, largely because explicit reasoning is suppressed.
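The headline +5.9% figure above is presumably a macro-average of per-task deltas. A minimal sketch of that arithmetic, using hypothetical placeholder scores rather than the card's actual evaluation numbers:

```python
# Hypothetical per-task accuracies (illustrative placeholders only,
# NOT this card's real evaluation results).
base   = {"ARC": 40.0, "HellaSwag": 55.0, "BoolQ": 60.0, "PIQA": 68.0, "Winogrande": 57.0}
merged = {"ARC": 44.0, "HellaSwag": 58.0, "BoolQ": 70.0, "PIQA": 71.0, "Winogrande": 61.0}

# Per-task delta, then an unweighted (macro) average over the five tasks.
deltas = {task: merged[task] - base[task] for task in base}
avg_improvement = sum(deltas.values()) / len(deltas)
```

Each task counts equally in the average, so a large single-task gain (like the BoolQ jump here) lifts the headline number noticeably.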
Ideal Use Cases
- Research and Experimentation: Primarily intended for exploring model merging techniques and their effects on specialized domains.
- Latency-Sensitive Applications: Suitable for scenarios where quick, direct answers are prioritized over verbose, step-by-step reasoning.
- Factual Q&A: Benefits from the physics expert's training, showing strong performance on factual question-answering tasks such as BoolQ.
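A minimal inference sketch with the Hugging Face `transformers` library for the direct-answer use case; the helper names and generation settings below are illustrative choices, not values specified by the author:

```python
def build_messages(question: str) -> list:
    """Wrap a user question in the chat format expected by
    tokenizer.apply_chat_template."""
    return [{"role": "user", "content": question}]

def ask(question: str,
        model_id: str = "OzdowntheR/Qwen3-0.6B-SciGen-SLERP") -> str:
    """Generate a short, direct answer from the merged model."""
    # Imported lazily so build_messages stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    prompt = tokenizer.apply_chat_template(
        build_messages(question),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    # Small token budget: the merge favors concise answers over
    # long chains of thought, so little headroom is needed.
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1]:],
        skip_special_tokens=True,
    )

# Example call (downloads the model on first use):
# ask("Is the speed of light constant in a vacuum?")
```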
Limitations
- Suppressed Reasoning: Not recommended for tasks requiring complex, multi-step reasoning or explicit chain-of-thought, as the model tends to skip these processes.
- Math Regression: Exhibits a significant regression in mathematical problem-solving compared to the base model.
- Small Scale: As a 0.6B parameter model, its absolute performance on hard reasoning benchmarks remains low.