KeiKurono/qwen3-scientific

Text Generation · Concurrency Cost: 1 · Model Size: 2B · Quant: BF16 · Ctx Length: 32k · Published: Mar 22, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

KeiKurono/qwen3-scientific is a 1.7 billion parameter Qwen3-based language model fine-tuned as a rigorous scientific assistant. It prioritizes factual accuracy, challenges incorrect user claims, and avoids sycophantic responses, setting it apart from models optimized for user comfort. It is designed for scientific reasoning tasks where direct, honest, fact-driven interaction is paramount. Training data included ScienceQA and TruthfulQA, along with filtered OpenHermes 2.5 (sycophantic responses removed) and Anthropic HH-RLHF with flipped sycophancy labels.


Overview

KeiKurono/qwen3-scientific is a 1.7 billion parameter model based on Qwen3, fine-tuned specifically to act as a rigorous scientific assistant. Unlike many LLMs that are RLHF-trained to please users, it is engineered to prioritize factual correctness: it challenges false premises and pushes back on incorrect user statements rather than accommodating them, aiming to be an honest, direct scientific partner.
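A minimal usage sketch with the Hugging Face transformers library is shown below. It assumes the tokenizer ships with the standard Qwen3 chat template; the prompt and generation settings are illustrative, not documented defaults for this model.

```python
# Minimal usage sketch (assumes the standard Qwen3 chat template is bundled
# with the tokenizer; prompt and generation settings are illustrative).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KeiKurono/qwen3-scientific"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# A deliberately false premise, to exercise the anti-sycophancy training.
messages = [{
    "role": "user",
    "content": "Since heavier objects fall faster in a vacuum, how much "
               "sooner does a 10 kg ball land than a 1 kg ball?",
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

A model trained as described should correct the premise (in a vacuum, all masses fall at the same rate) rather than compute a bogus difference.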

Key Capabilities

  • Rigorous Scientific Reasoning: Prioritizes accuracy and provides direct, factual responses.
  • Anti-Sycophancy: Trained to avoid agreeable-but-wrong answers and to challenge false claims; sycophantic preference labels were deliberately flipped in parts of the training data.
  • Direct Communication: Communicates clearly and directly, without filler phrases, especially when correcting user misconceptions.

Training Details

The model was fine-tuned using QLoRA (r=16, lora_alpha=32) on the Qwen/Qwen3-1.7B base model. Training ran for 2 epochs over approximately 4.5 hours, reaching a final evaluation loss of 0.6786 and a token accuracy of 82.79%. Key datasets included ScienceQA (text only), scientific/technical content filtered from OpenHermes 2.5 with sycophantic responses removed, Anthropic HH-RLHF with flipped sycophancy labels, and TruthfulQA to penalize answers that merely sound right over answers that are right.
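For reference, a QLoRA setup matching the stated hyperparameters might look like the sketch below, using peft and bitsandbytes. Only r=16 and lora_alpha=32 come from this card; the target modules, dropout, and 4-bit quantization settings are assumptions.

```python
# Illustrative QLoRA configuration matching the reported r=16, lora_alpha=32.
# Target modules, dropout, and 4-bit settings are assumptions, not from the card.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # QLoRA: 4-bit quantized base weights
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-1.7B", quantization_config=bnb_config, device_map="auto"
)

lora_config = LoraConfig(
    r=16,                                   # reported LoRA rank
    lora_alpha=32,                          # reported scaling factor
    lora_dropout=0.05,                      # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()          # adapters train; base stays frozen
```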

Good For

  • Applications requiring a highly factual and critical scientific assistant.
  • Scenarios where challenging user assumptions for accuracy is beneficial.
  • Educational tools that need to provide direct, unvarnished scientific truth.

Limitations

Due to its 1.7B parameter size, the model may hallucinate on highly specialized topics. It is text-only, with no vision capabilities, and its anti-sycophancy training is SFT-only (no preference-optimization stage), so occasional flattering or overly agreeable responses may still slip through.