Overview
KeiKurono/qwen3-scientific is a 1.7-billion-parameter model based on Qwen3, fine-tuned to act as a rigorous scientific assistant. Unlike many LLMs that are RLHF-trained to please users, it is engineered to prioritize factual correctness over agreeableness: it is an honest, direct scientific partner that challenges false premises and pushes back on incorrect user statements.
Key Capabilities
- Rigorous Scientific Reasoning: Prioritizes accuracy and provides direct, factual responses.
- Anti-Sycophancy: Trained to avoid agreeable-but-wrong answers and to challenge false claims; sycophantic preference labels were deliberately flipped during fine-tuning.
- Direct Communication: Communicates clearly and directly, without filler phrases, especially when correcting user misconceptions.
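A minimal inference sketch using the standard Hugging Face `transformers` API is shown below. The model id is taken from this card; the prompt, generation settings, and dtype/device options are illustrative assumptions, not prescribed by the card.

```python
# Sketch: chat-style inference with transformers (settings are assumptions).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KeiKurono/qwen3-scientific"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# A prompt containing a false premise, to exercise the anti-sycophancy behavior.
messages = [{"role": "user", "content": "Since glass is a slow-moving liquid, old windows are thicker at the bottom, right?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

A well-behaved response should directly correct the premise (glass is an amorphous solid) rather than agree with it.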
Training Details
The model was fine-tuned using QLoRA (r=16, lora_alpha=32) on a Qwen/Qwen3-1.7B base model. Training involved 2 epochs over approximately 4.5 hours, achieving a final evaluation loss of 0.6786 and a token accuracy of 82.79%. Key datasets included ScienceQA (text only), filtered scientific/technical content from OpenHermes 2.5 with sycophantic responses removed, Anthropic HH-RLHF with flipped sycophantic labels, and TruthfulQA to penalize 'sounds right' over 'is right' answers.
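The QLoRA hyperparameters stated above can be expressed as a PEFT configuration fragment. Only `r` and `lora_alpha` come from this card; the target modules, dropout, and task type are illustrative assumptions.

```python
# Sketch of a LoRA configuration matching the stated hyperparameters
# (r=16, lora_alpha=32). Other fields are assumptions, not from the card.
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                 # LoRA rank, as stated in the training details
    lora_alpha=32,        # scaling factor, as stated in the training details
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    lora_dropout=0.05,    # assumed
    task_type="CAUSAL_LM",
)
```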
Good For
- Applications requiring a highly factual and critical scientific assistant.
- Scenarios where challenging user assumptions for accuracy is beneficial.
- Educational tools that need to provide direct, unvarnished scientific truth.
Limitations
Due to its 1.7B parameter size, the model may hallucinate on highly specialized topics. It is text-only and lacks vision capabilities. Its anti-sycophancy training is SFT-only (no preference optimization), so some flattering, agreeable responses may still slip through.