Model Overview
HeisenbergQ-0.5B-RL, developed by khazarai, is a specialized 0.5 billion parameter language model. It is a fine-tuned version of Qwen2.5-0.5B-Instruct, optimized specifically for quantum physics reasoning. The model leverages GRPO (Group Relative Policy Optimization) with custom reward functions to enhance its performance in this domain.
Key Capabilities
- Quantum Physics Problem Solving: Designed to solve and reason through complex quantum physics problems.
- Structured Output: Produces answers in a specific XML format with <reasoning> and <answer> tags, facilitating clear, step-by-step logical reasoning.
- Scientific Reasoning: Excels at general scientific reasoning in mathematics and physics contexts.
- Lightweight: Its 0.5B parameter size makes it a lightweight option for specialized tasks.
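Because the model emits its output inside <reasoning> and <answer> tags, completions can be consumed programmatically. Below is a minimal parsing sketch; the tag layout comes from this card, while the helper name and the sample completion are hypothetical:

```python
import re

def parse_response(text):
    """Extract the <reasoning> and <answer> blocks from a completion.

    Returns (reasoning, answer); either element is None if its tag pair
    is missing from the text.
    """
    def grab(tag):
        m = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
        return m.group(1).strip() if m else None
    return grab("reasoning"), grab("answer")

# Hypothetical completion in the format described above.
completion = (
    "<reasoning>The ground-state energy of the quantum harmonic "
    "oscillator is half a quantum of the oscillation energy.</reasoning>\n"
    "<answer>E_0 = (1/2) * hbar * omega</answer>"
)
reasoning, answer = parse_response(completion)
```

Returning `None` for missing tags (rather than raising) makes the parser safe to run on malformed completions, which a 0.5B model will occasionally produce.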
Training Details
The model was fine-tuned using GRPO with LoRA on the jilp00/YouToks-Instruct-Quantum-Physics-II dataset. Its training incorporated custom reward functions:
- Reasoning Quality Reward: Encourages logical markers and coherent chains of thought.
- Token Count Reward: Prevents overly verbose or sparse explanations.
- XML Reward: Strictly enforces the <reasoning> / <answer> output format.
- Soft Format Reward: Ensures robust handling of formatting edge cases.
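To illustrate how such format rewards can score completions, here is a minimal sketch. The reward names match the list above, but the card does not document the actual reward shapes, so the regexes, weights, and token thresholds below are all hypothetical:

```python
import re

# Strict layout: a <reasoning> block followed by an <answer> block,
# each on its own lines (assumed layout, not taken from the card).
STRICT_XML = re.compile(
    r"^<reasoning>\n.*?\n</reasoning>\n<answer>\n.*?\n</answer>\n?$",
    re.DOTALL,
)

def xml_reward(completion: str) -> float:
    # Full score only when the completion matches the strict layout exactly.
    return 1.0 if STRICT_XML.match(completion) else 0.0

def soft_format_reward(completion: str) -> float:
    # Partial credit when each tag pair is present, regardless of spacing.
    has_reasoning = "<reasoning>" in completion and "</reasoning>" in completion
    has_answer = "<answer>" in completion and "</answer>" in completion
    return 0.5 * has_reasoning + 0.5 * has_answer

def token_count_reward(completion: str, lo: int = 40, hi: int = 300) -> float:
    # Discourage overly sparse or verbose explanations
    # (word count as a crude proxy for token count).
    n = len(completion.split())
    return 1.0 if lo <= n <= hi else 0.0
```

Pairing a strict reward with a soft one is a common trick in GRPO training: the soft reward gives the policy a gradient toward the format early on, while the strict reward only pays out once the layout is exactly right.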
Limitations
Because it was trained on only about 1,000 specialized samples, the model may hallucinate outside the physics domain. Its small parameter count, while keeping it lightweight, also limits its reasoning depth compared to much larger general-purpose models.