NovatasticRoScript/Atomight-2-1.5B-Thinking
NovatasticRoScript/Atomight-2-1.5B-Thinking is a 1.5 billion parameter small language model specifically engineered for deep reasoning and sequential logic chains. It utilizes an explicit internal ... scratchpad for transparent auditing of complex mathematical and logical prompts. Optimized for constrained hardware environments, this model excels in structured mathematical deduction and core reasoning tasks, making it suitable for applications requiring robust logical processing on consumer-grade hardware.
Loading preview...
Atomight-2-1.5B-Thinking: Deep Reasoning on Constrained Hardware
Atomight-2-1.5B-Thinking is a compact 1.5 billion parameter model from NovatasticRoScript, designed for advanced reasoning tasks even on limited hardware like free Google Colab T4 instances. Its core innovation is an explicit internal <think>...</think> scratchpad, which allows the model to dynamically break down complex prompts and show its reasoning process before providing a final answer. This structured approach makes its logic transparent and auditable.
Key Capabilities
- Hardware Democratic: Delivers deep reasoning capabilities on consumer-grade hardware and free cloud compute tiers.
- Structured Scratchpad: Generates visible, native reasoning pathways, ideal for auditing and understanding the model's thought process.
- Chat-Template Native: Fully optimized for ChatML system configurations, ensuring clean deployment and inference.
Performance Highlights
Atomight-2-1.5B-Thinking demonstrates exceptional specialization in structured mathematical deduction and core reasoning. It achieves 80.1% on GSM8k (Math Logical Chains) and 88.5% on ARC-C (Core Reasoning), outperforming several larger models in its class, including Qwen-2-1.5B-Instruct and Llama-3.2-3B-Instruct, and closely rivaling Phi-3-mini (3.8B) in these specific areas. Its MMLU score is 63.2%.
Important Considerations
While highly specialized in textual logic and mathematical proofs, the model exhibits a known limitation in abstract visual transformation tasks (e.g., ARC-AGI 2), scoring 0.00%. This indicates a current cognitive bottleneck in translating spatial imagery into basic structural text tokens, an area targeted for future architectural improvements.
Good for
- Applications requiring robust logical and mathematical reasoning on resource-constrained devices.
- Use cases where transparent, auditable reasoning steps are crucial.
- Developers working within ChatML frameworks who need a specialized reasoning model.