pixas/Miner-8B
Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Apr 9, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

pixas/Miner-8B is an 8-billion-parameter reasoning model developed by pixas and trained with the MINER reinforcement learning method. MINER improves the data efficiency of large reasoning models by using the model's intrinsic uncertainty as a self-supervised reward signal. It is designed to improve performance on reasoning and problem-solving tasks, particularly where standard RL methods are inefficient because of homogeneous positive prompts. Training incorporates token-level focal credit assignment and adaptive advantage calibration to achieve stronger sample efficiency and accuracy on reasoning benchmarks.
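
The inefficiency on homogeneous positive prompts can be illustrated with a minimal sketch. It assumes a group-relative advantage of the kind used in GRPO-style training, which this description does not itself specify; the function name `group_relative_advantages` and the reward values are hypothetical and are not taken from the MINER implementation. Under that assumption, a prompt whose sampled responses all receive the same reward yields zero advantage for every response, so it contributes no learning signal.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its own group.

    Assumption: a GRPO-style baseline, used here only for illustration.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# A prompt with mixed outcomes: correct responses get positive advantages,
# incorrect ones get negative advantages.
mixed = group_relative_advantages([1.0, 0.0, 1.0, 0.0])

# A homogeneous positive prompt: every response is correct, so every
# advantage is exactly zero and the prompt is effectively wasted.
all_correct = group_relative_advantages([1.0, 1.0, 1.0, 1.0])

print(mixed)
print(all_correct)
```

This is the gap that an additional reward signal, such as the intrinsic-uncertainty signal described above, is meant to fill; the sketch shows only the problem, not MINER's solution.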
