pixas/Miner-4B
TEXT GENERATION | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Apr 9, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold
Miner-4B is a 4-billion-parameter reasoning model developed by pixas and trained with the MINER reinforcement learning (RL) method. MINER improves the data efficiency of RL training for large reasoning models by using intrinsic uncertainty as a self-supervised reward signal. It targets reasoning and problem-solving tasks, particularly scenarios where standard RL methods are inefficient. The model is intended for research and experimental use in areas such as mathematical reasoning and RL for language models.
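To make the phrase "intrinsic uncertainty as a self-supervised reward signal" more concrete, here is a minimal sketch of one generic way an uncertainty score could be derived from a model's next-token distributions and folded into a reward. This is not the MINER algorithm, whose details are not given in this card. The entropy-based measure, the function names, and the `beta` weight are all illustrative assumptions.

```python
import math
from typing import List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a single next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sequence_uncertainty(step_probs: List[Sequence[float]]) -> float:
    """Mean per-token entropy over a generated sequence (assumed uncertainty measure)."""
    if not step_probs:
        return 0.0
    return sum(token_entropy(p) for p in step_probs) / len(step_probs)


def shaped_reward(task_reward: float, uncertainty: float, beta: float = 0.1) -> float:
    """Hypothetical shaping rule: task reward plus a scaled uncertainty term.

    `beta` is an assumed hyperparameter, not a value taken from MINER.
    """
    return task_reward + beta * uncertainty


if __name__ == "__main__":
    # Two generation steps: one maximally uncertain over two tokens, one fully confident.
    steps = [[0.5, 0.5], [1.0, 0.0]]
    u = sequence_uncertainty(steps)
    print(f"uncertainty = {u:.4f}")
    print(f"shaped reward = {shaped_reward(1.0, u):.4f}")
```

The point of the sketch is only that such a signal needs no external labels: it is computed from the model's own output distributions, which is what makes it self-supervised.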