babycielou/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-scampering_thick_alpaca
Text Generation | Concurrency Cost: 1 | Model Size: 0.5B | Quant: BF16 | Ctx Length: 32k | Published: May 9, 2025 | Architecture: Transformer | Warm

babycielou/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-scampering_thick_alpaca is a 0.5-billion-parameter, instruction-tuned causal language model fine-tuned from unsloth/Qwen2.5-0.5B-Instruct. It was trained with the TRL framework using GRPO (Group Relative Policy Optimization), a reinforcement learning method introduced to improve mathematical reasoning. The model targets tasks that call for structured reasoning and precise answers, particularly in mathematics.
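To make the GRPO step more concrete, the sketch below shows the group-relative advantage calculation at the heart of the method: several completions are sampled for one prompt, each is scored, and each score is normalized against its own group. This is a minimal illustration, not this model's training code. The function name `group_relative_advantages`, the `eps` value, the use of population standard deviation, and the example rewards are all assumptions; the reward function and group size actually used for this model are not stated here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its own group.

    Hypothetical helper for illustration. `eps` guards against division
    by zero when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four completions sampled from one prompt,
# e.g. 1.0 when the final answer is correct and 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

for reward, advantage in zip(rewards, advantages):
    print(f"reward={reward:.1f}  advantage={advantage:+.3f}")
```

Completions that score above their group's average get a positive advantage and are reinforced, while those below get a negative one. Because the baseline comes from the group itself, no separate value model is needed.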
