vigilantETH/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-mangy_knobby_tuna

Text Generation · Model Size: 0.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 15, 2025 · Architecture: Transformer

vigilantETH/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-mangy_knobby_tuna is a 0.5 billion parameter instruction-tuned causal language model, fine-tuned from Gensyn/Qwen2.5-0.5B-Instruct. This model was trained using the TRL framework and incorporates the GRPO method, which is designed to enhance mathematical reasoning capabilities. With a substantial context length of 131072 tokens, it is optimized for tasks requiring deep contextual understanding and improved mathematical problem-solving.


Overview

vigilantETH/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-mangy_knobby_tuna is a 0.5 billion parameter instruction-tuned model, building upon the Gensyn/Qwen2.5-0.5B-Instruct base. It leverages the TRL (Transformer Reinforcement Learning) framework for its fine-tuning process.

Key Capabilities

  • Enhanced Mathematical Reasoning: This model was trained using the GRPO (Group Relative Policy Optimization) method, as introduced in the DeepSeekMath paper, specifically to improve its mathematical reasoning abilities.
  • Instruction Following: As an instruction-tuned model, it is designed to understand and execute user prompts effectively.
  • Large Context Window: Features a context length of 131,072 tokens, allowing it to process long inputs and maintain coherence across extended generations.
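For reference, the model can be loaded and queried with the standard Transformers auto classes; the sketch below is illustrative (the math prompt and generation settings are assumptions, not recommendations from the card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vigilantETH/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-mangy_knobby_tuna"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

# Build a chat-formatted prompt; the question is illustrative.
messages = [{"role": "user", "content": "If 3x + 5 = 20, what is x?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
))
```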

Training Details

The model's training incorporated GRPO, a technique detailed in the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). The fine-tuning was performed using TRL version 0.15.2, with Transformers 4.51.3 and PyTorch 2.6.0.
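The card does not document the exact reward setup, but a GRPO fine-tune with TRL 0.15.x typically pairs `GRPOTrainer` with one or more reward functions scored per completion. The sketch below shows a hypothetical correctness reward for math answers; the function name, regex, and `answer` dataset column are assumptions, not details from this model's training run:

```python
import re

def correctness_reward(completions, answer, **kwargs):
    """Hypothetical GRPO reward: 1.0 when the last number in a
    completion matches the reference answer, else 0.0. TRL passes
    extra dataset columns (here, `answer`) through as kwargs."""
    rewards = []
    for completion, ref in zip(completions, answer):
        numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
        rewards.append(1.0 if numbers and numbers[-1] == str(ref) else 0.0)
    return rewards

# Wiring into TRL's GRPOTrainer (API shape as of TRL 0.15.x);
# the base model and dataset here are illustrative:
#
# from trl import GRPOConfig, GRPOTrainer
# trainer = GRPOTrainer(
#     model="Gensyn/Qwen2.5-0.5B-Instruct",
#     reward_funcs=correctness_reward,
#     args=GRPOConfig(output_dir="qwen2.5-grpo"),
#     train_dataset=math_dataset,  # hypothetical prompt/answer dataset
# )
# trainer.train()
```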

Good For

  • Applications requiring strong mathematical reasoning.
  • Tasks benefiting from a large context window for processing extensive inputs or generating detailed outputs.
  • Instruction-following scenarios where a compact yet capable model is desired.