Degandance/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-freckled_waddling_viper

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:0.5BQuant:BF16Ctx Length:32kPublished:Apr 8, 2025Architecture:Transformer Warm

Degandance/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-freckled_waddling_viper is a 0.5 billion parameter instruction-tuned causal language model, fine-tuned from unsloth/Qwen2.5-0.5B-Instruct. This model was trained using the GRPO method, which is designed to enhance mathematical reasoning capabilities. With a substantial context length of 131072 tokens, it is optimized for tasks requiring deep contextual understanding and robust mathematical problem-solving.

Loading preview...

Model Overview

Degandance/Qwen2.5-0.5B-Instruct-Gensyn-Swarm-freckled_waddling_viper is a 0.5 billion parameter instruction-tuned language model, building upon the unsloth/Qwen2.5-0.5B-Instruct base. It was fine-tuned using the TRL library, specifically incorporating the GRPO (Gradient-based Reward Policy Optimization) method.

Key Capabilities

  • Enhanced Mathematical Reasoning: The model's training with GRPO, a method introduced in the "DeepSeekMath" paper, suggests a focus on improving mathematical problem-solving and reasoning abilities.
  • Instruction Following: As an instruction-tuned model, it is designed to accurately interpret and execute user prompts and instructions.
  • Large Context Window: It supports a context length of 131072 tokens, allowing it to process and generate responses based on extensive input information.

Training Details

The model was trained using TRL version 0.18.2, with Transformers 4.52.4 and PyTorch 2.7.1. The GRPO method, detailed in the paper DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models, was central to its training procedure.

Good For

  • Applications requiring strong mathematical reasoning.
  • Tasks benefiting from a large context window for understanding complex instructions or long documents.
  • Instruction-following tasks where precise adherence to prompts is crucial.