benfielding/Qwen2.5-1.5B-Instruct-Gensyn-Swarm-flightless_skittish_wildebeest

Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 4, 2025 · Architecture: Transformer

benfielding/Qwen2.5-1.5B-Instruct-Gensyn-Swarm-flightless_skittish_wildebeest is a 1.5-billion-parameter instruction-tuned causal language model, fine-tuned from Gensyn/Qwen2.5-1.5B-Instruct. It was trained with GRPO, a reinforcement-learning method designed to enhance mathematical reasoning, making it suitable for tasks that require logical and mathematical problem-solving from a compact model.


Model Overview

This model, benfielding/Qwen2.5-1.5B-Instruct-Gensyn-Swarm-flightless_skittish_wildebeest, is a 1.5-billion-parameter instruction-tuned language model. It is built on the Gensyn/Qwen2.5-1.5B-Instruct base model and has been further fine-tuned with GRPO using the TRL library.

Key Capabilities & Training

  • Fine-tuned Base Model: Derived from Gensyn/Qwen2.5-1.5B-Instruct, indicating a foundation in general instruction following.
  • GRPO Training Method: A significant differentiator is its training with GRPO (Group Relative Policy Optimization), a reinforcement-learning method introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models". This suggests an optimization for mathematical reasoning tasks.
  • TRL Framework: The fine-tuning process used the TRL (Transformer Reinforcement Learning) library, a common framework for advanced LLM post-training; a minimal GRPO sketch follows this list.
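
For orientation, the snippet below sketches what a GRPO fine-tuning run with TRL typically looks like. It is a minimal sketch, not the actual training recipe for this model: the dataset, the toy reward function, and the hyperparameters are placeholders.

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Placeholder dataset: any dataset with a "prompt" column works with GRPOTrainer.
dataset = load_dataset("trl-lib/tldr", split="train")

# Toy reward function (illustrative only): GRPO scores groups of sampled
# completions per prompt; here we reward completions close to 200 characters.
def reward_len(completions, **kwargs):
    return [-abs(200 - len(completion)) for completion in completions]

training_args = GRPOConfig(output_dir="qwen2.5-1.5b-grpo")
trainer = GRPOTrainer(
    model="Gensyn/Qwen2.5-1.5B-Instruct",  # the base model named in this card
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```

In practice, a math-focused run like the one described here would replace the toy reward with a verifier that checks final answers, which is the setup GRPO was designed for in DeepSeekMath.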

Use Cases

  • Mathematical Reasoning: Given its GRPO training, this model is particularly well-suited for applications requiring enhanced mathematical problem-solving and logical deduction.
  • Instruction Following: As an instruction-tuned model, it responds effectively to a wide range of user prompts and instructions; a minimal inference sketch follows this list.
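
For inference, the model can be loaded with the standard transformers API for Qwen2.5-style instruct models. The snippet below is a sketch: the repository ID comes from this card, while the prompt and generation settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "benfielding/Qwen2.5-1.5B-Instruct-Gensyn-Swarm-flightless_skittish_wildebeest"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Illustrative math prompt; the chat template is inherited from Qwen2.5-Instruct.
messages = [{"role": "user", "content": "What is 17 * 24? Think step by step."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```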

Technical Details

  • Parameter Count: 1.5 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a context window of 32,768 tokens; a quick config check follows this list.
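
The advertised limits can be sanity-checked against the repository's config, assuming it follows the standard Qwen2.5 layout:

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "benfielding/Qwen2.5-1.5B-Instruct-Gensyn-Swarm-flightless_skittish_wildebeest"
)
print(config.max_position_embeddings)  # card advertises a 32,768-token context
print(config.torch_dtype)              # card advertises BF16 weights
```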