Gensyn/Qwen2.5-1.5B-Instruct

1.5B params · BF16 · 131,072 context · Apr 4, 2025 · License: apache-2.0
Overview

This model, Gensyn/Qwen2.5-1.5B-Instruct, is an unmodified copy of the instruction-tuned 1.54-billion-parameter Qwen2.5 model. It uses a transformer architecture with RoPE, SwiGLU, RMSNorm, attention QKV bias, and tied word embeddings. The model has 28 layers and uses grouped-query attention (GQA) with 12 query heads and 2 key/value heads, a full context length of 32,768 tokens, and generation of up to 8,192 tokens.
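Like the upstream Qwen2.5 chat models, this checkpoint expects prompts in the ChatML format. The sketch below is for illustration only — in practice `tokenizer.apply_chat_template` from `transformers` renders this format for you; the hand-rolled function here just shows what the template produces.

```python
# Illustrative sketch of the ChatML prompt format used by Qwen2.5 chat models.
# In real use, prefer tokenizer.apply_chat_template from transformers.

def format_chatml(messages):
    """Render a list of {role, content} dicts into a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to LLMs."},
])
print(prompt)
```

Each turn is wrapped in `<|im_start|>role ... <|im_end|>` markers, and the prompt ends with an open assistant header so generation continues as the assistant's turn.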

Key Purpose

The primary purpose of this release is integration into the Gensyn RL Swarm system, where the model is fine-tuned locally via peer-to-peer reinforcement learning post-training. After fine-tuning, the model can be used in standard inference and deployment workflows.

Integration and Usage

Users interested in deploying this model within a swarm or participating in the Gensyn Testnet should refer to the instructions provided in the RL Swarm repository. For general usage and detailed documentation of the original Qwen2.5 model, users are directed to the original model documentation and its Hugging Face repository.