Neira/Qwen2.5-0.5B_muon_v2
Neira/Qwen2.5-0.5B_muon_v2 is a 0.5-billion-parameter language model fine-tuned from the Qwen2.5-0.5B architecture. It was trained with the Muon optimizer at a learning rate of 5e-05 and a context length of 32768 tokens. The available documentation does not specify the fine-tuning dataset, intended uses, or primary differentiators.
Model Overview
Neira/Qwen2.5-0.5B_muon_v2 is a fine-tuned variant of the Qwen/Qwen2.5-0.5B base model, with 0.5 billion parameters and a context length of 32768 tokens. It was fine-tuned for 1 epoch with the Muon optimizer at a learning rate of 5e-05.
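The card does not include a usage snippet. As a minimal sketch, the model should load with the standard Hugging Face transformers pattern for causal language models, assuming the repository is available on the Hub; the prompt string below is purely illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Neira/Qwen2.5-0.5B_muon_v2"

# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Run a short greedy generation as a smoke test.
inputs = tokenizer("The Muon optimizer is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```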
Training Details
Training used a train_batch_size of 4, an eval_batch_size of 16, and gradient_accumulation_steps of 8, giving a total_train_batch_size of 32. The learning rate followed a cosine schedule with a warmup ratio of 0.01. The training environment included Transformers 5.5.4, Pytorch 2.10.0+cu128, Datasets 4.8.3, and Tokenizers 0.22.2.
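The training script itself is not published. As a hedged reconstruction, the reported hyperparameters map onto Hugging Face TrainingArguments roughly as follows; the output path is hypothetical, and since Muon is not a built-in Trainer optimizer, it would have to be constructed separately (see the closing comment):

```python
from transformers import TrainingArguments

# Hedged reconstruction of the reported hyperparameters; the actual
# training script and dataset are not published, so everything beyond
# the listed values is an assumption.
args = TrainingArguments(
    output_dir="qwen2.5-0.5b-muon-v2",  # hypothetical output path
    per_device_train_batch_size=4,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=8,      # 4 x 8 = total_train_batch_size of 32
    learning_rate=5e-5,
    num_train_epochs=1.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
)

# Muon is not a built-in Trainer optimizer option, so a separately
# constructed Muon instance (e.g. from a third-party implementation)
# would be passed to Trainer via its `optimizers=(optimizer, scheduler)`
# argument.
```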
Limitations and Further Information
The available documentation does not specify the fine-tuning dataset, nor does it describe the model's intended uses, limitations, or performance characteristics. Without this information, the model's suitability for any particular application remains undefined.