Name: sathiiiii/polyalign-qwen2.5-1.5b-en-sft API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: sathiiiii

Model Overview

The sathiiiii/polyalign-qwen2.5-1.5b-en-sft model is a fine-tuned version of Qwen/Qwen2.5-1.5B, a language model with 1.5 billion parameters. It was trained with supervised fine-tuning (SFT) on the polyalign_train dataset, which suggests it is optimized for tasks that resemble that training data.

Training Details

The model was trained with a learning rate of 1e-05 and a total batch size of 64 (train_batch_size: 2 × gradient_accumulation_steps: 4 × 8 devices). The learning rate followed a cosine schedule with a warmup ratio of 0.1. Training ran for one epoch with the AdamW_TORCH_FUSED optimizer and Native AMP mixed precision. The model reached a loss of 1.4072 on the evaluation set.
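The arithmetic behind these hyperparameters can be checked with a short sketch. The Python below is illustrative only, not the author's training code: the function names are hypothetical, the schedule shape (linear warmup followed by cosine decay to zero) is an assumption about what "cosine scheduler with warmup" means here, and the perplexity conversion assumes the reported loss is a mean token-level cross-entropy in nats.

```python
import math

# Values quoted in the training details above.
PER_DEVICE_BATCH_SIZE = 2
GRADIENT_ACCUMULATION_STEPS = 4
NUM_DEVICES = 8
PEAK_LEARNING_RATE = 1e-5
WARMUP_RATIO = 0.1
EVAL_LOSS = 1.4072


def effective_batch_size(per_device, accumulation_steps, devices):
    """Number of samples that contribute to one optimizer step."""
    return per_device * accumulation_steps * devices


def cosine_lr(step, total_steps, peak_lr=PEAK_LEARNING_RATE,
              warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step`: linear warmup, then cosine decay to zero.

    Assumption: this mirrors the usual warmup-plus-cosine shape; the exact
    schedule used in training is not given in the source.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def perplexity(loss):
    """Perplexity, assuming `loss` is mean cross-entropy in nats."""
    return math.exp(loss)


if __name__ == "__main__":
    print(effective_batch_size(PER_DEVICE_BATCH_SIZE,
                               GRADIENT_ACCUMULATION_STEPS,
                               NUM_DEVICES))
    # total_steps=1000 is a made-up figure; the real step count is not stated.
    for step in (0, 100, 550, 1000):
        print(step, cosine_lr(step, total_steps=1000))
    print(perplexity(EVAL_LOSS))
```

Under those assumptions, the evaluation loss of 1.4072 corresponds to a perplexity of roughly 4.08. The total step count is not stated in the source, so the 1000 used in the demo loop is only a placeholder.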

Intended Use

The provided information does not describe specific intended uses or limitations. Because the model was fine-tuned on the polyalign_train dataset, it is likely best suited to tasks that resemble that data. Developers should weigh both the capabilities of the base Qwen2.5-1.5B model and the effect of this fine-tuning when considering applications.
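One way to evaluate the model against a candidate task is to load it locally and inspect a few generations. The sketch below assumes the model identifier is also a public Hugging Face Hub repository and that the transformers and torch packages are installed; neither is confirmed by the source. The generate_text helper is a hypothetical name introduced for illustration.

```python
MODEL_ID = "sathiiiii/polyalign-qwen2.5-1.5b-en-sft"


def generate_text(prompt, max_new_tokens=64):
    """Generate a continuation for `prompt` (illustrative helper).

    Assumption: MODEL_ID resolves to a causal language model on the
    Hugging Face Hub. The import is done lazily so that this module can
    be imported without transformers installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the model weights on first run.
    print(generate_text("Explain what supervised fine-tuning is."))
```

Comparing outputs on the same prompts with the base Qwen/Qwen2.5-1.5B model gives a rough sense of what the fine-tuning changed.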
