Gueule-d-ange/qwen1.5b-sft-1k
Text Generation · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · License: apache-2.0 · Architecture: Transformer · Open Weights

Gueule-d-ange/qwen1.5b-sft-1k is a 1.5-billion-parameter language model fine-tuned from Qwen/Qwen2.5-1.5B. It was trained for one epoch with a learning rate of 2e-05 and a total batch size of 128. Details about its primary differentiators, intended uses, and training data have not been published.


Model Overview

Gueule-d-ange/qwen1.5b-sft-1k is a fine-tuned variant of the Qwen/Qwen2.5-1.5B base model. The 1.5-billion-parameter model underwent a single epoch of supervised fine-tuning (SFT).
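Since the card provides no usage instructions, the snippet below is only a minimal sketch of how such a fine-tune is typically loaded with the Hugging Face transformers library. The chat-template call assumes the repo inherits the Qwen2.5 tokenizer configuration, which is unverified.

```python
# Minimal loading sketch; usage is assumed to mirror the Qwen2.5 base model,
# since the card publishes no instructions of its own.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Gueule-d-ange/qwen1.5b-sft-1k"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

messages = [{"role": "user", "content": "Explain gradient accumulation in one sentence."}]
# Assumes a Qwen2.5-style chat template is present in the tokenizer config.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```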

Training Details

The training process used the AdamW optimizer with a learning rate of 2e-05. The total training batch size was 128, achieved with a per-device train_batch_size of 1 and gradient_accumulation_steps of 16 across 8 devices (1 × 16 × 8 = 128). Training used mixed precision and a cosine learning rate scheduler with a warmup ratio of 0.03.
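For reference, these hyperparameters map onto the following transformers `TrainingArguments` configuration. This is a reconstruction from the reported values, not the author's actual training script; the `optim` and `bf16` flags are inferred from the AdamW mention and the BF16 quantization listed in the card header.

```python
# Reconstructed from the reported hyperparameters; not the author's script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen1.5b-sft-1k",
    learning_rate=2e-05,
    num_train_epochs=1,
    per_device_train_batch_size=1,   # 1 sample per device
    gradient_accumulation_steps=16,  # 1 x 16 x 8 devices = 128 effective batch
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    optim="adamw_torch",             # AdamW, as reported
    bf16=True,                       # mixed precision (BF16 assumed, per card header)
)
```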

Current Status

As of now, specific information regarding the dataset used for fine-tuning, the model's intended uses, limitations, and evaluation data is not publicly available. Users should exercise caution and conduct their own evaluations to determine suitability for specific applications.