hZzy/qwen2-0.5b-sft

Text generation · Concurrency cost: 1 · Model size: 0.5B · Quantization: BF16 · Context length: 32k · Published: Sep 10, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

The hZzy/qwen2-0.5b-sft model is a 0.5 billion parameter language model, fine-tuned from Qwen/Qwen2-0.5B by hZzy. It was trained on the HuggingFaceH4/ultrachat_200k dataset, achieving a validation loss of 1.5327. This instruction-tuned model is designed for general conversational AI tasks, leveraging its compact size for efficient deployment.

Model Overview

hZzy/qwen2-0.5b-sft is a compact 0.5 billion parameter language model fine-tuned from the base Qwen/Qwen2-0.5B architecture. This instruction-tuned variant was developed by hZzy using the HuggingFaceH4/ultrachat_200k dataset for supervised fine-tuning (SFT).
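
A quick way to try the model locally is through the Hugging Face transformers library. The snippet below is a minimal sketch rather than an official example from the model card: it assumes the repository ships the standard Qwen2 tokenizer and chat template, and it loads the weights in bfloat16 to match the published precision.

```python
# Minimal sketch: load hZzy/qwen2-0.5b-sft and run a single chat turn.
# Assumes the checkpoint includes a chat template (standard for Qwen2 SFT models).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hZzy/qwen2-0.5b-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

messages = [{"role": "user", "content": "Explain supervised fine-tuning in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # open an assistant turn so the model starts answering
    return_tensors="pt",
)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```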

Training Details

The model was trained for 1 epoch with a learning rate of 2e-05 and a total batch size of 192 (3 devices with gradient accumulation of 8, i.e. 8 samples per device). The optimizer was Adam with default betas and epsilon, paired with a cosine learning rate scheduler and a warmup ratio of 0.1. Mixed-precision training (Native AMP) was employed. During training, the model achieved a validation loss of 1.5327.
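
For reference, these hyperparameters map directly onto transformers TrainingArguments. The configuration below is a reconstruction from the figures above, not the author's actual training script: the output directory name is hypothetical, the per-device batch size of 8 is derived arithmetically from the 192 total, and bf16 is an assumption (the card only states that Native AMP was used).

```python
# Sketch: TrainingArguments mirroring the reported SFT hyperparameters.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen2-0.5b-sft",     # hypothetical output path
    num_train_epochs=1,
    learning_rate=2e-5,
    per_device_train_batch_size=8,   # inferred: 192 total / (3 devices * 8 accumulation steps)
    gradient_accumulation_steps=8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",             # Adam-family optimizer with default betas and epsilon
    bf16=True,                       # assumption; the card reports mixed-precision (Native AMP)
)
```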

Potential Use Cases

Given its instruction-tuned nature and compact size, this model is suitable for:

  • Lightweight conversational agents: Deploying in environments with limited computational resources (a minimal chat loop is sketched after this list).
  • Quick prototyping: Rapidly testing and iterating on language-based applications.
  • Educational purposes: Understanding the principles of instruction tuning on smaller models.
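
To make the first point concrete, below is a minimal sketch of a multi-turn chat loop. It assumes `model` and `tokenizer` have already been loaded as in the earlier snippet and simply re-encodes the running history through the chat template on every turn.

```python
# Sketch: tiny interactive chat loop on top of the previously loaded model/tokenizer.
history = []
while True:
    user_input = input("You: ")
    if not user_input.strip():
        break  # empty line ends the session
    history.append({"role": "user", "content": user_input})

    input_ids = tokenizer.apply_chat_template(
        history, add_generation_prompt=True, return_tensors="pt"
    )
    output_ids = model.generate(
        input_ids, max_new_tokens=256, do_sample=True, temperature=0.7
    )
    reply = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)

    print("Assistant:", reply)
    history.append({"role": "assistant", "content": reply})
```

Because the full history is re-encoded on every turn, long conversations will eventually approach the 32k context limit; a production agent would truncate or summarize older turns.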