Name: AIR-hl/Qwen2.5-1.5B-ultrachat200k API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: AIR-hl

Overview

AIR-hl/Qwen2.5-1.5B-ultrachat200k is a 1.5 billion parameter instruction-tuned model, building upon the Qwen/Qwen2.5-1.5B base model. It is licensed under Apache 2.0 and was fine-tuned using the trl framework.

Key Capabilities

Instruction Following: Enhanced ability to follow instructions due to fine-tuning on the ultrachat_200k dataset.
Efficient Training: Utilizes flash_attention_2 for optimized attention mechanisms during training, contributing to faster processing.
Conversational AI: Specifically trained on a large-scale chat dataset, making it suitable for dialogue-oriented applications.
Quantization Support: Designed to work with quantization configurations, allowing for potential deployment on resource-constrained environments.

Training Details

The model underwent a single epoch of training with a learning rate of 5e-5 and a max_seq_length of 2048. Key training hyperparameters included bf16 precision and a warmup_ratio of 0.1. The training process resulted in a final training loss of 1.192 and an evaluation loss of 1.2003.

Good For

Developing chatbots and virtual assistants.
Applications requiring robust instruction-following in a conversational context.
Research and experimentation with smaller, efficient instruction-tuned models.

Overview

Overview

Key Capabilities

Training Details

Good For

Full Model Card (README)