deveg/day1-train-model
deveg/day1-train-model is a 0.5-billion-parameter, Qwen2.5-based, instruction-tuned causal language model from deveg, fine-tuned with Unsloth and Hugging Face's TRL library for roughly 2x faster training. It is intended for general instruction-following tasks.
Model Overview
The deveg/day1-train-model is a 0.5 billion parameter instruction-tuned language model developed by deveg. It is based on the Qwen2.5 architecture and was fine-tuned from the unsloth/Qwen2.5-0.5B-Instruct-unsloth-bnb-4bit model.
Key Characteristics
- Architecture: Qwen2.5-based, a causal language model.
- Parameter Count: 0.5 billion parameters, making it a compact and efficient model.
- Context Length: Supports a context window of 32,768 tokens.
- Training Efficiency: Fine-tuned using Unsloth and Hugging Face's TRL library, which Unsloth reports as roughly 2x faster than a standard fine-tuning setup.
- License: Released under the Apache-2.0 license.
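The parameter count and 4-bit base checkpoint above translate directly into a small memory footprint. The following back-of-envelope sketch (plain arithmetic, not measured numbers; it covers weights only and ignores the KV cache and activations) illustrates why a 0.5 B model fits comfortably on modest hardware:

```python
# Approximate weight memory for a 0.5 billion parameter model.
params = 0.5e9

fp16_gb = params * 2 / 1e9    # 2 bytes per parameter in fp16/bf16
int4_gb = params * 0.5 / 1e9  # ~0.5 bytes per parameter in 4-bit (bnb) quantization

print(f"fp16 weights: ~{fp16_gb} GB")   # ~1.0 GB
print(f"4-bit weights: ~{int4_gb} GB")  # ~0.25 GB
```

Real-world usage will be somewhat higher once the KV cache for a long context (up to 32,768 tokens) is accounted for.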
Intended Use Cases
This model is suited to general instruction-following tasks such as question answering, summarization, and short-form text generation. Its small parameter count and efficient fine-tuning make it a practical choice when compute or memory is constrained, while still providing solid language understanding and generation.
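As a Qwen2.5-based instruct model, it expects chat-formatted prompts. In practice you would let the tokenizer's `apply_chat_template` build the prompt, but the sketch below spells out the ChatML-style turn format explicitly (an assumption based on the Qwen2.5 family; verify against the model's tokenizer config), with the usual transformers loading path shown in comments:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    # ChatML-style turns as used by Qwen2.5 instruct models (assumed here;
    # tokenizer.apply_chat_template produces this automatically).
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Explain what instruction tuning is in one sentence.",
)
print(prompt)

# A typical loading path (requires `pip install transformers` and weights download):
#
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("deveg/day1-train-model")
#   model = AutoModelForCausalLM.from_pretrained("deveg/day1-train-model")
#   inputs = tok(prompt, return_tensors="pt")
#   out = model.generate(**inputs, max_new_tokens=128)
#   print(tok.decode(out[0], skip_special_tokens=True))
```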