The mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz1024_lr2e5_epochs5 model is a 1.5-billion-parameter language model fine-tuned from Qwen/Qwen2.5-1.5B-Instruct on the mlfoundations-dev/openthoughts3_100k dataset. It is intended for tasks that benefit from this fine-tuning, performing best on inputs that resemble the dataset's domain.
Model Overview
This model, mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz1024_lr2e5_epochs5, is a fine-tuned variant of the Qwen/Qwen2.5-1.5B-Instruct base model. It has approximately 1.5 billion parameters and was adapted through further training on the mlfoundations-dev/openthoughts3_100k dataset.
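As a minimal sketch, the checkpoint can be loaded with the Hugging Face transformers library in the usual way (device_map="auto" additionally requires the accelerate package):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz1024_lr2e5_epochs5"

# Load the tokenizer and weights from the Hub; dtype and device placement
# are chosen automatically (device_map="auto" requires `accelerate`).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
```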
Training Details
Fine-tuning used a learning rate of 2e-05 over 5 epochs with a cosine learning-rate scheduler and a warmup ratio of 0.1. Training was distributed across 32 devices with a total batch size of 1024 (a per-device batch size of 4 with 8 gradient accumulation steps: 32 × 4 × 8 = 1024), using the ADAMW_TORCH optimizer. This targeted training suggests the model is optimized for the tasks and data distribution present in mlfoundations-dev/openthoughts3_100k.
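For reference, here is a hypothetical mapping of these reported hyperparameters onto transformers.TrainingArguments. The actual training script is not specified in this card, and the output_dir and bf16 values are placeholders/assumptions:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the reported configuration; the real
# training stack used by the authors is not stated in this card.
args = TrainingArguments(
    output_dir="openthoughts3_100k_qwen25_1b",  # placeholder path (assumption)
    learning_rate=2e-5,
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    per_device_train_batch_size=4,  # 32 devices * 4 * 8 accumulation = 1024
    gradient_accumulation_steps=8,
    optim="adamw_torch",
    bf16=True,  # precision is not reported; bf16 is an assumption
)
```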
Key Characteristics
- Base Model: Qwen/Qwen2.5-1.5B-Instruct
- Parameter Count: 1.5 billion
- Fine-tuning Dataset: mlfoundations-dev/openthoughts3_100k
- Training Hyperparameters: learning rate 2e-05, total batch size 1024, 5 epochs, cosine scheduler with 0.1 warmup ratio, ADAMW_TORCH optimizer
Potential Use Cases
This model is likely best suited to applications where its fine-tuning on mlfoundations-dev/openthoughts3_100k provides a distinct advantage, that is, tasks similar in content and structure to that dataset.
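For illustration, a minimal chat-style inference sketch, continuing from the loading snippet above (the prompt and the max_new_tokens value are arbitrary examples):

```python
# Continues from the loading snippet above (model, tokenizer already defined).
messages = [
    {"role": "user", "content": "Walk through solving 17 * 24 step by step."}
]

# Format the conversation with the model's chat template and tokenize it.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a reply and decode only the newly produced tokens.
output_ids = model.generate(input_ids, max_new_tokens=512)
reply = tokenizer.decode(
    output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
)
print(reply)
```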