mlfoundations-dev/test_tacc_stratos_verified_mix

Text Generation · Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32k · Published: Apr 1, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

mlfoundations-dev/test_tacc_stratos_verified_mix is a 7.6 billion parameter language model fine-tuned from Qwen/Qwen2.5-7B-Instruct by mlfoundations-dev. It was trained on the mlfoundations-dev/stratos_verified_mix dataset with a context length of 32768 tokens, and is intended for general language understanding and generation tasks, combining the capabilities of its base model with its specialized fine-tuning data.


Overview

This model, test_tacc_stratos_verified_mix, is a 7.6 billion parameter language model derived from Qwen/Qwen2.5-7B-Instruct. It was fine-tuned by mlfoundations-dev on the mlfoundations-dev/stratos_verified_mix dataset, indicating specialization toward the characteristics of that data mix. The model supports a substantial context length of 32768 tokens.

Key Capabilities

  • General Language Understanding: Inherits robust language comprehension from its Qwen2.5-7B-Instruct base.
  • Contextual Processing: Capable of handling long inputs and maintaining coherence over 32768 tokens.
  • Specialized Adaptation: Fine-tuned on a unique dataset, suggesting potential strengths in areas covered by stratos_verified_mix.
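As a Qwen2.5-7B-Instruct derivative, the model should load with the standard Hugging Face Transformers workflow. The sketch below is illustrative rather than taken from the model card: the `generate` helper and prompt handling are assumptions, and it presumes the model ships the base model's chat template.

```python
MODEL_ID = "mlfoundations-dev/test_tacc_stratos_verified_mix"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate a completion for a single user prompt.

    A minimal sketch; imports are kept inside the function so the file can be
    inspected without transformers installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )

    # Qwen2.5-Instruct derivatives use a chat template; apply it to the prompt.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the echoed prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Calling `generate("...")` downloads roughly 15 GB of FP8/BF16 weights on first use, so a GPU with sufficient memory (or `device_map="auto"` offloading) is assumed.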

Training Details

The model was trained with a learning rate of 8e-05 for 3 epochs, with a total batch size of 512 (a per-device train_batch_size of 1 with gradient_accumulation_steps of 16 across 32 GPUs). It used the AdamW_Torch optimizer with a cosine learning rate scheduler and a warmup ratio of 0.1. Training used Transformers 4.46.1 and PyTorch 2.5.1.
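The effective batch size follows from per-device batch size × gradient accumulation steps × GPU count (1 × 16 × 32 = 512). The warmup-plus-cosine schedule can be sketched as below; this is a minimal approximation, and the exact Transformers scheduler may differ slightly in step accounting.

```python
import math

def cosine_lr(step: int, total_steps: int, peak_lr: float = 8e-5,
              warmup_ratio: float = 0.1, min_lr: float = 0.0) -> float:
    """Linear warmup to peak_lr over the first warmup_ratio of training,
    then cosine decay to min_lr over the remaining steps."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from peak_lr down to min_lr.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

For example, the learning rate is 0 at step 0, reaches the 8e-05 peak at the end of warmup, and decays to 0 by the final step.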