mlfoundations-dev/seed_math_multiple_samples_scale_up_scaredy_cat_all

Model Overview

This model, mlfoundations-dev/seed_math_multiple_samples_scale_up_scaredy_cat_all, is a 7.6 billion parameter language model fine-tuned from the Qwen/Qwen2.5-7B-Instruct base model. It supports a context length of up to 131072 tokens, enabling it to process and generate long text sequences. A minimal loading sketch follows the characteristics list below.

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen2.5-7B-Instruct.
  • Parameter Count: 7.6 billion parameters.
  • Context Length: Supports a context window of up to 131072 tokens.
  • Tensor Type: FP8, per the repository metadata.
  • License: Apache 2.0.
  • Training Data: Fine-tuned on the mlfoundations-dev/seed_math_multiple_samples_scale_up_scaredy_cat_all dataset, suggesting a focus on mathematical or complex reasoning tasks.
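
The snippet below is a minimal loading and inference sketch, assuming the standard Hugging Face transformers API (with accelerate installed for device_map="auto"). The repo id comes from this card; the dtype choice, the sample prompt, and the generation settings are illustrative assumptions rather than documented usage.

    # Minimal loading sketch; assumes the standard transformers API and an
    # installed accelerate package for device_map="auto". Untested against this repo.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = "mlfoundations-dev/seed_math_multiple_samples_scale_up_scaredy_cat_all"

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,  # assumption: pick a dtype your hardware supports
        device_map="auto",
    )

    # Qwen2.5-Instruct derivatives use a chat template; apply it rather than raw strings.
    messages = [{"role": "user", "content": "Solve for x: 3x + 5 = 20."}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output_ids = model.generate(input_ids, max_new_tokens=256)
    print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))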

Training Details

The model was trained using the following key hyperparameters (a hypothetical configuration sketch follows the list):

  • Learning Rate: 1e-05
  • Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08.
  • Batch Size: A total training batch size of 96 (1 sample per device × 12 gradient accumulation steps × 8 GPUs).
  • Epochs: 3.0.
  • Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
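
For readers who want to mirror this setup, the following is a hypothetical reconstruction using the transformers TrainingArguments API. The hyperparameter values are copied from the list above; the output_dir, the mixed-precision setting, and the choice of the Trainer stack itself are assumptions, since the card does not state which training framework was used.

    # Hypothetical reconstruction of the listed hyperparameters using
    # transformers.TrainingArguments; the actual training stack is not stated on the card.
    from transformers import TrainingArguments

    args = TrainingArguments(
        output_dir="seed_math_finetune",  # assumed name, not from the card
        learning_rate=1e-5,
        adam_beta1=0.9,
        adam_beta2=0.999,
        adam_epsilon=1e-8,
        per_device_train_batch_size=1,    # x12 accumulation x8 GPUs = 96 effective
        gradient_accumulation_steps=12,
        num_train_epochs=3.0,
        lr_scheduler_type="cosine",
        warmup_ratio=0.1,
        bf16=True,                        # assumption: precision is not stated on the card
    )

    # The x8 multiplier comes from launching 8 data-parallel processes
    # (e.g. via torchrun), not from TrainingArguments itself.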

Potential Use Cases

Given its fine-tuning dataset and large context window, this model is likely suitable for applications requiring:

  • Processing and understanding long documents or complex problem descriptions (see the context-window check after this list).
  • Tasks involving mathematical reasoning or data analysis, depending on the specifics of the seed_math_multiple_samples_scale_up_scaredy_cat_all dataset.
  • Applications where the ability to maintain coherence over extended text is crucial.
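
One practical caveat: Qwen2.5 derivatives vary in how much of the 131072-token window is enabled out of the box (the base model ships with a 32768-token default and reaches 131072 via RoPE scaling), so it is worth confirming the configured limit before relying on very long inputs. A small check, assuming the standard AutoConfig API:

    # Check the configured maximum context before sending very long inputs.
    from transformers import AutoConfig

    config = AutoConfig.from_pretrained(
        "mlfoundations-dev/seed_math_multiple_samples_scale_up_scaredy_cat_all"
    )
    print(config.max_position_embeddings)  # this card advertises 131072; verify locally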