mlfoundations-dev/s1K_32b

Text generation | Concurrency cost: 2 | Model size: 32.8B | Quantization: FP8 | Context length: 32K | License: apache-2.0 | Architecture: Transformer | Open weights | Cold

mlfoundations-dev/s1K_32b is a 32.8-billion-parameter instruction-tuned causal language model, fine-tuned from Qwen/Qwen2.5-32B-Instruct on the mlfoundations-dev/s1K_reformat dataset. It supports a 131,072-token context length for processing extensive inputs while maintaining long-range coherence.


Overview

mlfoundations-dev/s1K_32b is a 32.8-billion-parameter language model fine-tuned from the Qwen/Qwen2.5-32B-Instruct base model. It was adapted on the mlfoundations-dev/s1K_reformat dataset, which suggests an emphasis on data reformatting and structured-data processing. The model supports a context window of 131,072 tokens, enabling it to handle long input sequences and maintain context over extended interactions.
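
A minimal loading-and-generation sketch with Hugging Face transformers follows. It assumes the checkpoint is hosted on the Hub under this repo id, inherits Qwen2.5's chat template, and fits on your hardware in bfloat16; none of these details are stated on the card itself.

```python
# Sketch: loading the model with transformers (assumes Hub-hosted weights
# and enough GPU memory for a 32.8B-parameter model in bfloat16).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlfoundations-dev/s1K_32b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard across available GPUs
)

messages = [{"role": "user", "content": "Summarize the idea of context length in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```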

Training Details

The model was trained with the following key hyperparameters:

  • Base Model: Qwen/Qwen2.5-32B-Instruct
  • Dataset: mlfoundations-dev/s1K_reformat
  • Learning Rate: 1e-05
  • Optimizer: ADAMW_TORCH with betas=(0.9, 0.95)
  • Epochs: 5.0
  • Batch Size: 1 per device (train) and 8 per device (eval) across 16 devices, for effective batch sizes of 16 (train) and 128 (eval).
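
For reference, here is a sketch of how these settings map onto Hugging Face TrainingArguments. Only the values listed above are grounded in the card; the output path is hypothetical, and every other field (scheduler, warmup, precision) is an undocumented library default.

```python
# Sketch: the reported hyperparameters expressed as transformers TrainingArguments.
# Only the listed values come from the card; everything else is a default.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="s1K_32b-finetune",   # hypothetical output path
    learning_rate=1e-5,
    num_train_epochs=5.0,
    per_device_train_batch_size=1,   # x 16 devices = effective train batch of 16
    per_device_eval_batch_size=8,    # x 16 devices = effective eval batch of 128
    optim="adamw_torch",             # the ADAMW_TORCH optimizer named above
    adam_beta1=0.9,
    adam_beta2=0.95,
)
```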

Intended Use Cases

Given its fine-tuning on the s1K_reformat dataset, this model is likely best suited for applications involving:

  • Data Transformation: Tasks requiring reformatting or restructuring of data according to specific patterns (see the sketch after this list).
  • Structured Data Processing: Handling and generating content that adheres to particular formats.
  • Long Context Understanding: Leveraging the 131,072-token context length for tasks that require processing and generating very long documents or conversations.
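
As an illustration of the data-transformation use case, the sketch below wraps a generic reformatting prompt around the model and tokenizer loaded in the earlier example. The helper name, prompt wording, and target format are hypothetical and are not taken from the s1K_reformat dataset.

```python
# Sketch: a hypothetical reformatting helper; reuses `model` and `tokenizer`
# from the loading example above. Prompt and schema are illustrative only.
def reformat(model, tokenizer, raw_text: str, target_format: str) -> str:
    messages = [
        {"role": "system", "content": "You convert records into the requested format exactly."},
        {"role": "user", "content": f"Convert to {target_format}:\n{raw_text}"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=512)
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

print(reformat(model, tokenizer, "name: Ada; born: 1815", "a JSON object"))
```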