mlfoundations-dev/openthoughts3_10k

Text generation · Concurrency cost: 1 · Model size: 7.6B · Quantization: FP8 · Context length: 32k · License: apache-2.0 · Architecture: Transformer · Open weights

The mlfoundations-dev/openthoughts3_10k model is a 7.6-billion-parameter language model fine-tuned from Qwen/Qwen2.5-7B-Instruct. It was trained by mlfoundations-dev on the openthoughts3_10k dataset and supports a context length of 131,072 tokens. The model is optimized for tasks aligned with the data distribution of the openthoughts3_10k dataset.


Model Overview

mlfoundations-dev/openthoughts3_10k is a 7.6-billion-parameter language model fine-tuned from the Qwen/Qwen2.5-7B-Instruct base model. It was developed by mlfoundations-dev and trained on the mlfoundations-dev/openthoughts3_10k dataset. A key specification is its context length of 131,072 tokens, which allows the model to process very long inputs.
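
For orientation, the sketch below shows one way to load the checkpoint and run a chat-style generation with the Transformers library. The Hub repo id, dtype, and generation settings are assumptions for illustration and are not taken from the card; FP8-quantized serving would require additional tooling.

```python
# Minimal inference sketch (assumed usage, not the official example):
# load the checkpoint and generate a reply using the Qwen2.5 chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlfoundations-dev/openthoughts3_10k"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype for local inference
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain gradient accumulation in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```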

Training Details

The model was fine-tuned with the following key hyperparameters; a configuration sketch mirroring these values follows the list:

  • Learning Rate: 4e-05
  • Batch Size: 1 (train), 8 (eval)
  • Gradient Accumulation: 32 steps, giving a total effective train batch size of 128 (per-device batch of 1 × 32 accumulation steps, accumulated across multiple devices)
  • Optimizer: AdamW with default betas and epsilon
  • Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio
  • Epochs: 5.0

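As a rough illustration of how these values map onto a Hugging Face Trainer run, the sketch below restates the reported hyperparameters as a TrainingArguments object. Only the numeric values come from the card; the output path, precision flag, and optimizer string are assumptions.

```python
# Configuration sketch mirroring the reported hyperparameters.
# Only the numeric values are taken from the card; everything else is assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="openthoughts3_10k-sft",   # hypothetical output path
    learning_rate=4e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=32,       # yields an effective batch of 128 across devices
    num_train_epochs=5.0,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",                  # AdamW with default betas and epsilon
    bf16=True,                            # assumed mixed-precision setting
)
```
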
This fine-tuning run used Transformers 4.46.1, PyTorch 2.6.0+cu124, Datasets 3.1.0, and Tokenizers 0.20.3. Specific intended uses and limitations are not detailed in the provided information, but training on a specialized dataset suggests the model's strengths lie in areas aligned with that data's characteristics.
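
For reproducibility, one way to confirm that a local environment matches the reported library versions is a quick check with importlib.metadata; the pinned strings below simply restate the versions from the card, and the check itself is an illustrative convenience rather than part of the original workflow.

```python
# Environment check against the library versions reported on the card.
from importlib.metadata import version

expected = {
    "transformers": "4.46.1",
    "torch": "2.6.0+cu124",
    "datasets": "3.1.0",
    "tokenizers": "0.20.3",
}

for package, wanted in expected.items():
    installed = version(package)
    status = "OK" if installed == wanted else f"differs (installed {installed})"
    print(f"{package}: expected {wanted} -> {status}")
```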