Model Overview
mlfoundations-dev/openthoughts is a 7.6-billion-parameter language model fine-tuned from the Qwen/Qwen2.5-7B-Instruct base model. It was trained on the mlfoundations-dev/fig1_all_openthoughts dataset, so its behavior reflects that data distribution. The model supports a context length of 131,072 tokens, allowing it to process and generate long sequences of text.
Training Details
Fine-tuning used a learning rate of 8e-05 with an effective batch size of 512, obtained from a per-device train_batch_size of 1 and gradient_accumulation_steps of 16 across 32 GPUs. The optimizer was ADAMW_TORCH with standard betas and epsilon, and a cosine learning rate scheduler with a 0.1 warmup ratio was applied over 5 epochs.
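To make the schedule concrete, here is a minimal sketch of how these hyperparameters combine, assuming the standard linear-warmup-then-cosine-decay convention (the exact trainer implementation may differ in minor details such as a nonzero floor LR):

```python
import math

# Hyperparameters stated in this card.
LEARNING_RATE = 8e-05
WARMUP_RATIO = 0.1
PER_DEVICE_BATCH = 1
GRAD_ACCUM_STEPS = 16
NUM_GPUS = 32

# Effective batch size: 1 * 16 * 32 = 512, matching the card.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_GPUS

def lr_at_step(step: int, total_steps: int) -> float:
    """Cosine schedule with linear warmup over the first WARMUP_RATIO of steps."""
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return LEARNING_RATE * step / max(1, warmup_steps)
    # Cosine decay from the peak learning rate down to 0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The learning rate peaks at 8e-05 when warmup ends (10% of training) and decays smoothly to zero by the final step.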
Key Characteristics
- Base Model: Qwen/Qwen2.5-7B-Instruct
- Parameter Count: 7.6 billion
- Context Length: 131072 tokens
- Training Dataset: mlfoundations-dev/fig1_all_openthoughts
Intended Use Cases
While specific intended uses are not documented, its instruction-tuned base model and large context window suggest suitability for a wide range of natural language processing tasks, including text generation, summarization, question answering, and conversational AI, particularly where long-range dependencies in text matter.
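For such tasks, the model can presumably be loaded with the Hugging Face transformers library like any other causal LM. A hedged sketch (the model id is from this card; the prompt and generation settings are illustrative, and chat-template support is assumed from the instruction-tuned base):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlfoundations-dev/openthoughts"  # model id from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Illustrative chat-style prompt; the chat template is inherited
# from the Qwen2.5-Instruct base (an assumption, not stated in the card).
messages = [{"role": "user", "content": "Summarize the causes of WWI in three sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Running this requires downloading the 7.6B-parameter weights, so a GPU with sufficient memory (or quantized loading) is advisable.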