mlfoundations-dev/openthoughts3_1k_llama3

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · License: llama3.1 · Architecture: Transformer

mlfoundations-dev/openthoughts3_1k_llama3 is an 8 billion parameter language model fine-tuned from Meta's Llama-3.1-8B-Instruct. It was trained on the mlfoundations-dev/openthoughts3_1k dataset, so its behavior reflects the characteristics of that fine-tuning data. With a 32768-token context length, it suits tasks that benefit from long-range context. Its primary utility is as a Llama-3.1-8B-Instruct base adapted to the openthoughts3_1k dataset.


Overview

openthoughts3_1k_llama3 is an 8 billion parameter language model built on Meta's Llama-3.1-8B-Instruct. It was fine-tuned on the mlfoundations-dev/openthoughts3_1k dataset, adapting the base model to the patterns present in that data. The model supports a context length of 32768 tokens, allowing it to process and generate long sequences of text.

Training Details

The fine-tuning run used a learning rate of 2e-05 with a cosine learning-rate schedule, a warmup ratio of 0.1, and 7.0 epochs. Training ran on 16 GPUs with a total batch size of 96, i.e. a per-device batch size of 1 with 6 gradient accumulation steps (16 × 1 × 6 = 96). The optimizer was AdamW (the adamw_torch implementation).
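
For reference, here is a minimal sketch of how these hyperparameters map onto Hugging Face `TrainingArguments`. The actual training stack is not specified on this card, and the output directory and precision setting are illustrative assumptions, not values from the original run:

```python
from transformers import TrainingArguments

# Hyperparameters reported on this card, expressed as TrainingArguments.
# output_dir and bf16 are assumptions for illustration only.
training_args = TrainingArguments(
    output_dir="openthoughts3_1k_llama3",  # hypothetical path
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=7.0,
    per_device_train_batch_size=1,   # 16 GPUs x 1 x 6 accumulation = 96 total
    gradient_accumulation_steps=6,
    optim="adamw_torch",
    bf16=True,  # assumption: common default for Llama-3.1 fine-tuning
)
```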

Key Characteristics

  • Base Model: Meta Llama-3.1-8B-Instruct
  • Parameter Count: 8 Billion
  • Context Length: 32768 tokens
  • Fine-tuning Dataset: mlfoundations-dev/openthoughts3_1k

Potential Use Cases

This model suits applications that build on the capabilities of the Llama-3.1-8B-Instruct base, shaped by fine-tuning on the openthoughts3_1k dataset. Its 32k context window makes it a candidate for tasks requiring contextual understanding over long inputs.
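
As a sketch, the model can be loaded through the standard `transformers` API like any Llama-3.1 checkpoint. The prompt content and generation settings below are illustrative, not values recommended by this card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mlfoundations-dev/openthoughts3_1k_llama3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Llama-3.1-Instruct derivatives use a chat template, so format the
# prompt through the tokenizer rather than passing raw text.
messages = [{"role": "user", "content": "Explain gradient accumulation briefly."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```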