Model Overview
mlfoundations-dev/open-o1-sft-original-plus-oh-v3.1 is an 8-billion-parameter language model fine-tuned from the mlfoundations-dev/oh-dcft-v3.1-gpt-4o-mini base model. It was trained on the mlfoundations-dev/openo1_sft_original dataset, indicating a focus on general-purpose supervised fine-tuning (SFT). The model supports a context length of 32768 tokens, allowing it to process and generate long sequences of text.
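As a rough illustration of what the 32768-token window means in practice, the sketch below estimates whether a document fits using a characters-per-token heuristic. The 4-characters-per-token ratio is an assumption for illustration only, not a property of this model's tokenizer; use the actual tokenizer for an exact count.

```python
# Rough check that a document fits in a 32768-token context window.
# chars_per_token=4 is a heuristic assumption, not this model's
# actual tokenizer ratio.
def fits_in_context(text: str, max_tokens: int = 32768, chars_per_token: int = 4) -> bool:
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens <= max_tokens

print(fits_in_context("hello " * 100))  # a short text easily fits
```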
Training Details
This model was trained for 3 epochs with a learning rate of 5e-06 and a total batch size of 512 (a per-device batch size of 8 with 8 gradient accumulation steps, which implies data parallelism across multiple devices). Training used the AdamW optimizer with a constant learning rate schedule. The model reached a final validation loss of 0.5022 on the fine-tuning dataset's evaluation split.
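The per-device batch size (8) times the gradient accumulation steps (8) gives only 64, so a global batch size of 512 implies training across 8 devices. That device count is an assumption inferred from the arithmetic, not something the README states:

```python
# Effective global batch size = per-device batch × grad-accum steps × devices.
per_device_batch_size = 8
gradient_accumulation_steps = 8
num_devices = 8  # assumption: inferred from 512 / (8 * 8); not stated in the README

effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices
print(effective_batch_size)  # 512
```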
Key Characteristics
- Base Model: Fine-tuned from mlfoundations-dev/oh-dcft-v3.1-gpt-4o-mini.
- Parameter Count: 8 billion parameters.
- Context Length: 32768 tokens.
- Fine-tuning Dataset: mlfoundations-dev/openo1_sft_original.
- Validation Loss: 0.5022 on the evaluation set.
Intended Use Cases
While the provided README does not detail specific intended uses, fine-tuning on a general SFT dataset suggests applicability to a broad range of natural language processing tasks, including text generation, summarization, question answering, and conversational AI.