Model Overview
mlfoundations-dev/open-o1-sft-original-plus-oh-v3.1 is an 8-billion-parameter language model fine-tuned from the mlfoundations-dev/oh-dcft-v3.1-gpt-4o-mini base model. It was trained on the mlfoundations-dev/openo1_sft_original dataset, indicating a focus on general-purpose supervised fine-tuning (SFT). The model supports a context length of 32768 tokens, allowing it to process and generate long sequences of text.
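As a rough illustration of what the 32768-token window means in practice, the sketch below estimates whether a document fits using a characters-per-token heuristic. The 4-characters-per-token ratio is an assumption for illustration only, not a property of this model's tokenizer; use the actual tokenizer for an exact count.

```python
# Rough check that a document fits in a 32768-token context window.
# chars_per_token=4 is a heuristic assumption, not this model's
# actual tokenizer ratio.
def fits_in_context(text: str, max_tokens: int = 32768, chars_per_token: int = 4) -> bool:
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens <= max_tokens

print(fits_in_context("hello " * 100))  # a short text easily fits
```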
Training Details
This model was trained for 3 epochs with a learning rate of 5e-06 and a total batch size of 512 (a per-device batch size of 8 with 8 gradient accumulation steps, which implies data parallelism across multiple devices). Training used the AdamW optimizer with a constant learning rate schedule. The model reached a final validation loss of 0.5022 on the fine-tuning dataset's evaluation split.
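The per-device batch size (8) times the gradient accumulation steps (8) gives only 64, so a global batch size of 512 implies training across 8 devices. That device count is an assumption inferred from the arithmetic, not something the README states:

```python
# Effective global batch size = per-device batch × grad-accum steps × devices.
per_device_batch_size = 8
gradient_accumulation_steps = 8
num_devices = 8  # assumption: inferred from 512 / (8 * 8); not stated in the README

effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices
print(effective_batch_size)  # 512
```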
Key Characteristics
- Base Model: Fine-tuned from mlfoundations-dev/oh-dcft-v3.1-gpt-4o-mini.
- Parameter Count: 8 billion parameters.
- Context Length: 32768 tokens.
- Fine-tuning Dataset: mlfoundations-dev/openo1_sft_original.
- Validation Loss: 0.5022 on the evaluation set.
Intended Use Cases
While the provided README does not detail specific intended uses, fine-tuning on a general SFT dataset suggests applicability to a broad range of natural language processing tasks, including text generation, summarization, question answering, and conversational AI.