Name: mlfoundations-dev/oh-dcft-v3.1-gemini-1.5-flash API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: mlfoundations-dev

Model Overview

mlfoundations-dev/oh-dcft-v3.1-gemini-1.5-flash is an 8-billion-parameter language model derived from Meta-Llama-3.1-8B. It was fine-tuned on the dataset of the same name, mlfoundations-dev/oh-dcft-v3.1-gemini-1.5-flash, and reached a final validation loss of 0.5841.

Training Details

Fine-tuning used the following hyperparameters:

Learning Rate: 5e-06
Batch Size: 8 per device (train and eval)
Gradient Accumulation Steps: 8, giving a total effective training batch size of 512 (8 per device × 8 GPUs × 8 accumulation steps)
Optimizer: ADAMW_TORCH
Epochs: 3.0

Training ran on 8 GPUs with Transformers 4.46.1, PyTorch 2.3.0, Datasets 3.1.0, and Tokenizers 0.20.3. Validation loss decreased consistently over the three epochs.
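
The total effective batch size of 512 follows from the figures listed above. The sketch below reproduces that arithmetic. It assumes the batch size of 8 is per device, which is the reading consistent with 8 GPUs and 8 accumulation steps; the class and field names are hypothetical and are not taken from the model's training code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hypothetical container for the hyperparameters listed in this section."""

    learning_rate: float = 5e-6
    per_device_batch_size: int = 8       # assumed to be per GPU
    gradient_accumulation_steps: int = 8
    num_gpus: int = 8
    num_epochs: float = 3.0

    def effective_batch_size(self) -> int:
        # Samples contributing to one optimizer step:
        # per-device batch x number of devices x accumulation steps.
        return (
            self.per_device_batch_size
            * self.num_gpus
            * self.gradient_accumulation_steps
        )


config = FineTuneConfig()
print(config.effective_batch_size())  # 512
```

With a single GPU and otherwise identical settings, the same formula gives 64, so matching the reported 512 on different hardware would mean adjusting the accumulation steps accordingly.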
