jackf857/llama-3-8b-base-ipo-ultrafeedback-4xh200-batch-128-20260428-004616
The jackf857/llama-3-8b-base-ipo-ultrafeedback-4xh200-batch-128-20260428-004616 model is an 8-billion-parameter Llama 3 base model fine-tuned by jackf857. As the model name indicates, it was trained with IPO (identity preference optimization) on the HuggingFaceH4/ultrafeedback_binarized dataset, with the goal of improving response quality through preference learning. The model targets general-purpose conversational AI and instruction following, and is intended to show improved alignment over its base model.
Model Overview
This model, jackf857/llama-3-8b-base-ipo-ultrafeedback-4xh200-batch-128-20260428-004616, is an 8-billion-parameter language model based on the Llama 3 architecture. It is a fine-tuned iteration of W-61/llama-3-8b-base-sft-ultrachat-8xh200, an SFT checkpoint, further optimized with IPO on the HuggingFaceH4/ultrafeedback_binarized preference dataset. The goal of this fine-tuning stage is to make the model more likely to generate the responses humans prefer, as captured by the chosen/rejected pairs in that dataset.
Key Capabilities
- Preference Alignment: Fine-tuned with the Ultrafeedback dataset, suggesting improved alignment with human preferences for response quality.
- Instruction Following: As a fine-tuned model, it is intended for general instruction-following tasks.
- Base Model Performance: Builds upon the Llama 3 8B base model, inheriting its foundational language understanding and generation capabilities.
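The model should load with the standard transformers API. The sketch below is a hedged usage example: the model id comes from this card, but the dtype, device placement, prompt, and generation settings are illustrative defaults, not settings documented by the author.

```python
# Hedged usage sketch: plain transformers text generation with this checkpoint.
# Generation settings below are illustrative, not from the training run.

MODEL_ID = "jackf857/llama-3-8b-base-ipo-ultrafeedback-4xh200-batch-128-20260428-004616"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    # Imports are deferred so the sketch can be read without the heavy
    # dependencies installed; loading the 8B weights happens on first call.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption: bf16 is typical for Llama 3 8B
        device_map="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Return only the newly generated continuation, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Greedy decoding (`do_sample=False`) is used here for reproducibility; sampling parameters can be passed to `generate` as usual.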
Training Details
The model was trained for 1 epoch with a learning rate of 5e-7 and a total batch size of 128 across 4 GPUs. During training, the rewards accuracy reached 0.6800, i.e. for 68% of preference pairs the model's implicit reward ranked the chosen response above the rejected one. The training used Transformers 4.51.0 and PyTorch 2.3.1+cu121.
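The "ipo" tag in the model name points to the identity preference optimization objective, which regresses the policy/reference log-ratio margin toward a fixed target instead of applying DPO's sigmoid loss. The following is a minimal, self-contained sketch of that loss and of the rewards-accuracy metric reported above; the beta value and log-probabilities are toy numbers for illustration, not the run's actual hyperparameters.

```python
# Hedged sketch of the IPO objective and the rewards-accuracy metric.
# All names and numbers are illustrative; beta is a toy value.

def ipo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = chosen_logratio - rejected_logratio
    # IPO regresses the log-ratio margin toward 1 / (2 * beta).
    return (margin - 1.0 / (2.0 * beta)) ** 2

def rewards_accuracy(pairs, beta: float = 0.1) -> float:
    # "Rewards accuracy": fraction of preference pairs where the implicit
    # reward of the chosen response exceeds that of the rejected one.
    correct = 0
    for pc, pr, rc, rr in pairs:
        chosen_reward = beta * (pc - rc)
        rejected_reward = beta * (pr - rr)
        correct += chosen_reward > rejected_reward
    return correct / len(pairs)
```

In this formulation a rewards accuracy of 0.68 simply means 68% of evaluation pairs had a positive margin.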