jackf857/llama-3-8b-base-ipo-ultrafeedback-4xh200-batch-128-rerun
jackf857/llama-3-8b-base-ipo-ultrafeedback-4xh200-batch-128-rerun is an 8-billion-parameter language model fine-tuned from W-61/llama-3-8b-base-sft-ultrachat-8xh200. It has been further refined on the HuggingFaceH4/ultrafeedback_binarized dataset to improve alignment and response quality, and is designed for general language generation tasks that benefit from its instruction-following capabilities.
Model Overview
This model, jackf857/llama-3-8b-base-ipo-ultrafeedback-4xh200-batch-128-rerun, is an 8 billion parameter language model. It is a fine-tuned iteration of the W-61/llama-3-8b-base-sft-ultrachat-8xh200 base model, specifically optimized through training on the HuggingFaceH4/ultrafeedback_binarized dataset.
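If the checkpoint is published on the Hugging Face Hub under the repository name above, it should load with the standard transformers API. A minimal sketch (the dtype and device settings are assumptions, not part of the released configuration):

```python
# Minimal loading sketch; adjust dtype/device to your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jackf857/llama-3-8b-base-ipo-ultrafeedback-4xh200-batch-128-rerun"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 for modern GPUs; use float16/float32 otherwise
    device_map="auto",
)
```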
Key Characteristics
- Base Model: Derived from W-61/llama-3-8b-base-sft-ultrachat-8xh200.
- Fine-tuning Dataset: Utilizes the HuggingFaceH4/ultrafeedback_binarized dataset, suggesting an emphasis on instruction following and preference alignment (an example record is sketched after this list).
- Training Objective: The fine-tuning process aimed to improve response quality, as indicated by the use of a feedback-based dataset.
- Performance Metrics: During evaluation, the model achieved a rewards accuracy of 0.6880 and a rewards margin of 0.0202, with a final validation loss of 2344.3516.
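For context, a brief look at what a preference record in this dataset contains. This assumes the public layout of HuggingFaceH4/ultrafeedback_binarized (split and column names may differ if the training run used a filtered or re-split copy):

```python
# Inspect one preference pair from the public UltraFeedback binarized dataset.
from datasets import load_dataset

ds = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")
example = ds[0]

print(example["prompt"])        # the user instruction
print(example["chosen"][-1])    # preferred assistant reply (chat-message dict)
print(example["rejected"][-1])  # dispreferred assistant reply
```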
Training Details
The model was trained with a learning rate of 5e-07, a total batch size of 128 (across 4 GPUs with 8 gradient accumulation steps), and a cosine learning rate scheduler with a 0.1 warmup ratio over 1 epoch. The training utilized the AdamW optimizer.
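As the repository name indicates an IPO-style preference objective, the reported hyperparameters could map onto TRL's DPO/IPO trainer roughly as sketched below. Only the learning rate, effective batch size, scheduler, warmup ratio, epoch count, and optimizer come from this card; the beta value, per-device batch size, and loss configuration are assumptions.

```python
# Hypothetical reconstruction of the training configuration with TRL.
from trl import DPOConfig

training_args = DPOConfig(
    output_dir="llama-3-8b-base-ipo-ultrafeedback",
    loss_type="ipo",                # IPO objective instead of the default DPO sigmoid loss (assumed from the repo name)
    beta=0.01,                      # assumption: a typical IPO regularization strength
    learning_rate=5e-7,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
    per_device_train_batch_size=4,  # 4 per device x 4 GPUs x 8 accumulation steps = 128 total
    gradient_accumulation_steps=8,
    optim="adamw_torch",
    bf16=True,                      # assumption: mixed precision on H200-class GPUs
)
```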
Intended Use Cases
Given its fine-tuning on a feedback dataset, this model is likely suitable for applications requiring improved instruction adherence and generation of preferred responses, such as chatbots, content generation, and summarization tasks where response quality and alignment are important.
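A short generation example for chat-style use, continuing from the loading snippet above. Whether the tokenizer ships a chat template (inherited from the UltraChat SFT stage) is an assumption; fall back to a plain prompt string if it does not:

```python
# Chat-style generation; sampling parameters are illustrative, not tuned values.
messages = [{"role": "user", "content": "Summarize the benefits of preference-based fine-tuning."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```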