jackf857/llama-3-8b-base-margin-dpo-4xh100
jackf857/llama-3-8b-base-margin-dpo-4xh100 is an 8-billion-parameter Llama 3 base model fine-tuned with Direct Preference Optimization (DPO) on the HuggingFaceH4/ultrafeedback_binarized dataset. The preference tuning targets improved response quality and alignment, and the model remains suitable for general language understanding and generation tasks.
Model Overview
The jackf857/llama-3-8b-base-margin-dpo-4xh100 is an 8-billion-parameter language model based on the Llama 3 architecture. It is a fine-tuned variant of W-61/llama-3-8b-base-ultrachat-sft-4xh100, enhanced through Direct Preference Optimization (DPO).
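For reference, standard DPO (Rafailov et al., 2023) trains the policy directly on preference pairs, pushing it to rank chosen responses above rejected ones relative to a frozen reference model. The card does not state how the "margin" in the model name modifies this objective, so only the standard loss is shown:

$$
\mathcal{L}_{\mathrm{DPO}}(\theta) = -\,\mathbb{E}_{(x,\, y_w,\, y_l)\sim\mathcal{D}}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]
$$

Here π_θ is the model being trained, π_ref is the frozen SFT reference, (y_w, y_l) is a chosen/rejected response pair, and β is a temperature controlling how far the policy may drift from the reference.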
Key Characteristics
- Base Model: Llama 3 (8B parameters).
- Fine-tuning Method: Utilizes Direct Preference Optimization (DPO) for alignment and quality improvement.
- Training Data: Fine-tuned on the HuggingFaceH4/ultrafeedback_binarized dataset, which is designed for preference-based learning.
- Training Configuration: Trained with a learning rate of 5e-07, a total batch size of 128, and 1 epoch, using the Adam optimizer with a cosine learning rate scheduler (see the sketch below).
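The card does not publish the actual training script, so the following is only a minimal TRL-based sketch of how a DPO run with these hyperparameters could be configured. The model and dataset identifiers come from the card; the per-device batch split (4 GPUs × 8 per device × 4 accumulation steps = 128), the beta value, and the choice of the train_prefs split are assumptions.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# SFT starting point named on the card.
model_name = "W-61/llama-3-8b-base-ultrachat-sft-4xh100"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# ultrafeedback_binarized provides prompt/chosen/rejected columns in its
# preference splits, which is the format DPOTrainer expects.
dataset = load_dataset("HuggingFaceH4/ultrafeedback_binarized", split="train_prefs")

# Hyperparameters from the card: lr 5e-07, total batch size 128, 1 epoch,
# cosine schedule. The 8 x 4 per-device split (assuming 4 GPUs) and beta
# are illustrative, not taken from the card.
training_args = DPOConfig(
    output_dir="llama-3-8b-base-margin-dpo",
    learning_rate=5e-7,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    optim="adamw_torch",  # card says "Adam"; AdamW is the Transformers default
    beta=0.1,             # DPO temperature; TRL default, not stated on the card
)

trainer = DPOTrainer(
    model=model,              # with ref_model unset, TRL clones a frozen reference
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,  # tokenizer= in older TRL versions
)
trainer.train()
```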
Intended Use Cases
This model is designed for applications that need a robust 8B-parameter language model with the response-quality gains of DPO fine-tuning. It is suitable for general-purpose natural language processing tasks, including text generation, summarization, and conversational AI, where aligned, preference-consistent outputs are important.
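As a concrete starting point, the sketch below loads the model with the transformers library and generates text. The generation settings are illustrative defaults, not recommendations from the card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jackf857/llama-3-8b-base-margin-dpo-4xh100"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision fits an 8B model on one modern GPU
    device_map="auto",
)

prompt = "Explain direct preference optimization in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sampling parameters are illustrative; tune them for your task.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```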