etri-xainlp/llama3-8b-dpo_v1

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8K · License: apache-2.0 · Architecture: Transformer · Open Weights · Warm

The etri-xainlp/llama3-8b-dpo_v1 model is an 8 billion parameter language model developed by the ETRI xainlp team, based on Meta's Llama-3-8B architecture. It has been fine-tuned using a combination of supervised fine-tuning (SFT) and Direct Preference Optimization (DPO), both with LoRA, on approximately 1.8 million instruction-following examples and 221,000 user preference examples. The model is designed for text-only input and output, with a focus on improved instruction following and alignment with user preferences.


Model Overview

etri-xainlp/llama3-8b-dpo_v1 is an 8 billion parameter language model developed by the ETRI xainlp team. It is built upon the robust Meta-Llama-3-8B base model, enhancing its capabilities through a specialized fine-tuning process.

Key Capabilities

  • Instruction Following: The model has undergone supervised fine-tuning (SFT) with LoRA using a substantial dataset of 1,821,000 instruction-following examples, significantly improving its ability to understand and execute user commands.
  • Preference Alignment: Further refinement was achieved through Direct Preference Optimization (DPO) with LoRA, utilizing 221,000 user preference examples. This process aligns the model's outputs more closely with human preferences and desired behaviors.
  • Text-to-Text Generation: Designed for text-only input and output, making it suitable for a wide range of natural language processing tasks.
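Since the model is text-in/text-out and derived from Llama-3-8B, prompts are typically assembled in the Llama 3 Instruct chat format. The exact template this fine-tune expects is not documented in the card, so the special tokens below are an assumption based on Meta's published format, and `build_llama3_prompt` is a hypothetical helper:

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Llama 3 Instruct format.
    NOTE: assumed template; verify against the model's tokenizer config."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # Generation continues from the assistant header.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a helpful assistant.",
    "Summarize DPO in one sentence.",
)
```

In practice, `tokenizer.apply_chat_template` from the `transformers` library produces this string automatically when a chat template is bundled with the model.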

Training Details

The fine-tuning process involved a two-stage approach:

  1. SFT + LoRA: Initial training on a large instruction-following dataset.
  2. DPO + LoRA: Subsequent training on a user preference dataset to enhance alignment.
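The second stage optimizes the standard DPO objective: the policy is rewarded for widening its log-probability gap over the reference model on the chosen answer relative to the rejected one. As a sketch of the math only (not the team's training code; `beta=0.1` is an assumed value, matching the common default):

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss: -log(sigmoid(beta * margin)), where the margin
    compares the policy-vs-reference log-prob gaps on the chosen and
    rejected completions."""
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy prefers the chosen answer more strongly than the reference does, the margin is positive and the loss falls below log(2); with no preference shift the loss is exactly log(2).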

Training was conducted on 8 NVIDIA A100 GPUs with 80 GB of memory each, a substantial computational investment for an 8B-parameter fine-tune.

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model cover the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
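As a rough sketch of what the truncation-style parameters above do (the history-dependent penalties — frequency, presence, repetition — are omitted, and the logit values are illustrative, since the widget lists only parameter names), here is a minimal pure-Python temperature/top-k/top-p filter:

```python
import math

def filter_logits(logits, top_k=0, top_p=1.0, temperature=1.0):
    """Turn raw logits into a probability distribution after temperature
    scaling, top-k truncation, and top-p (nucleus) truncation."""
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Token indices sorted by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order)
    if top_k > 0:
        # Keep only the k most probable tokens.
        keep &= set(order[:top_k])
    if top_p < 1.0:
        # Keep the smallest set of tokens whose cumulative mass >= top_p.
        cum, nucleus = 0.0, []
        for i in order:
            nucleus.append(i)
            cum += probs[i]
            if cum >= top_p:
                break
        keep &= set(nucleus)
    # Zero out dropped tokens and renormalize.
    masked = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    z = sum(masked)
    return [p / z for p in masked]
```

Lower `temperature` sharpens the distribution before truncation; `top_k` and `top_p` then discard the low-probability tail before sampling.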