yunjae-won/mpq3_llama8b_sft_dpo_beta1e-1_step3072
The yunjae-won/mpq3_llama8b_sft_dpo_beta1e-1_step3072 is an 8-billion-parameter language model, likely based on the Llama architecture, that has undergone Supervised Fine-Tuning (SFT) followed by Direct Preference Optimization (DPO). The name records a DPO beta value of 1e-1 (0.1) and a checkpoint saved at training step 3072. Because the model card documents neither its characteristics nor its intended applications, it is best treated as a foundational or experimental checkpoint that requires further evaluation before deployment.
Model Overview
The yunjae-won/mpq3_llama8b_sft_dpo_beta1e-1_step3072 is an 8-billion-parameter language model. The model name indicates it was developed by yunjae-won and has undergone a training process of Supervised Fine-Tuning (SFT) followed by Direct Preference Optimization (DPO). The beta1e-1_step3072 suffix suggests a DPO configuration with a beta value of 0.1, trained up to step 3072.
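To make the role of the beta value concrete: in DPO, beta scales how strongly the policy's log-probability ratios (relative to a frozen reference model) are pushed apart for chosen versus rejected responses. Below is a minimal sketch of the standard per-example DPO loss with beta = 0.1, using illustrative log-probabilities; it is not code from this model's training pipeline.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Per-example DPO loss: -log sigmoid(beta * margin), where the margin
    is the policy's log-ratio advantage over the reference model."""
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# With no advantage over the reference, the margin is 0 and the
# loss is log(2) ~ 0.693; a positive margin drives the loss lower.
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 3))  # → 0.693
```

A small beta such as 0.1 keeps the policy close to the reference model, trading sharper preference separation for stability.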
Key Characteristics
- Parameter Count: 8 billion parameters, placing it in the medium-sized LLM category.
- Training Methodology: Supervised Fine-Tuning (SFT) followed by Direct Preference Optimization (DPO), a technique for aligning models with human preferences directly from preference data, without training a separate reward model.
- Context Length: The model supports a context length of 8192 tokens.
Intended Use Cases
The provided model card does not define specific direct or downstream use cases. Models trained with SFT and DPO are generally aimed at improved instruction following, helpfulness, and safety, as judged against preference data, and this model's architecture and training suggest general language understanding and generation capability, with behavior shaped by the particular DPO alignment. Developers would need to evaluate its performance on their own tasks to determine suitability.
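Since the card publishes no benchmarks, suitability has to be established empirically. A minimal sketch of one such task-level check, scoring model outputs against references by exact match; the predictions and references below are hypothetical illustration data, not results from this model:

```python
def exact_match_accuracy(outputs, references):
    """Fraction of outputs that exactly match their reference,
    after trimming surrounding whitespace from both sides."""
    if not references:
        return 0.0
    hits = sum(o.strip() == r.strip() for o, r in zip(outputs, references))
    return hits / len(references)

# Hypothetical evaluation data: two of three outputs match.
preds = ["Paris", "4", "blue whale "]
golds = ["Paris", "5", "blue whale"]
print(exact_match_accuracy(preds, golds))
```

Exact match is only a starting point; instruction-following and helpfulness are usually assessed with held-out preference data or standard benchmark suites rather than string equality.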