Name: ahmadhehe/tinyllama-1.1b-dpo-hh-rlhf API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: ahmadhehe

Model Overview

ahmadhehe/tinyllama-1.1b-dpo-hh-rlhf is a 1.1 billion parameter language model developed by Ahmad Murtaza and Simra Sheikh. It is built upon the TinyLlama-1.1B-Chat-v1.0 base model, which was initially instruction-tuned on the dolly-15k dataset. Its distinguishing feature is a subsequent alignment stage using Direct Preference Optimization (DPO) on the Anthropic/hh-rlhf dataset, intended to produce responses that better match human preferences.

Key Capabilities & Training

Preference Alignment: Applies DPO on the Anthropic/hh-rlhf human preference dataset to improve response quality and safety.
Base Model: Starts from TinyLlama-1.1B-Chat-v1.0, which provides a strong foundation for chat-based interactions.
Training Configuration: Trained with a beta of 0.5, a learning rate of 5e-05, and 1 epoch, building on an SFT-tuned base (SFT-T4).
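
To make the role of beta in the training configuration concrete, here is a minimal sketch of the standard per-example DPO loss. It is not the authors' training code: the function name, argument names, and the sample log-probabilities are hypothetical, and only the beta value of 0.5 comes from the configuration above.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.5):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_ratio - rejected_ratio))).

    Each argument is the summed log-probability of a full response under the
    policy being trained or under the frozen reference (here, the SFT model).
    All names are illustrative assumptions, not taken from the model card.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)); branch for stability.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Hypothetical log-probabilities, for illustration only.
untrained = dpo_loss(-10.0, -12.0, -10.0, -12.0)  # policy equals reference
improved = dpo_loss(-9.0, -13.0, -10.0, -12.0)    # policy favors the chosen reply
print(untrained, improved)
```

When the policy equals the reference, the margin is zero and the loss is log(2). A larger beta scales the margin, so the same shift in log-probabilities moves the loss further and the policy is penalized more sharply for drifting from the reference ordering.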

Performance Highlights

Evaluation on a 10-prompt test set compares the DPO model with the Base and Best SFT models:

BLEU-4 Score: 4.2200, compared with 2.1400 for the Base model and 2.4200 for the Best SFT model.
BERTScore F1: 86.9600, slightly below the Best SFT model (87.1100).
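
The size of these differences can be worked out directly from the reported scores. The sketch below uses only the figures listed above; the helper name relative_gain is an assumption introduced for illustration, and with a 10-prompt test set the results should be read as indicative rather than statistically robust.

```python
# Scores as reported in the listing above.
bleu4 = {"Base": 2.14, "Best SFT": 2.42, "DPO": 4.22}
bertscore_f1 = {"Best SFT": 87.11, "DPO": 86.96}

def relative_gain(new, old):
    """Percentage change of `new` relative to `old`."""
    return (new - old) / old * 100.0

gain_vs_base = relative_gain(bleu4["DPO"], bleu4["Base"])
gain_vs_sft = relative_gain(bleu4["DPO"], bleu4["Best SFT"])
bert_delta = bertscore_f1["DPO"] - bertscore_f1["Best SFT"]

print(f"BLEU-4 gain over Base:     {gain_vs_base:.1f}%")
print(f"BLEU-4 gain over Best SFT: {gain_vs_sft:.1f}%")
print(f"BERTScore F1 difference:   {bert_delta:+.2f} points")
```

In relative terms, BLEU-4 is about 97% higher than the Base model and about 74% higher than the Best SFT model, while BERTScore F1 is 0.15 points lower than the Best SFT model.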

Ideal Use Cases

This model is particularly well-suited for applications requiring:

Chatbots and Conversational AI: Where generating helpful, harmless, and preference-aligned responses is critical.
Preference-Aligned Generation: For tasks where output quality benefits from fine-tuning on human feedback data.
Resource-Constrained Environments: Its 1.1 billion parameters make it a lightweight option for deployment.
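
As a rough guide to what "lightweight" means in practice, the sketch below estimates the memory needed to hold the weights alone at several numeric precisions. It assumes a nominal count of exactly 1.1 billion parameters and ignores activations, the KV cache, and framework overhead, so real usage will be higher. The function name and the precision table are illustrative assumptions.

```python
NUM_PARAMS = 1.1e9  # nominal parameter count from the model name (assumption)

# Bytes per parameter for common storage precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16/bf16": 2.0, "int8": 1.0, "4-bit": 0.5}

def weight_memory_gb(num_params, bytes_per_param):
    """Weights-only memory in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

for precision, nbytes in BYTES_PER_PARAM.items():
    print(f"{precision:>10}: {weight_memory_gb(NUM_PARAMS, nbytes):.2f} GB")
```

Under these assumptions the weights take roughly 2.2 GB in 16-bit precision and roughly 4.4 GB in 32-bit precision.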
