Kyleyee/ORPO_hh-seed2
Kyleyee/ORPO_hh-seed2 is a 1.5-billion-parameter causal language model, fine-tuned from Kyleyee/Qwen2.5-1.5B-sft-hh-3e using the ORPO preference optimization method. Trained on the Kyleyee/train_data_Helpful_drdpo_preference dataset, it specializes in generating helpful, preference-aligned responses, and it supports a context length of 32,768 tokens. It is designed for tasks requiring nuanced, preference-aligned text generation.
Overview
Kyleyee/ORPO_hh-seed2 is a 1.5-billion-parameter language model developed by Kyleyee, fine-tuned from the Qwen2.5-1.5B-sft-hh-3e base model. It leverages ORPO (Odds Ratio Preference Optimization), a monolithic preference-optimization technique that aligns models with human preferences without requiring a separate reference model. Training used the Kyleyee/train_data_Helpful_drdpo_preference dataset and the TRL framework.
Key Capabilities
- Preference-aligned generation: Optimized to produce responses that are helpful and aligned with specified preferences.
- Efficient fine-tuning: Employs the ORPO method, which folds preference alignment into a single training stage by adding an odds-ratio penalty to the supervised fine-tuning loss, removing the need for a separate reference model.
- Causal language modeling: Capable of generating coherent and contextually relevant text based on prompts.
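As a minimal inference sketch, the model can be loaded with the transformers library. The chat-template usage and generation settings below are assumptions (presumed inherited from the Qwen2.5 base), not details documented on this card:

```python
MODEL_ID = "Kyleyee/ORPO_hh-seed2"


def build_chat(prompt: str) -> list:
    # Single-turn conversation in the message format expected by
    # tokenizer.apply_chat_template.
    return [{"role": "user", "content": prompt}]


def generate_reply(prompt: str, max_new_tokens: int = 256) -> str:
    # Imported here so build_chat stays usable without transformers installed;
    # the first call downloads the model weights (~3 GB).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    input_ids = tokenizer.apply_chat_template(
        build_chat(prompt), add_generation_prompt=True, return_tensors="pt"
    )
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens so only the newly generated reply is returned.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
```

For example, `generate_reply("How do I write a polite follow-up email?")` returns a single assistant turn as plain text.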
Good for
- Instruction following: Generating responses that adhere to user instructions and preferences.
- Dialogue systems: Creating more helpful and aligned conversational AI outputs.
- Research in preference optimization: Exploring the application and effectiveness of the ORPO method in smaller models.
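For preference-optimization research, a run like this one can be sketched with TRL's ORPOTrainer. The base model and dataset names come from this card; the hyperparameters are illustrative assumptions, not the values used to train this model:

```python
def orpo_training_args() -> dict:
    # beta weights the odds-ratio penalty relative to the SFT loss in ORPO.
    # All values here are assumed; they are not reported on this card.
    return {
        "output_dir": "orpo-hh",
        "beta": 0.1,
        "learning_rate": 8e-6,
        "num_train_epochs": 1,
    }


def train():
    # Imported here so orpo_training_args stays usable without trl installed.
    from datasets import load_dataset
    from trl import ORPOConfig, ORPOTrainer

    # ORPOTrainer expects a preference dataset with prompt/chosen/rejected
    # columns; this card's dataset is assumed to follow that format.
    dataset = load_dataset(
        "Kyleyee/train_data_Helpful_drdpo_preference", split="train"
    )
    trainer = ORPOTrainer(
        model="Kyleyee/Qwen2.5-1.5B-sft-hh-3e",  # base model from this card
        args=ORPOConfig(**orpo_training_args()),
        train_dataset=dataset,
    )
    trainer.train()
```

Because ORPO needs no reference model, this single trainer replaces the usual SFT-then-DPO two-stage pipeline.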