Kyleyee/HINGE_hh-seed3

Text generation · Model size: 1.5B · Quantization: BF16 · Context length: 32k · Architecture: Transformer · Published: Apr 28, 2026

Kyleyee/HINGE_hh-seed3 is a 1.5 billion parameter language model fine-tuned from Kyleyee/Qwen2.5-1.5B-sft-hh-3e. It was trained using Direct Preference Optimization (DPO) on a helpfulness preference dataset, making it suitable for generating helpful and aligned responses. With a context length of 32768 tokens, this model is designed for conversational AI applications requiring preference-aligned text generation.


Overview

Kyleyee/HINGE_hh-seed3 is a 1.5 billion parameter language model developed by Kyleyee. It is a fine-tuned variant of the Qwen2.5-1.5B-sft-hh-3e base model, specifically optimized for generating helpful responses. The model leverages a substantial 32768 token context length, enabling it to process and generate longer, more coherent texts.
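Below is a minimal usage sketch with the Hugging Face transformers library. The repo id comes from this card; the chat-template call and generation settings are illustrative assumptions, not settings published by the author.

```python
# Minimal usage sketch, assuming a standard transformers setup.
# The repo id is from this card; generation settings are illustrative guesses.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Kyleyee/HINGE_hh-seed3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Qwen2.5-based checkpoints typically ship a chat template; we assume this one does too.
messages = [{"role": "user", "content": "How do I politely decline a meeting invitation?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```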

Key Capabilities

  • Preference-Aligned Generation: The model was trained with Direct Preference Optimization (DPO), a method that aligns language model outputs with human preferences, here specifically for helpfulness (a standard statement of the DPO loss follows this list).
  • Foundation Model: Fine-tuned from the Kyleyee/Qwen2.5-1.5B-sft-hh-3e checkpoint, which provides a robust base for language understanding and generation.
  • Extended Context Window: Supports a context length of 32768 tokens, beneficial for tasks requiring extensive conversational history or detailed input.
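For reference, DPO optimizes the policy directly on preference pairs without training a separate reward model. A standard statement of the loss from the DPO paper, with $\pi_{\mathrm{ref}}$ the frozen SFT reference model and $\beta$ the KL-strength hyperparameter:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
    \left[\log\sigma\!\left(
      \beta\log\frac{\pi_\theta(y_w\mid x)}{\pi_{\mathrm{ref}}(y_w\mid x)}
      -\beta\log\frac{\pi_\theta(y_l\mid x)}{\pi_{\mathrm{ref}}(y_l\mid x)}
    \right)\right]
```

Here $y_w$ and $y_l$ are the preferred and dispreferred responses for prompt $x$; minimizing the loss raises the policy's relative log-probability of preferred responses while the reference model anchors it to the SFT distribution.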

Training Details

The model was fine-tuned with the TRL (Transformer Reinforcement Learning) library on the Kyleyee/train_data_Helpful_drdpo_preference dataset, which provides the chosen/rejected preference pairs that DPO learns from. The DPO method, introduced in the paper "Direct Preference Optimization: Your Language Model is Secretly a Reward Model" (Rafailov et al., 2023), was central to the training process.
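The card does not publish the exact training script or hyperparameters. The following is a minimal sketch of how such a run is typically set up with TRL's DPOTrainer, assuming the dataset follows TRL's prompt/chosen/rejected preference format; the beta value and other settings are illustrative, not the author's.

```python
# Illustrative DPO fine-tuning sketch with TRL; hyperparameters are assumptions,
# not the settings actually used for Kyleyee/HINGE_hh-seed3.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "Kyleyee/Qwen2.5-1.5B-sft-hh-3e"  # SFT base named on this card
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Preference dataset named on this card; assumed to expose prompt/chosen/rejected columns.
dataset = load_dataset("Kyleyee/train_data_Helpful_drdpo_preference", split="train")

training_args = DPOConfig(
    output_dir="HINGE_hh-seed3",
    beta=0.1,                       # KL-strength; illustrative default
    per_device_train_batch_size=2,  # illustrative
    learning_rate=5e-7,             # illustrative
)

trainer = DPOTrainer(
    model=model,                 # ref_model left as None; TRL creates a frozen copy
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,  # recent TRL versions' name for the tokenizer argument
)
trainer.train()
```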

Good For

  • Developing conversational agents that prioritize helpful and aligned responses.
  • Applications requiring text generation where human preferences for helpfulness are critical.
  • Research into preference-based fine-tuning methods for smaller language models.