Momoka1010/dpo-qwen-cot-merged
Text generation · Concurrency cost: 1 · Model size: 4B · Quantization: BF16 · Context length: 32k · Published: Feb 6, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights
Momoka1010/dpo-qwen-cot-merged is a 4-billion-parameter causal language model fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using Direct Preference Optimization (DPO). It is designed to improve instruction following and response quality while building on the base capabilities of the Qwen3 architecture. With a 40,960-token context length, it is suitable for tasks that require detailed, coherent text generation.
Overview
Momoka1010/dpo-qwen-cot-merged is a 4-billion-parameter language model derived from the Qwen3-4B-Instruct-2507 base model. It has been fine-tuned with Direct Preference Optimization (DPO), a method that aligns a model with human preference data directly, without training a separate reward model, and is known to improve instruction-following behavior.
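To make the DPO objective concrete, here is a minimal pure-Python sketch of the per-pair loss. The function name and toy log-probability values are illustrative assumptions, not details of this model's training run; each argument is the summed log-probability of a chosen or rejected response under the policy being trained or the frozen reference model.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch).

    The margin measures how much more strongly the policy prefers the
    chosen response over the rejected one, relative to the reference
    model. The loss is -log(sigmoid(beta * margin)), so it shrinks as
    the policy's preference for the chosen response grows.
    """
    margin = (policy_chosen_logp - ref_chosen_logp) - \
             (policy_rejected_logp - ref_rejected_logp)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy and reference agree exactly, the margin is zero and the loss is ln 2; pushing probability mass toward the chosen response drives the loss toward zero, which is the alignment pressure DPO applies during fine-tuning.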
Key Capabilities
- Enhanced Instruction Following: The DPO fine-tuning process aims to produce responses that are more aligned with user instructions and preferences.
- Qwen3 Architecture: Leverages the robust architecture of the Qwen3 series, providing a strong foundation for various NLP tasks.
- Large Context Window: Supports a context length of 40,960 tokens, enabling the processing and generation of longer, more complex texts.
Good For
- Applications requiring models with improved instruction adherence.
- Tasks benefiting from a large context window, such as summarization of long documents or extended conversational AI.
- Developers looking for a Qwen3-based model with enhanced response quality through DPO.
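For the conversational use cases above, prompts must follow the chat format the base model was trained on. The sketch below assumes the ChatML-style template used by the Qwen series (`<|im_start|>role ... <|im_end|>`); in practice, the tokenizer's built-in `apply_chat_template` is the authoritative way to render prompts, and this helper is only a hypothetical illustration of the format.

```python
def build_chatml_prompt(messages):
    """Render a list of {'role', 'content'} messages into a
    ChatML-style prompt string (illustrative sketch only)."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
             for m in messages]
    # A trailing assistant header cues the model to generate a reply.
    parts.append("<|im_start|>assistant")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this document."},
])
```

Keeping the rendered prompt within the model's context window (minus room for the generated reply) is the caller's responsibility when working with long documents.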