Name: mark-22/dpo-qwen-cot-merged-dataclearn3 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: mark-22

Model Overview

mark-22/dpo-qwen-cot-merged-dataclearn3 is a 4 billion parameter, full-merged 16-bit Qwen3 model, uniquely optimized for strict structured data generation (e.g., JSON, YAML, CSV). Developed for the Matsuo Lab LLM Competition, its primary goal is to eliminate conversational noise and maximize format compliance, directly outputting structured data.

Strategic Training Pipeline

This model employs a rigorous data cleaning process during both Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO):

Supervised Fine-Tuning (SFT): Focused on direct mapping from user queries to structured data. System prompts and Chain-of-Thought (CoT) reasoning traces were physically removed from the training data to force immediate final answer output, reducing token waste and parse errors.
Direct Preference Optimization (DPO): Refined output quality and format adherence. Both chosen and rejected pairs were stripped of CoT and system prompts, ensuring preference learning is based strictly on the content and validity of the structured data itself.

Key Characteristics

Full-Merged 16-bit Weights: No adapters required, optimized for immediate response.
No Conversational Filler: Designed to output structured data directly, avoiding phrases like "Here is the JSON...".
Optimized for Format Compliance: Rigorous data cleaning and training specifically target high adherence to structured data formats.
Context Length: Supports a context length of 40960 tokens.

Ideal Use Cases

This model is particularly well-suited for applications requiring:

Reliable JSON/YAML/CSV Generation: When precise and immediate structured output is critical.
Automated Data Extraction: Converting natural language requests into machine-readable formats.
Integration with APIs: Generating structured payloads directly from user input.

Overview

Model Overview

Strategic Training Pipeline

Key Characteristics

Ideal Use Cases

Full Model Card (README)