Name: koutch/short_paper_qwen_2.json_train_dpo_v2_train_no_think API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: koutch

Model Overview

The koutch/short_paper_qwen_2.json_train_dpo_v2_train_no_think is a 4 billion parameter instruction-tuned language model based on the Qwen3 architecture. Developed by koutch, this model was fine-tuned from unsloth/Qwen3-4B-Instruct-2507.

Key Characteristics

Architecture: Qwen3-based causal language model.
Parameter Count: 4 billion parameters.
Training Efficiency: Fine-tuned using Unsloth and Huggingface's TRL library, which facilitated a 2x faster training process.
License: Released under the Apache-2.0 license.

Intended Use Cases

This model is suitable for a variety of general instruction-following tasks, benefiting from its Qwen3 foundation and efficient fine-tuning. Its 4 billion parameters make it a capable option for applications requiring a balance between performance and computational resources.

Overview

Model Overview

Key Characteristics

Intended Use Cases

Full Model Card (README)