daichira/dpo-qwen-cot-merged
Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Jan 25, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

daichira/dpo-qwen-cot-merged is a 4-billion-parameter, Qwen3-based, instruction-tuned language model finetuned by daichira. Training was accelerated using Unsloth and Hugging Face's TRL library. The model is designed for general language understanding and generation tasks, leveraging the Qwen3 architecture for robust performance.


Overview

daichira/dpo-qwen-cot-merged is a 4-billion-parameter language model developed by daichira. It is a finetuned variant of unsloth/qwen3-4b-instruct-2507-unsloth-bnb-4bit, built on the Qwen3 architecture. Finetuning was approximately 2x faster through the combination of Unsloth and Hugging Face's TRL library.
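The card does not include a usage snippet, so here is a minimal loading-and-generation sketch, assuming the model exposes the standard Hugging Face causal-LM chat interface; the prompt and sampling settings are illustrative.

```python
# Minimal inference sketch; assumes a standard Hugging Face chat model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "daichira/dpo-qwen-cot-merged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 weights listed above
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain gradient descent in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```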

Key Characteristics

  • Base Model: Qwen3-4B-Instruct
  • Parameter Count: 4 billion parameters
  • Context Length: 40960 tokens
  • Training Optimization: finetuning accelerated with Unsloth and Hugging Face TRL (see the sketch after this list).
  • License: Apache-2.0
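
The card names the training stack (Unsloth plus TRL) but not the recipe. Given the "dpo" in the model name, the following is a hedged sketch of what a DPO finetune of the stated base model could look like with that stack; the dataset, LoRA settings, and hyperparameters are illustrative placeholders, not daichira's actual configuration.

```python
# Hypothetical DPO finetuning sketch with Unsloth + TRL; all hyperparameters
# and the preference dataset are placeholders, not the author's recipe.
from unsloth import FastLanguageModel  # import unsloth first so its patches apply

from datasets import load_dataset
from trl import DPOConfig, DPOTrainer

# Load the 4-bit base model the card names as the finetuning starting point.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/qwen3-4b-instruct-2507-unsloth-bnb-4bit",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters; ranks and target modules are illustrative defaults.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# DPO expects preference pairs (prompt, chosen, rejected); this public
# dataset is a stand-in for whatever data was actually used.
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(
        output_dir="dpo-qwen-cot",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        beta=0.1,       # strength of the KL-style preference regularizer
        max_steps=500,
    ),
    train_dataset=dataset,
    processing_class=tokenizer,
)
trainer.train()
```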

Intended Use Cases

This model is suited to general-purpose language tasks such as instruction following and text generation, benefiting from its Qwen3 foundation and efficient finetuning. The optimized training pipeline points to strong performance within a 4B-parameter footprint, making the model a candidate for applications where computational efficiency during development is a priority.