fumikawa/a25-v0005

Text generation · Model size: 4B · Quant: BF16 · Ctx length: 32k · Published: Feb 6, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

The fumikawa/a25-v0005 model is a 4 billion parameter instruction-tuned causal language model, fine-tuned by fumikawa from Qwen/Qwen3-4B-Instruct-2507. It utilizes Direct Preference Optimization (DPO) to enhance reasoning capabilities, specifically Chain-of-Thought, and improve structured response quality. With a context length of 40960 tokens, this model is optimized for tasks requiring logical deduction and coherent, well-structured outputs. It is suitable for applications where precise and reasoned responses are critical.


Model Overview

fumikawa/a25-v0005 is a 4 billion parameter language model, fine-tuned by fumikawa from the Qwen/Qwen3-4B-Instruct-2507 base model. This model leverages Direct Preference Optimization (DPO), implemented via the Unsloth library, to align its responses with preferred outputs. It is provided as full-merged 16-bit weights, eliminating the need for adapter loading.
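Because the weights ship fully merged in 16-bit, the model can be loaded directly with `transformers` with no PEFT/adapter step. A minimal loading sketch (the repository ID comes from this card; dtype and device placement are illustrative choices, not requirements):

```python
# Loading sketch for the full-merged BF16 weights.
# No adapter attachment (e.g. peft.PeftModel.from_pretrained) is needed.
MODEL_ID = "fumikawa/a25-v0005"

def load_model():
    """Load tokenizer and model directly from the merged checkpoint."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # weights are published in BF16
        device_map="auto",           # place layers across available devices
    )
    return model, tokenizer

# Usage:
# model, tokenizer = load_model()
# inputs = tokenizer("Explain step by step: ...", return_tensors="pt").to(model.device)
# output_ids = model.generate(**inputs, max_new_tokens=512)
# print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```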

Key Optimizations

The primary objective of this fine-tuning was to significantly improve the model's reasoning capabilities, particularly in generating Chain-of-Thought explanations, and to enhance the overall structured response quality. This was achieved by training on a specific preference dataset (u-10bei/dpo-dataset-qwen-cot) for 1 epoch with a learning rate of 1e-07 and a beta value of 0.1.
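For readers unfamiliar with DPO, the objective it optimizes during this fine-tuning can be sketched per example as `-log sigmoid(beta * ((log pi(y_w) - log pi_ref(y_w)) - (log pi(y_l) - log pi_ref(y_l))))`, where `y_w`/`y_l` are the chosen and rejected responses and `beta` is the 0.1 value noted above. A minimal numeric sketch (function name and log-probability inputs are illustrative):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log sigmoid(beta * (policy margin - reference margin)).

    A larger gap between the policy's preference for the chosen response
    (relative to the frozen reference model) and its preference for the
    rejected response drives the loss toward zero.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp      # log pi(y_w) - log pi_ref(y_w)
    rejected_ratio = policy_rejected_logp - ref_rejected_logp  # log pi(y_l) - log pi_ref(y_l)
    logits = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))  # -log(sigmoid(logits))

# Before any training, policy == reference, so the loss starts at log 2:
# dpo_loss(-10.0, -12.0, -10.0, -12.0) ~= 0.693
```

In practice the trainer (here Unsloth's DPO integration) computes these sequence log-probabilities from the policy and a frozen reference copy of the base model; the sketch only shows the scalar objective.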

Technical Specifications

  • Base Model: Qwen/Qwen3-4B-Instruct-2507
  • Fine-tuning Method: DPO
  • Parameter Count: 4 Billion
  • Max Sequence Length: 1024 (during DPO training)
  • Context Length: 40960 tokens (inherited from base model)
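Since the model is fine-tuned from a Qwen3 instruct checkpoint, prompts should follow the base model's chat format; `tokenizer.apply_chat_template` is the authoritative way to produce it. As a rough sketch of what that template looks like, assuming the ChatML-style tags used by the Qwen instruct family:

```python
def build_chatml_prompt(messages):
    """Format chat messages in the ChatML style used by Qwen instruct models.

    Illustrative only -- prefer tokenizer.apply_chat_template(messages,
    add_generation_prompt=True) with the model's own tokenizer.
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "".join(parts)
```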

Recommended Use Cases

This model is particularly well-suited for applications that require:

  • Enhanced Reasoning: Generating logical, step-by-step explanations (Chain-of-Thought).
  • Structured Output: Producing responses that adhere to specific formats or structures.
  • Instruction Following: Executing complex instructions with improved accuracy and coherence.
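When relying on the structured-output strength above, it is still worth validating model responses before consuming them, since generations can wrap JSON in prose or a fenced block. A small hypothetical helper for that pattern:

```python
import json

def parse_json_response(text):
    """Extract and parse a JSON object from a model response.

    Handles the common case where the model surrounds the JSON with
    explanatory prose or a ```json fenced block by slicing from the
    first '{' to the last '}'.
    """
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in response")
    return json.loads(text[start:end + 1])
```

Pairing a helper like this with a retry on `ValueError`/`JSONDecodeError` is a simple way to make structured-output pipelines robust.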

Licensing

The model is released under the MIT License, matching the license of its training dataset. Users must also comply with the license terms of the original base model.