Overview
This model, TakaTaka3/Qwen3-4B-Instruct-2507-sft-merged_V2, is a 4 billion parameter language model based on the Qwen3 architecture. It has been fine-tuned by TakaTaka3 from the Qwen/Qwen3-4B-Instruct-2507 base model using QLoRA (4-bit quantization with Unsloth). The fine-tuning process specifically merged the base model with a LoRA adapter (TakaTaka3/qwen3-4b-lora-adapter_V4).
Key Capabilities
- Enhanced Structured Output: The primary objective of this fine-tuning was to significantly improve the model's accuracy in generating structured data formats, including JSON, YAML, XML, TOML, and CSV.
- Chain-of-Thought (CoT) Optimization: During training, loss was applied only to the final assistant output, with intermediate reasoning (Chain-of-Thought) masked. This approach aims to refine the direct output quality for structured tasks.
- Efficient Fine-tuning: Utilizes QLoRA with 4-bit quantization, making the fine-tuning process more memory-efficient while maintaining performance.
Training Configuration Highlights
- Base Model: Qwen/Qwen3-4B-Instruct-2507
- Method: QLoRA (4-bit)
- Max Sequence Length: 2048 tokens (for training)
- Learning Rate: 2e-06
- LoRA Parameters: r=64, alpha=128
- Training Data: The model was trained using the u-10bei/structured_data_with_cot_dataset_512_v2 dataset, which is distributed under the MIT License.
Good For
This model is particularly well-suited for applications requiring reliable and accurate generation of structured data, such as API response generation, data extraction into specific formats, or configuration file creation.