deepkick/qwen3-4b-structured-sft-lora-v07-merged

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Mar 1, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The deepkick/qwen3-4b-structured-sft-lora-v07-merged model is a 4 billion parameter language model, fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using QLoRA. It is specifically optimized for structured data tasks, leveraging a dataset designed for structured data with Chain-of-Thought (CoT) reasoning. This model is particularly suited for applications requiring precise extraction and generation of structured information, maintaining CoT during training.

Loading preview...

Model Overview

deepkick/qwen3-4b-structured-sft-lora-v07-merged is a 4 billion parameter language model, fully merged from its base model and a LoRA fine-tuning. It is built upon the Qwen/Qwen3-4B-Instruct-2507 architecture, indicating a foundation in the Qwen3 series known for its strong performance.

Key Capabilities & Training

This model has been fine-tuned using QLoRA (4-bit), a memory-efficient method for adapting large language models. The training focused on a specific dataset, u-10bei/structured_data_with_cot_dataset_512_v2, which comprises 3933 entries. A notable aspect of its training configuration is the MASK_COT: 1 setting, which means Chain-of-Thought (CoT) reasoning was preserved and masked during loss calculation, suggesting an emphasis on maintaining and utilizing reasoning steps for structured outputs.

Differentiators

The v07 iteration specifically increased the SFT Learning Rate (LR) from 2e-6 to 2e-5 compared to v03, indicating an aggressive learning approach to better capture the nuances of structured data. Other parameters, such as LoRA r=64 and alpha=128, and 2 training epochs, remained consistent. This model is designed for tasks that benefit from structured data processing and explicit reasoning paths.