# Model Overview
This model, kikansha-Tomasu/Qwen3-4B-Instruct-2507-sft, is a 4-billion-parameter instruction-tuned language model. It was fine-tuned from the base model Qwen/Qwen3-4B-Instruct-2507 using QLoRA (4-bit quantization with Unsloth), and the LoRA adapters have been merged into the full model weights, so it can be used directly without loading a separate base model.
## Key Capabilities
- Enhanced Structured Output: The primary objective of this fine-tuning was to significantly improve the model's accuracy when generating structured data formats such as JSON, YAML, XML, TOML, and CSV.
- Targeted Training: Loss was applied only to the final assistant output; intermediate Chain-of-Thought reasoning was masked during training so the model learns to focus on the desired output format.
- Efficient Fine-tuning: Trained with QLoRA (4-bit quantization via Unsloth), which keeps GPU memory requirements low during fine-tuning.
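The loss-masking scheme described above can be sketched as follows. This is an illustrative example, not the model's actual training code: it assumes a conversation already tokenized into `(role, token_ids)` segments, and uses `-100` as the label value that cross-entropy loss ignores (the PyTorch convention).

```python
IGNORE_INDEX = -100  # conventional "ignore" label for cross-entropy loss

def build_labels(segments):
    """Given (role, token_ids) segments of one conversation, return
    (input_ids, labels) where only the final assistant segment keeps
    its labels; all earlier segments (prompts, intermediate reasoning)
    are masked out of the loss."""
    # Index of the last assistant segment: the final answer we train on.
    last_assistant = max(
        i for i, (role, _) in enumerate(segments) if role == "assistant"
    )
    input_ids, labels = [], []
    for i, (role, ids) in enumerate(segments):
        input_ids.extend(ids)
        if i == last_assistant:
            labels.extend(ids)                        # keep loss here
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # masked from loss

    return input_ids, labels

# Hypothetical tokenized conversation (token ids are made up):
segments = [
    ("user", [1, 2]),
    ("assistant", [3, 4]),   # intermediate reasoning turn, masked
    ("user", [5]),
    ("assistant", [6, 7]),   # final answer, trained on
]
input_ids, labels = build_labels(segments)
```

In this sketch only tokens `[6, 7]` contribute to the loss; the earlier assistant turn stands in for masked Chain-of-Thought content.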
## Training Details
The model was trained for 1 epoch with a learning rate of 1e-6 and a maximum sequence length of 512. The LoRA configuration used r=64 and alpha=128. Training data consisted of several structured-data datasets, including u-10bei/structured_data_with_cot_dataset_512_v2 and daichira/structured-3k-mix-sft, all distributed under the MIT License.
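For reference, the hyperparameters above can be summarized in a PEFT-style configuration sketch. The values come from this card; the dictionary structure itself is illustrative, not the exact training script:

```python
# Illustrative summary of the training setup described above;
# not the exact script used to train this model.
lora_config = {
    "r": 64,              # LoRA rank
    "lora_alpha": 128,    # LoRA alpha
    "load_in_4bit": True, # QLoRA: base weights quantized to 4 bits
}

training_config = {
    "num_train_epochs": 1,
    "learning_rate": 1e-6,
    "max_seq_length": 512,
}

# The effective scaling applied to the LoRA update is alpha / r.
scaling = lora_config["lora_alpha"] / lora_config["r"]  # = 2.0
```

With alpha twice the rank, the adapter update is scaled by a factor of 2, a common choice in LoRA fine-tuning.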
## Good For
- Applications requiring precise and accurate generation of structured data (e.g., API responses, configuration files, data serialization).
- Tasks where the output format is critical and needs to adhere strictly to specifications like JSON or YAML.
- Developers looking for a compact 4B-parameter model optimized for structured output generation.
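When the output format is critical, it is worth validating generated text before downstream use. Below is a minimal validation sketch; the model call is mocked with a hard-coded string standing in for whatever generation API you use:

```python
import json

def parse_model_json(text):
    """Extract and parse a JSON object from model output.
    Returns the parsed object, or raises ValueError if the
    output does not contain valid JSON."""
    # Models sometimes wrap JSON in a markdown fence; strip it.
    stripped = text.strip()
    if stripped.startswith("```"):
        stripped = stripped.strip("`")
        # Drop an optional language tag like "json" after the fence.
        if stripped.startswith("json"):
            stripped = stripped[len("json"):]
    try:
        return json.loads(stripped)
    except json.JSONDecodeError as e:
        raise ValueError(f"model output is not valid JSON: {e}") from e

# Hypothetical model output (stand-in for a real generation call):
raw = '```json\n{"status": "ok", "items": [1, 2, 3]}\n```'
data = parse_model_json(raw)
```

A check like this turns silent format drift into an explicit error, which is usually what you want in API-response or configuration-generation pipelines.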