Rakushaking/Qwen4b-SFT-d9-merged-after-dpo-toml-xml-yaml-dpo
Text generation · Concurrency cost: 1 · Model size: 4B · Quant: BF16 · Context length: 32k · Published: Feb 8, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

Rakushaking/Qwen4b-SFT-d9-merged-after-dpo-toml-xml-yaml-dpo is a 4-billion-parameter Qwen3-based instruction-tuned language model, fine-tuned with Direct Preference Optimization (DPO) for structured data generation. It specializes in producing clean, well-formed output in formats such as TOML, YAML, XML, JSON, and CSV, avoiding common failure modes like malformed syntax or extraneous surrounding text. The model is aimed at developers who need reliably structured output from an LLM.
