dataslab/DLM-NL2JSON-4B

TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Mar 19, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

DLM-NL2JSON-4B is a 4-billion parameter Qwen3-4B LoRA-merged model developed by Data Science Lab., Ltd. It is specifically fine-tuned for structured JSON extraction from Korean natural language queries, achieving 94.4% accuracy and outperforming GPT-4o and Qwen3.5-35B on its specialized task. This model excels at converting Korean public/economic data queries into predefined JSON schemas for the Busan Metropolitan City Big Data Wave analytics service.

Loading preview...

Overview

DLM-NL2JSON-4B is a specialized 4-billion parameter model developed by Data Science Lab., Ltd. It is a LoRA-merged Qwen3-4B fine-tuned for extracting structured JSON from Korean natural language queries. This model is designed for a specific production service, the Busan Metropolitan City public data analytics service, and is not a general-purpose NL-to-JSON converter.

Key Capabilities & Performance

This model demonstrates exceptional performance on its target task, achieving 94.4% accuracy (96.8% adjusted) on 2,041 test samples. It significantly outperforms larger models like GPT-4o (80.5%) and Qwen3.5-35B (72.2%) in its domain. DLM-NL2JSON-4B shows particularly strong gains in categories like population patterns (ALP) and credit statistics, winning 8 out of 10 evaluated categories.

Important Considerations:

  • Service-Specific: This model is trained exclusively for a fixed set of predefined schemas and will not generalize to arbitrary JSON schemas or different prompt formats.
  • Strict Usage Requirements: Users must employ the exact system prompts and include corresponding special tokens (e.g., <TASK_CSM>) for correct operation.
  • Korean Only: All training data and prompts are in Korean.

Intended Use

This model is ideal for converting Korean natural language queries about public and economic data into structured JSON, specifically within the context of the Busan Metropolitan City Big Data Wave analytics dashboard. It serves as a reference for the effectiveness of domain-specific fine-tuning for constrained structured output tasks, enabling smaller, more efficient models to surpass general-purpose LLMs in specialized applications.