est-ai/alan-llm-jeju-dialect-v1-4b

Hugging Face
Text generation · Concurrency cost: 1 · Model size: 4B · Quant: BF16 · Context length: 32k · Published: Jan 23, 2026 · License: apache-2.0 · Architecture: Transformer

The Alan LLM Jeju Dialect v1 4B model, developed by ESTsoft, is a 4 billion parameter causal language model based on Qwen3-4B. It is specifically fine-tuned using LoRA for bidirectional translation between Jeju dialect and standard Korean, as well as for Jeju dialect conversation generation and QA. This model demonstrates that a lightweight LLM can outperform larger general-purpose models when specialized for a target language, offering robust performance for Jeju dialect-specific tasks.


Alan LLM Jeju Dialect v1 4B: Specialized for Jeju Dialect

Developed by ESTsoft, the Alan LLM Jeju Dialect v1 4B is a 4 billion parameter causal language model built upon the Qwen3-4B architecture. It has undergone LoRA fine-tuning with Jeju dialect data, enabling it to excel in tasks requiring understanding and generation of the Jeju dialect.

Key Capabilities

  • Bidirectional Translation: Converts text between Jeju dialect and standard Korean.
  • Jeju Dialect Conversation: Generates natural conversations with a Jeju dialect speaker persona.
  • Jeju Dialect QA: Provides answers to questions posed in Jeju dialect.
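As a minimal sketch, a chat-style request for the translation task could be assembled as below. The message structure and the instruction wording are assumptions for illustration; they are not documented prompt formats for this model.

```python
def build_translation_messages(text: str, to_standard: bool = True) -> list[dict]:
    """Assemble a chat-style prompt asking the model to translate
    between Jeju dialect and standard Korean (illustrative only)."""
    instruction = (
        "Translate the following Jeju dialect sentence into standard Korean."
        if to_standard
        else "Translate the following standard Korean sentence into Jeju dialect."
    )
    return [{"role": "user", "content": f"{instruction}\n\n{text}"}]

# Example: translate a well-known Jeju dialect greeting into standard Korean.
messages = build_translation_messages("혼저 옵서예!")
```

A list of role/content dicts like this could then be rendered with the tokenizer's chat template before generation.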

Performance and Specialization

This model demonstrates that a 4B parameter specialized LLM can surpass much larger general-purpose models (20B-120B parameters) in Jeju dialect generation: in LLM Judge evaluations it achieved a 96.2% win rate against the base Qwen3-4B. Despite its specialization, the model maintains strong performance on general Korean benchmarks, scoring 0.5729 on CLICK and 0.5481 on HAERAE. It supports a 32,768-token context length, and the developers recommend specific generation settings (temperature=0.4, repetition_penalty=1.1) for stable, high-quality output.
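The recommended decoding settings can be collected into a single kwargs dict and passed to `model.generate` (or a `transformers` pipeline). This is a sketch, not the developers' reference code: the repository id in the comments is inferred from the page slug, and every value other than temperature and repetition_penalty is an illustrative default.

```python
# Decoding settings recommended by the model card; the other values
# are illustrative assumptions, not documented recommendations.
GENERATION_KWARGS = {
    "temperature": 0.4,         # recommended by the model card
    "repetition_penalty": 1.1,  # recommended by the model card
    "do_sample": True,          # sampling must be on for temperature to apply
    "max_new_tokens": 512,      # assumption; the context window is 32,768 tokens
}

# Sketch of how the settings would be used (requires `transformers`;
# the repo id is a guess based on the page slug and should be verified):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tok = AutoTokenizer.from_pretrained("est-ai/alan-llm-jeju-dialect-v1-4b")
# model = AutoModelForCausalLM.from_pretrained("est-ai/alan-llm-jeju-dialect-v1-4b")
# out = model.generate(**tok(prompt, return_tensors="pt"), **GENERATION_KWARGS)
```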

Limitations

Users should be aware of a potential performance asymmetry on non-Jeju-dialect tasks, of biases inherited from the training data, and of the fact that the model may not cover all regional variations of the Jeju dialect. Like all generative models, it is also susceptible to hallucination.