dh82/123456

Vision · Concurrency Cost: 1 · Model Size: 4.3B · Quant: BF16 · Ctx Length: 32k · Published: Mar 27, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

dh82/123456 is a 4.3 billion parameter instruction-tuned causal language model based on Google's Gemma-3-4b-it architecture. The model focuses primarily on Korean language processing and was trained on a specialized Korean dataset. With a context length of 32768 tokens, it is designed for text generation tasks in the Korean linguistic domain, and its development aims to provide a foundation for further specialized Korean-language applications.


Overview

dh82/123456 is a 4.3 billion parameter language model built upon the google/gemma-3-4b-it base architecture. It is specifically instruction-tuned and primarily targets the Korean language, as indicated by its training on the hyokwan/test_data dataset, which includes Korean content. The model supports a substantial context length of 32768 tokens, enabling it to process and generate longer sequences of text.
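Because the model inherits Gemma-3's instruction tuning, prompts are typically rendered in Gemma's turn-based chat format (in practice, a Hugging Face tokenizer's `apply_chat_template` would produce this automatically). The sketch below illustrates that format; the helper name and the example message are illustrative, not part of the model card:

```python
def build_gemma_prompt(messages):
    """Render a list of {role, content} dicts in Gemma's turn-based
    chat format. Gemma names the assistant role "model"."""
    parts = []
    for msg in messages:
        role = "model" if msg["role"] == "assistant" else msg["role"]
        parts.append(f"<start_of_turn>{role}\n{msg['content']}<end_of_turn>\n")
    # Leave the final model turn open so generation continues from here.
    parts.append("<start_of_turn>model\n")
    return "".join(parts)

prompt = build_gemma_prompt([
    # Korean example: "What is the capital of Korea?"
    {"role": "user", "content": "한국의 수도는 어디인가요?"},
])
```

The resulting string would then be tokenized and passed to the model for generation.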

Key Capabilities

  • Korean Language Processing: Optimized for understanding and generating text in Korean.
  • Text Generation: Capable of various text generation tasks, leveraging its instruction-tuned nature.
  • Extended Context Window: Processes inputs up to 32768 tokens, beneficial for complex or lengthy Korean texts.
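The 32768-token window is shared between the prompt and the generated continuation, so long Korean documents may need trimming before inference. A minimal sketch of one common budgeting approach; the 1024-token generation budget is an arbitrary assumption, and real code would operate on tokenizer output rather than a plain list:

```python
CTX_LEN = 32768    # model context window (32k tokens)
RESERVED = 1024    # illustrative budget reserved for generated tokens

def fit_to_context(token_ids, max_new_tokens=RESERVED, ctx_len=CTX_LEN):
    """Trim a token sequence so prompt + generation fits in the window.
    Keeps the most recent tokens, which usually matter most in chat."""
    budget = ctx_len - max_new_tokens
    return token_ids[-budget:] if len(token_ids) > budget else token_ids

long_prompt = list(range(40000))   # stand-in for an over-long tokenized prompt
trimmed = fit_to_context(long_prompt)
```

Keeping the tail of the sequence preserves the latest turns of a conversation; a summarization or retrieval step would be needed if earlier content must survive.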

Good For

  • Korean-centric Applications: Ideal for use cases requiring robust performance in the Korean language.
  • Instruction-Following Tasks: Suitable for applications where the model needs to adhere to specific instructions for text generation.
  • Research and Development: Serves as a base for further fine-tuning or experimentation in the Korean NLP domain; a newer version, hyokwan/essential_health, is also available.