Markr-AI/Gukbap-Gemma2-9B

TEXT GENERATIONConcurrency Cost:1Model Size:9BQuant:FP8Ctx Length:16kPublished:Aug 29, 2024Architecture:Transformer0.0K Cold

Markr-AI/Gukbap-Gemma2-9B is a 9 billion parameter Ko-Gemma2-9B model developed by HumanF-MarkrAI, fine-tuned for Korean language tasks. It achieves the highest score among Korean-based LLMs on the LogicKor evaluation (8.77), surpassing GPT-4 in this benchmark. This model is notable for being trained exclusively on a proprietary dataset generated using open-source models, avoiding data derived from private LLMs. It is optimized for general-purpose Korean language understanding and generation, demonstrating strong performance across reasoning, writing, and coding categories.

Loading preview...

Overview

Markr-AI/Gukbap-Gemma2-9B is a 9 billion parameter Ko-Gemma2-9B model developed by HumanF-MarkrAI, fine-tuned specifically for the Korean language. It is built upon google/gemma-2-9b-it and features a context length of 8192 tokens. A key differentiator of this model is its training methodology: it was developed using a proprietary dataset generated solely through open-source models, thereby avoiding potential terms of service violations associated with using data from private LLMs like GPT-4.

Key Capabilities & Performance

This model has achieved an impressive 8.77 overall score on the LogicKor evaluation, which is the highest among Korean-based LLMs. This score places its performance on par with Google's Gemini-1.5 and in a similar range to OpenAI's GPT-4-Turbo for Korean language tasks. Specific strengths include:

  • Reasoning: Achieved 9.57 on LogicKor's reasoning sub-score.
  • Writing: Scored 9.64 in writing tasks.
  • Coding: Demonstrated strong performance with a 9.50 score in coding.
  • Understanding: Scored 9.71 in understanding.

Training Methodology

The Gukbap-Series LLM, including this model, was developed using data processing and supervised fine-tuning (SFT) methods inspired by LIMA and WizardLM. The training dataset, named "Wizard-Korea-Datasets," was created using microsoft/WizardLM-2-8x22B via DeepInfra, employing an "Evolving system" approach. This demonstrates the potential to create robust, general-purpose LLMs using only open-source generated datasets.

Good For

  • Applications requiring high-performance Korean language understanding and generation.
  • Developers seeking an open-source friendly LLM solution that avoids reliance on data from proprietary models.
  • Tasks involving Korean reasoning, creative writing, and coding assistance.