Name: Markr-AI/Gukbap-Gemma2-9B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Markr-AI

Overview

Markr-AI/Gukbap-Gemma2-9B is a 9 billion parameter Ko-Gemma2-9B model developed by HumanF-MarkrAI, fine-tuned specifically for the Korean language. It is built upon google/gemma-2-9b-it and features a context length of 8192 tokens. A key differentiator of this model is its training methodology: it was developed using a proprietary dataset generated solely through open-source models, thereby avoiding potential terms of service violations associated with using data from private LLMs like GPT-4.

Key Capabilities & Performance

This model has achieved an impressive 8.77 overall score on the LogicKor evaluation, which is the highest among Korean-based LLMs. This score places its performance on par with Google's Gemini-1.5 and in a similar range to OpenAI's GPT-4-Turbo for Korean language tasks. Specific strengths include:

Reasoning: Achieved 9.57 on LogicKor's reasoning sub-score.
Writing: Scored 9.64 in writing tasks.
Coding: Demonstrated strong performance with a 9.50 score in coding.
Understanding: Scored 9.71 in understanding.

Training Methodology

The Gukbap-Series LLM, including this model, was developed using data processing and supervised fine-tuning (SFT) methods inspired by LIMA and WizardLM. The training dataset, named "Wizard-Korea-Datasets," was created using microsoft/WizardLM-2-8x22B via DeepInfra, employing an "Evolving system" approach. This demonstrates the potential to create robust, general-purpose LLMs using only open-source generated datasets.

Good For

Applications requiring high-performance Korean language understanding and generation.
Developers seeking an open-source friendly LLM solution that avoids reliance on data from proprietary models.
Tasks involving Korean reasoning, creative writing, and coding assistance.

Overview

Overview

Key Capabilities & Performance

Training Methodology

Good For

Full Model Card (README)