allganize/Llama-3-Alpha-Ko-8B-Evo

Text Generation · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: May 23, 2024 · License: llama3 · Architecture: Transformer

allganize/Llama-3-Alpha-Ko-8B-Evo is an 8-billion-parameter language model developed by allganize, built from Meta-Llama-3-8B using Evolutionary Model Merging. This 'Evo' model is intended as a base for further fine-tuning and handles complex language tasks and logical reasoning in both Korean and English. It performs strongly on benchmarks such as LogicKor, making it suitable for diverse and demanding bilingual applications.


allganize/Llama-3-Alpha-Ko-8B-Evo: An Evolutionary Bilingual Base Model

Alpha-Ko-8B-Evo is an 8 billion parameter language model developed by allganize, leveraging the Evolutionary Model Merging technique. This model serves as a foundational 'Evo' version, intended for further fine-tuning to specific tasks, while its instruction-tuned counterpart, Alpha-Instruct, is recommended for general chat.
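Because the Evo checkpoint is a plain Llama-3-8B-shaped base model, it loads like any other causal LM with the Hugging Face transformers library. The snippet below is a minimal sketch, assuming a recent transformers release and a GPU able to hold the 8B weights in bfloat16; only the model id comes from this page, everything else is illustrative.

```python
# Minimal sketch: load the Evo base model and run a raw text completion.
# Assumes transformers >= 4.40 and a GPU with enough memory for 8B bf16 weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allganize/Llama-3-Alpha-Ko-8B-Evo"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# As a base model it continues text rather than following chat instructions;
# for chat-style use the authors recommend the Alpha-Instruct variant instead.
prompt = "대한민국의 수도는"  # "The capital of South Korea is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same checkpoint can be passed directly to a fine-tuning framework (e.g. a standard supervised fine-tuning loop) in place of Meta-Llama-3-8B.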

Key Capabilities & Development:

  • Bilingual Proficiency: Demonstrates strong capabilities in both Korean and English.
  • Evolutionary Model Merging: Developed by merging Meta-Llama-3-8B, Meta-Llama-3-8B-Instruct, and Llama-3-Open-Ko-8B, guided by a 1:1 mix of task-specific datasets from KoBEST and Haerae (a toy merge sketch follows this list).
  • Enhanced Human Preference: Utilizes specialized datasets, including Korean-Human-Judgements and Orca-Math, to 'heal' model output and boost human preference scores.
  • Logical Reasoning: Strong LogicKor performance, with the Evo base scoring 5.19 and the instruction-tuned Alpha-Instruct reaching 6.60, rivaling 70B-class models (see Benchmark Highlights below).
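The exact merge recipe is not published on this page, so the following is only a toy illustration of the underlying idea: the three parent checkpoints are blended in weight space, and in the real evolutionary setup the blend coefficients would be searched against KoBEST/Haerae scores rather than fixed by hand. The parent repo ids and coefficients below are assumptions.

```python
# Toy sketch of weight-space merging (NOT the authors' actual recipe).
# Real Evolutionary Model Merging searches the blend coefficients with an
# evolutionary algorithm against task benchmarks; here they are hard-coded.
import torch
from transformers import AutoModelForCausalLM

parents = {
    "meta-llama/Meta-Llama-3-8B": 0.4,           # assumed coefficient
    "meta-llama/Meta-Llama-3-8B-Instruct": 0.3,  # assumed coefficient
    "beomi/Llama-3-Open-Ko-8B": 0.3,             # assumed coefficient
}

merged_state = None
for parent_id, weight in parents.items():
    parent = AutoModelForCausalLM.from_pretrained(parent_id, torch_dtype=torch.bfloat16)
    state = parent.state_dict()
    if merged_state is None:
        merged_state = {k: weight * v.float() for k, v in state.items()}
    else:
        for k, v in state.items():
            merged_state[k] += weight * v.float()
    del parent, state  # free memory before loading the next parent

# Materialise the merged weights into a fresh copy of the base architecture.
merged = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.float32
)
merged.load_state_dict(merged_state)
merged.save_pretrained("./alpha-ko-merged-toy")
```

In practice this kind of merge is usually driven by a dedicated tool (e.g. mergekit) that also handles per-layer coefficients and the evolutionary search loop; the sketch above only shows the simplest uniform linear blend.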

Benchmark Highlights:

  • LogicKor: Alpha-Ko-Evo scored 5.190 overall, showcasing strong logical reasoning. The instruction-tuned variant, Alpha-Ko-Instruct, achieved an even higher 6.600.
  • KoBEST (5-shot accuracy): Demonstrated competitive performance, with an overall score of 0.7229, notably excelling in kobest_boolq (0.8547) and kobest_sentineg (0.9471); a reproduction sketch follows this list.
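The KoBEST numbers can be approximated with EleutherAI's lm-evaluation-harness, which ships Korean KoBEST tasks such as kobest_boolq and kobest_sentineg. The sketch below assumes lm-eval 0.4+ and its Python API; exact scores may differ from the table above depending on harness version and hardware.

```python
# Sketch: 5-shot KoBEST evaluation with lm-evaluation-harness (lm-eval >= 0.4 assumed).
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=allganize/Llama-3-Alpha-Ko-8B-Evo,dtype=bfloat16",
    tasks=["kobest_boolq", "kobest_copa", "kobest_hellaswag", "kobest_sentineg", "kobest_wic"],
    num_fewshot=5,
    batch_size=8,
)

# Print per-task metrics (accuracy / F1 depending on the task).
for task, metrics in results["results"].items():
    print(task, metrics)
```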

This model is ideal for developers looking for a robust bilingual base model that combines strong reasoning with human-aligned output, particularly for applications requiring high performance in Korean and English language tasks.