allganize/Llama-3-Alpha-Ko-8B-Instruct

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:May 23, 2024License:llama3Architecture:Transformer0.1K Warm

allganize/Llama-3-Alpha-Ko-8B-Instruct is an 8 billion parameter language model developed by Allganize, built upon the Llama-3-8B architecture using an Evolutionary Model Merging technique. This model demonstrates strong bilingual capabilities in Korean and English, excelling in complex language tasks and logical reasoning. It achieves an impressive 6.62 on the LogicKor benchmark, rivaling the performance of larger 70B models, making it highly efficient for demanding language applications.

Loading preview...

Alpha-Instruct: A Bilingual Llama-3 Model for Korean and English

Alpha-Instruct, developed by Allganize, is an 8 billion parameter language model built on the Llama-3 architecture. It leverages an innovative Evolutionary Model Merging technique, combining Meta-Llama-3-8B, Meta-Llama-3-8B-Instruct, and Llama-3-Open-Ko-8B to achieve strong bilingual performance.

Key Capabilities & Differentiators

  • Exceptional Logical Reasoning: Alpha-Instruct scores 6.62 on the LogicKor benchmark, a performance comparable to 70B models, highlighting its advanced computational and reasoning skills.
  • Bilingual Proficiency: The model demonstrates strong capabilities in both Korean and English, making it suitable for diverse language tasks.
  • Human Preference Optimization: It was refined using high-quality, curated datasets like Korean-Human-Judgements and Orca-Math, and optimized with ORPO for improved human preference scores and real-life applicability.
  • Community-Driven Development: Allganize emphasizes a community-based approach, drawing inspiration from various communities and committing to sharing insights on data, methods, and models.

Benchmark Performance

Alpha-Instruct shows competitive results on Korean benchmarks:

  • LogicKor: Achieved an overall score of 6.62, outperforming other 8B models like MLP-KTLim/llama-3-Korean-Bllossom-8B and Alpha-Ko-Evo.
  • KoBEST: Demonstrates strong performance across various tasks, including kobest_boolq (0.8369) and kobest_sentineg (0.9244), indicating robust understanding and generation capabilities in Korean.

Ideal Use Cases

This model is particularly well-suited for applications requiring:

  • Complex Korean and English language understanding and generation.
  • Tasks demanding strong logical reasoning and problem-solving.
  • Scenarios where efficiency (8B parameters) is crucial without sacrificing high-quality output.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p