beomi/Solar-Ko-Recovery-11B

Text Generation | Concurrency Cost: 1 | Model Size: 15B | Quant: FP8 | Ctx Length: 8k | Tool Calling: Supported | Published: May 3, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights

Solar-Ko-Recovery-11B is an 11 billion parameter auto-regressive language model developed by Junbum Lee (Beomi), based on an optimized Llama-2 derived transformer architecture. This model is specifically designed to enhance the Korean language capabilities of the Solar model family through re-arranged embeddings, an expanded vocabulary, and training on a Korean+English corpus. It excels in Korean language processing, demonstrated by significantly improved tokenization efficiency for Korean text and strong performance on Korean benchmarks.

Solar-Ko-Recovery-11B: Enhanced Korean Language Model

Solar-Ko-Recovery-11B is an 11 billion parameter auto-regressive language model developed by Junbum Lee (Beomi). It is built upon an optimized transformer architecture derived from Llama-2, with a primary goal of significantly improving the Korean language capabilities of the original Solar model.

Key Enhancements & Capabilities

  • Korean Language Recovery: The model was specifically trained to "recover" Solar's performance on Korean by re-arranging embeddings and the LM head.
  • Expanded Vocabulary: The vocabulary grows to 64,000 tokens, up from 32,000 in the original Solar, and includes additional Korean and Japanese tokens.
  • Efficient Korean Tokenization: Korean text tokenizes into far fewer tokens. For example, a common Korean phrase tokenizes into 7 tokens with Solar-Ko-Recovery compared to 26 tokens with SOLAR-10.7B (about 73% fewer), so more Korean text fits in the same context window and inference for Korean is potentially faster.
  • Dual-Language Training: Trained on a curated mix of Korean and English corpora to ensure enhanced representation in both languages.
  • Benchmark Performance: Achieves strong results on Korean-specific benchmarks, including haerae (0.7874 acc_norm), kmmlu_direct (0.4205 exact_match), and various KoBEST tasks (e.g., kobest_boolq 0.9202 acc).
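
The tokenization comparison above can be reproduced roughly as follows. This is a minimal sketch, not the author's evaluation script: it assumes the Hugging Face transformers library is installed, that the baseline tokenizer lives at the repo id upstage/SOLAR-10.7B-v1.0 (an assumption, the card only says "SOLAR-10.7B"), and that "8k" means 8192 tokens. The sample sentence is a placeholder, not necessarily the phrase behind the 7 vs. 26 figures, and the helper names are hypothetical.

```python
def token_count(tokenizer, text: str) -> int:
    """Count tokens for `text`, excluding special tokens such as BOS/EOS."""
    return len(tokenizer.encode(text, add_special_tokens=False))


def token_reduction(baseline_tokens: int, new_tokens: int) -> float:
    """Fraction of tokens saved relative to the baseline tokenizer."""
    return 1.0 - new_tokens / baseline_tokens


def phrases_per_context(ctx_len: int, tokens_per_phrase: int) -> int:
    """How many whole phrases of a given token length fit in the context."""
    return ctx_len // tokens_per_phrase


def compare_tokenizers(text: str, repo_ids: list[str]) -> dict[str, int]:
    """Load each tokenizer from the Hub and count tokens for `text`."""
    # Imported lazily so the pure helpers above work without transformers.
    from transformers import AutoTokenizer

    return {
        repo: token_count(AutoTokenizer.from_pretrained(repo), text)
        for repo in repo_ids
    }


if __name__ == "__main__":
    # Placeholder sentence (assumption); swap in your own Korean text.
    sample = "안녕하세요, 오늘은 날씨가 좋네요."
    repos = [
        "upstage/SOLAR-10.7B-v1.0",      # assumed baseline repo id
        "beomi/Solar-Ko-Recovery-11B",   # model described on this page
    ]
    for repo, n in compare_tokenizers(sample, repos).items():
        print(f"{repo}: {n} tokens")

    # Arithmetic on the figures quoted in the card (26 vs. 7 tokens).
    print(f"reduction: {token_reduction(26, 7):.1%}")
    print("phrases per 8192-token context:",
          phrases_per_context(8192, 26), "vs", phrases_per_context(8192, 7))
```

Counting without special tokens keeps the comparison about the vocabulary itself; actual counts depend on the text you pass in and will differ from the card's example.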

Good for

  • Applications requiring robust Korean language understanding and generation.
  • Use cases where efficient tokenization of Korean text is critical.
  • Developers looking for a performant 11B model with a strong focus on Korean language capabilities.