Hypa-Llama3.1 8B: Multilingual & Tool-Aware Fine-Tune

Hypa-Llama3.1 8B is an 8.7 billion parameter model from Hypa Intelligence, built on Meta's Llama 3.1 8B. It is a LoRA-merged supervised fine-tune that inherits capabilities from prior Hypa-Llama checkpoints and adds new prompt families on top of them. The model focuses on multilingual support for 17 languages: English, French, Spanish, and 14 low-resource Nigerian languages (e.g., Annang, Ebira, Idoma, Igbo, Yoruba), many of which are underrepresented in large-scale fine-tuning corpora.

Key Capabilities

Multilingual Translation: Specializes in translation between English/French/Spanish and the 14 covered low-resource languages.
Language Detection: Accurately identifies all 17 supported languages.
Dictionary-style Explanations: Provides lexical lookups and explanations, supporting both Markdown and strict JSON output modes for programmatic use.
Tool-Awareness: Incorporates tool-calling-style prompting, inheriting Llama 3.1's native structure.
Reasoning Channel: Features an explicit reasoning channel for translation correction and breakdown, emitting a <think>...</think> block before the final answer.
Instruction Following: Excels at multilingual instruction-following for dialogue tasks.
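
Because the reasoning channel places a <think>...</think> block ahead of the final answer, downstream code usually needs to separate the two. The following is a minimal sketch of one way to do that; the function name split_reasoning and the sample string are hypothetical, and the sketch assumes at most one reasoning block at the start of the reply.

```python
import re

# Matches the first <think>...</think> block; DOTALL lets it span newlines.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from raw model output.

    If no <think> block is present, reasoning is an empty string and
    the whole text is treated as the answer.
    """
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer


# Hypothetical model output, for illustration only.
sample = "<think>Check the verb tense first.</think>\nBonjour, comment allez-vous ?"
reasoning, answer = split_reasoning(sample)
```

Keeping the reasoning text separate makes it easy to log it for debugging while showing only the answer to end users.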

Training and Performance

The model was trained on 17.0 million examples across multilingual instruction sub-datasets, using LoRA (r=256, α=256) via Unsloth and QLoRA. Training was stable, ending with a training loss of 0.213 and an evaluation loss of 0.330. Qualitative observations show meaningful improvements over the base Llama 3.1 8B-Instruct, particularly for the smallest languages, where the base model was largely unusable. The training context window was 2,048 tokens, although the model config advertises a 128K-token context.
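
To make the LoRA settings concrete, the sketch below works out the scaling factor and the number of adapter parameters for a single weight matrix. It assumes the standard LoRA scaling of α/r and a square 4096 x 4096 projection (the hidden size of Llama 3.1 8B); which modules were actually adapted is not stated here, so the matrix shape is illustrative only.

```python
def lora_scaling(alpha: float, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return alpha / r


def lora_adapter_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


R, ALPHA = 256, 256      # values from the training description
D_IN = D_OUT = 4096      # assumption: one square 4096 x 4096 projection

scale = lora_scaling(ALPHA, R)
added = lora_adapter_params(D_IN, D_OUT, R)
full = D_IN * D_OUT

print(f"scaling = {scale}")
print(f"adapter params = {added:,} ({added / full:.1%} of the full matrix)")
```

With r equal to α the update is applied unscaled, and at rank 256 each adapter pair under this assumption is one eighth the size of the matrix it modifies, which is large by typical LoRA standards.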

Good For

Applications requiring high-quality translation for low-resource languages.
Developing multilingual chatbots or agents that need to understand and generate content in diverse languages.
Tasks involving structured data output (e.g., JSON) for dictionary lookups or programmatic interactions.
As a starting point for further fine-tuning on specialized tasks within the supported languages, or for adapter stacking.
Replacing meta-llama/Llama-3.1-8B-Instruct in pipelines needing improved low-resource language quality.
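
For pipelines that consume the strict JSON output mode, it helps to validate each reply before passing it on. The sketch below shows one possible check; the key names in REQUIRED_KEYS and the sample reply are hypothetical, since the actual JSON schema is not documented in this summary.

```python
import json

# Hypothetical schema: the real key names are not documented in this summary.
REQUIRED_KEYS = {"word", "language", "definition"}


def parse_dictionary_entry(raw: str) -> dict:
    """Parse a strict-JSON reply and check that the expected keys are present."""
    try:
        entry = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(entry, dict):
        raise ValueError("reply must be a JSON object")
    missing = REQUIRED_KEYS - entry.keys()
    if missing:
        raise ValueError(f"reply is missing keys: {sorted(missing)}")
    return entry


# Illustrative reply only, not actual model output.
reply = '{"word": "omi", "language": "Yoruba", "definition": "water"}'
entry = parse_dictionary_entry(reply)
```

Raising ValueError on malformed replies lets the caller decide whether to retry the request or fall back to the Markdown output mode.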
