Poro 2 70B Base Model Overview
LumiOpen/Llama-Poro-2-70B-base is a 70.55-billion-parameter decoder-only transformer built on the Llama 3.1 70B architecture. Developed in a collaboration between AMD Silo AI, the TurkuNLP group, and HPLT, and trained on the LUMI supercomputer, its key contribution is continued pretraining that efficiently adds Finnish language capabilities to an existing strong base model.
Key Capabilities and Features
- Multilingual Proficiency: Substantially outperforms the base Llama 3.1 70B model in Finnish benchmarks while maintaining or slightly improving English performance across various tasks.
- Balanced Training Data: Continued pretraining used 165 billion tokens with a deliberate mix of Finnish (30%), English (30%), code (30%), and math (10%) data, preserving broad utility while adding Finnish.
- Enhanced Code Performance: Demonstrates improved performance in code generation, with a HumanEval pass@10 score of 71.34, surpassing Llama 3.1 70B.
- Translation Improvements: Shows significant gains in both English-to-Finnish and Finnish-to-English translation metrics (BLEU and chrF scores).
- Open Source: Released under the Llama 3.1 Community License, promoting accessibility and further development.
Good for
- Finnish Language Applications: Ideal for tasks requiring strong Finnish language understanding and generation.
- Multilingual Systems: Suitable for applications that need to operate effectively in both English and Finnish.
- Research and Development: Serves as a robust base model for further fine-tuning and experimentation in multilingual LLM development.
- Code Generation: Can be utilized for code-related tasks due to its enhanced code performance.
Note: This is a base model and typically requires further fine-tuning for specific end-user applications.
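As a completion-style base model, it is used by continuing raw text rather than via a chat template. Below is a minimal sketch of loading it with Hugging Face transformers; the generation settings are illustrative choices, not recommendations from the model card, and running it requires substantial GPU memory (roughly 140 GB of weights in bfloat16).

```python
# Illustrative sketch: load the base model and continue a text prompt.
# Requires the `transformers`, `torch`, and `accelerate` packages and a
# multi-GPU node; all sampling parameters here are assumptions.

MODEL_ID = "LumiOpen/Llama-Poro-2-70B-base"

def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a plain-text continuation (no chat template: base model)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",  # shard layers across available GPUs
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    # A Finnish prompt; the base model simply continues the text.
    print(complete("Suomen pääkaupunki on"))
```

For production use, fine-tuning (or the instruct variant of the model family, if available) is the expected path, per the note above.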