Overview
Unbabel/Tower-Plus-72B is a 72.7-billion-parameter multilingual large language model developed by Unbabel, built on the Qwen 2.5 72B architecture. It was trained through Continuous Pretraining (CPT), Instruction Tuning (IT), and Weighted Preference Optimization (WPO) on extensive parallel and multilingual datasets covering 22 languages. This training gives the model strong capabilities in both specialized translation tasks and general instruction following, including reasoning and code instructions.
Key Capabilities
- Multilingual Translation: Highly proficient in translation-related tasks across 22 languages, as detailed in the paper Tower+: Bridging Generality and Translation Specialization in Multilingual LLMs.
- General Instruction Following: Capable of handling a broad range of general instructions, including reasoning and code-related queries.
- Multilingual Synthetic Data Generation: Effective for creating synthetic data in supported languages, either by translating instructions and answers or generating instructions from seed documents.
- Extended Context Window: Features a substantial context size of 131,072 tokens, with a recommended generation token limit of 8192.
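The capabilities above can be exercised through ordinary chat-style prompting. The sketch below builds a translation request in the common Hugging Face chat-message format; the prompt wording and the helper function are illustrative assumptions, not an official API for this model, while the model ID, context size, and recommended generation limit come from this card.

```python
# Minimal sketch of prompting Tower-Plus-72B for translation.
# The chat-message format follows common Hugging Face conventions;
# the prompt template itself is an assumption, not the official one.

MODEL_ID = "Unbabel/Tower-Plus-72B"  # model ID from this card
MAX_NEW_TOKENS = 8192                # recommended generation limit
CONTEXT_WINDOW = 131_072             # context size in tokens

def build_translation_messages(text, src_lang, tgt_lang):
    """Return OpenAI-style chat messages requesting a translation."""
    prompt = (
        f"Translate the following {src_lang} source text to {tgt_lang}.\n"
        f"{src_lang}: {text}\n"
        f"{tgt_lang}:"
    )
    return [{"role": "user", "content": prompt}]

messages = build_translation_messages(
    "O clima está ótimo hoje.", "Portuguese", "English"
)
print(messages[0]["content"])
```

In a full pipeline one would typically pass such messages through the tokenizer's chat template (e.g. `tokenizer.apply_chat_template`) and generate with `max_new_tokens=8192`, keeping the total prompt plus output within the 131,072-token context window.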
Intended Uses
- Translation Services: Ideal for applications requiring high-quality multilingual translation.
- Data Augmentation: Suitable for generating synthetic multilingual data to enhance training datasets.
- Multilingual NLP Research: A valuable tool for research into multilingual large language models and their applications.