Unbabel/Tower-Plus-2B

2.6B parameters · BF16 · 8,192-token context · Released Jun 9, 2025
License: CC-BY-NC-SA-4.0 · Hosted on Hugging Face

Unbabel/Tower-Plus-2B is a 2.6-billion-parameter multilingual large language model built on Gemma 2 2B. Developed by Unbabel, it was trained with continuous pretraining, instruction tuning, and weighted preference optimization, incorporating parallel and multilingual data across 22 languages. The model is particularly strong at multilingual tasks, especially machine translation, and supports a context length of 8,192 tokens.

Overview

Unbabel/Tower-Plus-2B: Multilingual LLM for Translation and Beyond

Unbabel's Tower-Plus-2B is a 2.6-billion-parameter model built on the Gemma 2 2B architecture. Its training regimen comprises Continuous Pretraining (CPT), Instruction Tuning (IT), Weighted Preference Optimization (WPO), and GRPO with verifiable rewards. A key differentiator is its extensive training on parallel and multilingual data covering 22 languages, which places it among the strongest multilingual LLMs under 3 billion parameters.

Key Capabilities

  • Exceptional Multilingual Translation: Particularly strong at machine translation across all supported languages (see the quick-start sketch after this list).
  • General Multilingual Instruction Following: Capable of handling various instruction-following tasks in multiple languages, including reasoning and code instructions.
  • Multilingual Synthetic Data Generation: Effective for creating synthetic data by translating instructions and answers or generating instructions from seed documents.
  • Broad Language Support: Covers 22 languages including German, Spanish, French, Italian, Korean, Dutch, Russian, English, Portuguese (Portugal/Brazilian), Chinese (Simplified/Traditional), Czech, Ukrainian, Hindi, Icelandic, Japanese, Polish, Swedish, Hungarian, Romanian, Danish, Norwegian, and Finnish.
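As a quick start, the sketch below loads the model with Hugging Face transformers and requests a single English-to-Portuguese translation. It is a minimal sketch, assuming the standard chat-template workflow; the prompt wording is an illustrative assumption, so check the official model card for the recommended template and decoding settings.

```python
# Minimal translation quick start via the standard transformers
# chat-template workflow; the prompt wording below is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Unbabel/Tower-Plus-2B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{
    "role": "user",
    "content": (
        "Translate the following English source text to Portuguese (Portugal).\n"
        "English: All models were trained on data covering 22 languages.\n"
        "Portuguese (Portugal): "
    ),
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Greedy decoding: deterministic output is usually preferred for translation.
output = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```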

Good For

  • Developers requiring a compact yet powerful multilingual model for translation tasks.
  • Applications needing robust multilingual instruction following.
  • Generating high-quality multilingual synthetic data for various NLP tasks (a minimal sketch follows below).
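For the synthetic-data use case, one simple pattern is to translate an existing English instruction/answer pair into another supported language. The helper below is a hedged sketch, not an official recipe: the translate function, prompt phrasing, and seed pair are all illustrative assumptions.

```python
# Hedged sketch: translating an instruction/answer pair to create
# synthetic training data in another language. Prompts and helper
# names are illustrative assumptions, not an official recipe.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Unbabel/Tower-Plus-2B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

def translate(text: str, target_language: str) -> str:
    """Ask the model to translate `text` into `target_language`."""
    messages = [{
        "role": "user",
        "content": f"Translate the following English text to {target_language}:\n{text}",
    }]
    result = generator(messages, max_new_tokens=512, do_sample=False)
    # With chat-style input, the pipeline returns the full message list;
    # the last entry is the assistant's reply.
    return result[0]["generated_text"][-1]["content"]

seed = {
    "instruction": "Explain what a context window is in one sentence.",
    "answer": "A context window is the maximum number of tokens a model can process at once.",
}
german_pair = {key: translate(value, "German") for key, value in seed.items()}
print(german_pair)
```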

This model is licensed under CC-BY-NC-SA-4.0 and supports an 8,192-token context window.