Voicelab/trurl-2-13b: A Polish-English Llama 2 Dialogue Model
Voicelab/trurl-2-13b is a 13 billion parameter large language model developed by Voicelab.AI, built on the Llama 2 architecture. It is part of the Trurl 2 collection, which also includes a 7B parameter variant. Its distinguishing feature is fine-tuning on over 1.7 billion tokens drawn from roughly 970,000 conversational samples in Polish and English.
Key Capabilities & Features
- Bilingual Proficiency: Optimized for understanding and generating text in both Polish and English, making it suitable for diverse linguistic applications.
- Dialogue Optimization: Specifically fine-tuned for assistant-like chat and conversational use cases, leveraging a large corpus of Q&A pairs and dialogue data.
- Extended Context Window: Features a context length of 4096 tokens, allowing for more coherent and contextually aware responses in longer interactions.
- Diverse Training Data: Trained on a mix of private and publicly available online data, including Alpaca, Falcon, Dolly 15k, Oasst1, ShareGPT, and Voicelab's proprietary datasets for JSON extraction, sales conversations, and corrected dialogues.
- Improved MMLU Performance: The 13B version, trained with MMLU data, scores 78.35% on MMLU, compared with 54.64% for Llama-2-chat 13B.
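The 4096-token context window above still bounds how much dialogue history can be sent per request, so long conversations must be trimmed before generation. The sketch below is illustrative only and not part of the model's API: it uses whitespace word count as a crude stand-in for real tokenization, and the `trim_history` helper and `reserve` budget are assumptions for the example.

```python
def trim_history(turns, max_tokens=4096, reserve=512):
    """Drop the oldest turns until the (approximate) token count fits.

    Word count is a rough proxy for real tokenization; `reserve`
    leaves room in the context window for the model's reply.
    """
    budget = max_tokens - reserve
    kept, total = [], 0
    for turn in reversed(turns):  # keep the most recent turns first
        cost = len(turn.split())
        if total + cost > budget:
            break
        kept.append(turn)
        total += cost
    return list(reversed(kept))

# An oversized old turn is dropped; recent turns are kept in order.
history = ["hello " * 4000, "short question", "short answer"]
trimmed = trim_history(history)
```

In production you would count tokens with the model's own tokenizer rather than splitting on whitespace, since Polish text in particular can tokenize into many more subwords than words.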
Intended Use Cases
Trurl 2 is designed for commercial and research applications requiring strong performance in both Polish and English. Its primary strength lies in assistant-like chat and various natural language generation tasks. Developers should adhere to the Llama 2 conversation template for optimal performance in chat modes.
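The Llama 2 conversation template mentioned above wraps the system prompt in `<<SYS>>` markers and the user turn in `[INST]` tags. A minimal sketch of single-turn prompt construction follows; the `build_llama2_prompt` helper is a hypothetical name introduced here, not part of the model's release.

```python
def build_llama2_prompt(system_prompt, user_message):
    """Format a single-turn prompt using the Llama 2 chat template.

    Layout: <s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]
    """
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = build_llama2_prompt(
    "You are a helpful bilingual assistant. Answer in the user's language.",
    "Czym jest model Trurl 2?",
)
```

The resulting string can be tokenized and passed to the model as-is; deviating from this layout typically degrades chat quality, since the fine-tuning data used exactly this format.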
Ethical Considerations
As with all LLMs, Trurl 2 carries inherent risks, including the potential for inaccurate, biased, or objectionable outputs. Users are advised to conduct thorough safety testing and tuning for specific applications, referencing Meta's Responsible Use Guide.