Voicelab/trurl-2-7b

Text Generation · Model Size: 7B · Quant: FP8 · Context Length: 4k · Concurrency Cost: 1 · Architecture: Transformer · Published: Aug 16, 2023

Voicelab/trurl-2-7b is a 7 billion parameter, Llama 2-based auto-regressive language model developed by Voicelab.AI and fine-tuned on over 1.7 billion tokens of Polish and English conversational data. With a context length of 4096 tokens, it is optimized for dialogue use cases and natural language generation in both languages, excelling at assistant-like chat and drawing on a diverse training mix that includes private Voicelab datasets for specialized conversational scenarios.


Voicelab/trurl-2-7b: A Bilingual Llama 2 for Dialogue

Voicelab/trurl-2-7b is a 7 billion parameter model from the Trurl 2 series, developed by Voicelab.AI. It is a fine-tuned Llama 2 variant, trained on over 1.7 billion tokens drawn from 970,000 conversational samples in Polish and English. Its 4096-token context length makes it suitable for handling longer dialogue turns.
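For orientation, loading the checkpoint with the Hugging Face `transformers` library might look like the sketch below. The helper name `generate_reply`, the `device_map="auto"` placement, and the truncation-to-context behavior are illustrative assumptions, not requirements stated on this card; the model weights themselves are several gigabytes, so the function defers loading until first use.

```python
# Sketch: generating text with Voicelab/trurl-2-7b via transformers.
# Assumes a standard Llama-2-style checkpoint; downloading and loading the
# ~7B-parameter weights happens on the first call, not at import time.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Voicelab/trurl-2-7b"
MAX_CONTEXT = 4096  # context window stated on the model card


def generate_reply(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion, truncating the prompt to the 4096-token window."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(
        prompt, return_tensors="pt", truncation=True, max_length=MAX_CONTEXT
    ).to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A caller would then invoke `generate_reply("Dzień dobry! Jak się masz?")`; keeping the load inside the function is a simplification, and a real application would cache the model and tokenizer across calls.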

Key Capabilities

  • Bilingual Proficiency: Optimized for natural language generation and dialogue in both Polish and English.
  • Dialogue Optimization: Specifically fine-tuned for assistant-like chat applications, leveraging a diverse mix of conversational data.
  • Robust Training Data: Trained on a unique blend of private and publicly available online data, including Q&A pairs from sources like Alpaca, Falcon, Dolly 15k, Oasst1, ShareGPT, and specialized Voicelab datasets for JSON extraction, sales conversations, and corrected dialogues.
  • Transformer Architecture: Utilizes an optimized transformer architecture for auto-regressive language modeling.
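Because Trurl 2 is fine-tuned from Llama 2, chat prompts presumably follow the Llama 2 `[INST]`/`<<SYS>>` convention; that template is an assumption carried over from the base model rather than something this card confirms. A minimal prompt builder under that assumption:

```python
# Sketch of a Llama 2 chat-style prompt builder (assumed format for Trurl 2,
# inherited from the Llama 2 base model; verify against the upstream card).
def build_prompt(user_message: str, system_prompt: str = "") -> str:
    """Wrap a user turn (and optional system prompt) in Llama 2 chat markup."""
    sys_block = f"<<SYS>>\n{system_prompt}\n<</SYS>>\n\n" if system_prompt else ""
    return f"[INST] {sys_block}{user_message} [/INST]"


print(build_prompt("Dzień dobry! Jak się masz?", "Odpowiadaj po polsku."))
```

The same builder works for either language; only the message text changes between Polish and English turns.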

Good For

  • Assistant-like Chatbots: Ideal for building conversational agents that require strong performance in Polish and English.
  • Natural Language Generation: Suitable for various text generation tasks beyond simple Q&A.
  • Commercial and Research Use: Intended for broad application in both commercial products and academic research, particularly where bilingual capabilities are crucial.

Evaluation results show the 7B model achieving 75.29% on HellaSwag (10-shot), 53.41% on ARC (25-shot), and 50.0% on MMLU (5-shot), demonstrating its general language understanding and reasoning abilities.