teknium/Llama-3.1-AlternateTokenizer

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Jul 23, 2024 · License: llama3.1 · Architecture: Transformer

teknium/Llama-3.1-AlternateTokenizer is an 8-billion-parameter instruction-tuned large language model based on Meta's Llama 3.1 collection. It uses an optimized transformer architecture with Grouped-Query Attention (GQA) and supports a 128k-token context length. The model is optimized for multilingual dialogue, excelling at assistant-style chat and code generation across its supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.


Overview

This model is the 8-billion-parameter instruction-tuned variant from Meta's Llama 3.1 collection, designed for multilingual text-in/text-out generative tasks. It leverages an optimized transformer architecture with Grouped-Query Attention (GQA) and an extended context length of 128k tokens. The model was trained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of December 2023, and aligned with human preferences through supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).

Key Capabilities

  • Multilingual Support: Optimized for dialogue in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, with potential for other languages through fine-tuning.
  • Extended Context Window: Features a 128k token context length, enabling processing of longer inputs and generating more extensive responses.
  • Instruction Following: Instruction-tuned for assistant-like chat and various natural language generation tasks.
  • Code Generation: Demonstrates strong performance in code generation benchmarks like HumanEval (72.6 pass@1 for 8B Instruct).
  • Tool Use: Shows significant improvements in tool-use benchmarks such as API-Bank (82.6 acc for 8B Instruct).
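Instruction-tuned Llama 3.1 models expect their input rendered in the Llama 3 chat template, with special header and end-of-turn tokens around each message. As a minimal sketch, the formatting can be done by hand as below; in practice, `tokenizer.apply_chat_template` from Hugging Face `transformers` handles this for you.

```python
# Minimal sketch of the Llama 3.1 chat prompt format, using the standard
# Llama 3 special tokens. In real code, prefer tokenizer.apply_chat_template.
def format_llama31_prompt(messages):
    """Render a list of {role, content} dicts into a Llama 3.1 prompt string."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n")
        parts.append(msg["content"].strip())
        parts.append("<|eot_id|>")
    # Open the assistant header so the model generates the next turn.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


prompt = format_llama31_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Say hello in French."},
])
```

The trailing open assistant header is what cues the model to produce the reply rather than continue the user's message.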

Good For

  • Multilingual Chatbots: Ideal for building conversational AI agents that operate across multiple languages.
  • Code Assistants: Suitable for applications requiring code generation and understanding.
  • Research and Commercial Use: Intended for a broad range of commercial and research applications, including synthetic data generation and model distillation.
  • Long-Context Applications: Beneficial for tasks requiring processing or generating extensive text, thanks to its 128k context window.
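For the commercial and research uses above, hosted deployments of this model (including Featherless) are typically reached through an OpenAI-compatible chat-completions API. The sketch below builds such a request payload; the endpoint URL and API key are placeholder assumptions, not real values.

```python
import json

# Sketch of an OpenAI-compatible chat-completions payload for this model.
# API_URL and API_KEY are hypothetical placeholders, not real credentials.
API_URL = "https://example-host/v1/chat/completions"  # hypothetical endpoint
API_KEY = "YOUR_API_KEY"  # placeholder

payload = {
    "model": "teknium/Llama-3.1-AlternateTokenizer",
    "messages": [
        {"role": "system", "content": "You are a multilingual assistant."},
        {"role": "user", "content": "Summarize this paragraph in German."},
    ],
    "max_tokens": 512,
    "temperature": 0.7,
}
body = json.dumps(payload)

# An actual request would POST `body` to API_URL with an
# "Authorization: Bearer <key>" header, e.g. via the `requests` library.
```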

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model, covering the following sampler settings:

  • temperature: Scales the output distribution; lower values make generation more deterministic, higher values more varied.
  • top_p: Nucleus sampling; restricts sampling to the smallest set of tokens whose cumulative probability exceeds p.
  • top_k: Samples only from the k most probable tokens.
  • frequency_penalty: Penalizes tokens in proportion to how often they have already appeared.
  • presence_penalty: Applies a flat penalty to any token that has appeared at least once.
  • repetition_penalty: Multiplicative penalty on previously generated tokens.
  • min_p: Discards tokens whose probability falls below a set fraction of the most likely token's probability.
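As a hedged illustration of how these settings fit together, the dict below shows a plausible sampler configuration. The specific values are hypothetical, not the actual top user configs, which were displayed interactively on the original page.

```python
# Hypothetical sampler settings; the values are illustrative only, not the
# actual Featherless top-3 configs for this model.
sampler_settings = {
    "temperature": 0.7,         # soften the distribution slightly
    "top_p": 0.9,               # nucleus sampling cutoff
    "top_k": 40,                # consider only the 40 most likely tokens
    "frequency_penalty": 0.0,   # no count-based repetition penalty
    "presence_penalty": 0.0,    # no flat penalty for already-seen tokens
    "repetition_penalty": 1.1,  # mild multiplicative repetition penalty
    "min_p": 0.05,              # drop tokens below 5% of the top token's prob
}
```

Such a dict would typically be merged into the body of a chat-completions request alongside the model name and messages.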