winglian/Llama-3-8b-64k-PoSE

Hugging Face
Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Ctx length: 8K · Published: Apr 24, 2024 · Architecture: Transformer

winglian/Llama-3-8b-64k-PoSE is an 8 billion parameter Llama 3 model that extends the original 8K context length to 64K using PoSE (Positional Skip-wisE) training. The model was continually pre-trained on 300 million tokens from the RedPajama V1 dataset, focusing on samples between 6K and 8K tokens long. It is intended for commercial and research use, particularly for tasks that need a significantly larger context window than the base Llama 3 model offers.


Llama 3 8B 64K PoSE: Extended Context Language Model

This model, developed by winglian, is an 8 billion parameter variant of Meta's Llama 3, engineered to overcome the original 8K token context length limitation. It leverages the PoSE (Positional Skip-wisE) training method to reach a 64K context window with rope_theta set to 500000.0, with potential for further extension to 2M.
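The role of rope_theta can be sketched with the rotary-embedding frequency formula: raising the base stretches the rotation wavelengths, so distant positions stay distinguishable. The snippet below is illustrative, not this model's code; 10000.0 is the base used by earlier RoPE models and is shown only for comparison, while 500000.0 is the value stated above (head_dim=128 is standard for Llama 3 8B).

```python
import math

def rope_inverse_frequencies(head_dim, theta):
    """Per-dimension-pair rotation frequencies used by rotary position
    embeddings: theta^(-2i/d) for each pair i of the head dimension."""
    return [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]

# Conventional RoPE base vs. the extended base on this model card.
base_freqs = rope_inverse_frequencies(128, 10000.0)
extended_freqs = rope_inverse_frequencies(128, 500000.0)

# The lowest frequency sets the longest resolvable wavelength, i.e. the
# largest relative distance the embedding can still tell apart.
longest_wavelength = 2 * math.pi / extended_freqs[-1]
```

With the larger base, the slowest-rotating dimensions complete a full cycle only after millions of positions, which is why a higher rope_theta is paired with context extension.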

Key Capabilities & Features

  • Extended Context Window: Significantly increases the effective context length from Llama 3's native 8K to 64K, enabling processing of much longer documents and conversations.
  • Continued Pre-training: Continually pre-trained on 300 million tokens from the RedPajama V1 dataset, focusing on long samples (6K-8K tokens) so the model adapts to the extended context.
  • Llama 3 Foundation: Inherits the architecture and general language understanding of Meta's Llama 3 8B; the instruction-tuned releases of that family are optimized for dialogue and outperform many open-source chat models on common benchmarks.
  • Instruction-Tuned Variants: The base Llama 3 models are available in instruction-tuned versions, optimized for assistant-like chat, while pretrained models can be adapted for various natural language generation tasks.
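The core idea behind PoSE can be illustrated with a toy sketch: a short training sequence is split into chunks that keep consecutive position ids internally, but later chunks are shifted by a random skip, so the model sees relative distances spanning the full target window while only attending over the short sequence. This is a simplified illustration of the technique, not this model's actual training code.

```python
import random

def pose_position_ids(train_len, target_len, seed=0):
    """Toy PoSE-style position ids for a two-chunk split (assumption:
    simplified from the PoSE method, not winglian's exact recipe).

    Chunk 0 keeps positions 0..chunk_len-1; chunk 1 is shifted by a
    randomly sampled skip, so relative distances up to target_len
    appear during training on only train_len tokens."""
    rng = random.Random(seed)
    chunk_len = train_len // 2
    skip = rng.randint(0, target_len - train_len)  # skip budget at the boundary
    first = list(range(chunk_len))
    second = [chunk_len + skip + j for j in range(train_len - chunk_len)]
    return first + second

# E.g. simulate a 64-position window while training on 8 tokens.
ids = pose_position_ids(train_len=8, target_len=64, seed=42)
```

Resampling the skip every step exposes the model to many different long-range distances, which is what lets a 6K-8K training distribution generalize to a 64K inference window.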

Good for Use Cases

  • Long Document Analysis: Ideal for tasks requiring comprehension and generation based on extensive texts, such as legal documents, research papers, or large codebases.
  • Extended Conversations: Suitable for chatbots or virtual assistants that need to maintain coherence and context over very long dialogue turns.
  • Research and Commercial Applications: Intended for both research and commercial use in English, offering a powerful foundation for various NLP applications requiring deep contextual understanding.
  • Fine-tuning for Specific Domains: Developers can fine-tune this model for specialized applications that benefit from its large context window, adhering to the Llama 3 Community License.

Popular Sampler Settings

Featherless users most commonly tune the following sampling parameters for this model:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
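Two of these parameters, top_p and min_p, can be sketched as simple filters over the next-token probability distribution. These are illustrative reference implementations, not Featherless's serving code.

```python
def top_p_filter(probs, top_p):
    """Nucleus (top_p) sampling: keep the smallest set of tokens whose
    cumulative probability reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = set(), 0.0
    for i in order:
        kept.add(i)
        cum += probs[i]
        if cum >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return [probs[i] / total if i in kept else 0.0 for i in range(len(probs))]

def min_p_filter(probs, min_p):
    """min_p sampling: drop tokens whose probability falls below min_p
    times the top token's probability, then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

probs = [0.5, 0.3, 0.15, 0.05]
nucleus = top_p_filter(probs, 0.75)  # the two least likely tokens are zeroed out
minp = min_p_filter(probs, 0.2)      # only the 0.05 token falls below 0.2 * 0.5
```

top_p keeps a fixed probability mass regardless of distribution shape, while min_p adapts its cutoff to the model's confidence in the top token; the two are often tuned together with temperature.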