winglian/llama-3-8b-256k-PoSE

Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Ctx length: 8K · Published: Apr 26, 2024 · Architecture: Transformer

The winglian/llama-3-8b-256k-PoSE model is an 8 billion parameter Llama 3 variant that uses PoSE (Positional Skip-wisE training) to extend its context length from 8K to 256K tokens. Developed by winglian, it builds on a 64K-context model with additional pretraining on 75 million tokens from SlimPajama. It is intended for applications that require a significantly extended context window, enabling the model to process much longer inputs and produce more coherent, context-aware outputs.


Overview

This model, winglian/llama-3-8b-256k-PoSE, is an 8 billion parameter Llama 3-based language model with a significantly extended context window. It uses the PoSE (Positional Skip-wisE training) technique to expand the original Llama 3 8K context length to 256K tokens.

Key Capabilities

  • Extended Context Window: Achieves a 256K token context length, a substantial increase over the base Llama 3 8B model, through PoSE and continued pretraining.
  • Llama 3 Foundation: Inherits the robust architecture and general language understanding capabilities of the Meta Llama 3 8B model.
  • Continued Pretraining: Enhanced with 75 million tokens of continued pretraining data from SlimPajama, building on a 64K context model.
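The core idea behind PoSE is that the model is fine-tuned on sequences no longer than its original context window, but the position ids of those tokens are shifted by random skips so that, across training examples, they cover the full target range (here, up to 256K). The sketch below illustrates this position-id sampling; it is a simplified illustration of the published PoSE technique, not winglian's actual training code, and the function name, two-chunk split, and skip sampling are illustrative assumptions.

```python
import random

def pose_position_ids(train_len, target_len, rng=None):
    """Minimal sketch of PoSE (Positional Skip-wisE) position-id sampling.

    Each training example still contains only `train_len` tokens, but their
    position ids are offset by random skips so that positions span the full
    [0, target_len) range over the course of training.
    """
    rng = rng or random.Random()

    # Split the short training window into two contiguous chunks
    # (the paper's simplest configuration).
    cut = rng.randint(1, train_len - 1)
    chunk_lens = [cut, train_len - cut]

    positions = []
    start = 0                          # next position id in the target window
    budget = target_len - train_len    # total skip room available
    for length in chunk_lens:
        skip = rng.randint(0, budget)  # random skip inserted before this chunk
        budget -= skip
        start += skip
        positions.extend(range(start, start + length))
        start += length
    return positions
```

Because the skips never exceed `target_len - train_len` in total, the sampled ids always stay below the 256K target while the model only ever attends over an 8K-token window, which is what makes the extension cheap relative to full long-context pretraining.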

Good For

  • Applications requiring processing and understanding of very long documents, codebases, or conversations.
  • Tasks where maintaining context over extensive text is crucial, such as summarization of large texts, long-form content generation, or complex question-answering over vast information.
  • Research and development into extreme context length capabilities for large language models.