abacusai/Llama-3-Giraffe-70B

TEXT GENERATION · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Context Length: 8k · Published: Apr 30, 2024 · License: llama3 · Architecture: Transformer

abacusai/Llama-3-Giraffe-70B is a 70 billion parameter language model developed by Abacus.AI, extending the Llama 3 70B architecture. It features an effective context length of approximately 128k tokens, achieved through PoSE and dynamic-NTK interpolation. This model is designed for applications requiring processing and understanding of very long contexts, making it suitable for tasks like document analysis and extended dialogue.


Llama-3-Giraffe-70B: Extended Context Llama 3 Variant

Abacus.AI presents Llama-3-Giraffe-70B, an adaptation of the Llama 3 70B model engineered for significantly extended context handling. This 70 billion parameter model boasts an effective context length of approximately 128,000 tokens, a substantial increase over the base Llama 3 model.

Key Capabilities & Training Methodology

  • Extended Context Window: Achieves an effective context length of ~128k tokens, enabling processing of much longer inputs and outputs.
  • Advanced Scaling Techniques: Utilizes PoSE (Positional Skip-wise Training) with 5 chunks and a max position ID of 32768, combined with dynamic-NTK interpolation (NTK scale factor of 4), to efficiently manage long contexts.
  • Training Data: Initial training involved ~1 billion tokens, using samples approximately 8K tokens long sourced from the RedPajama dataset.
  • Evaluation: Performance is assessed using the EasyContext implementation of Needle-in-a-Haystack, evaluating context lengths from 2,000 to 128,000 tokens.
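To make the two scaling techniques above concrete, here is a minimal sketch of both ideas: PoSE-style skip-wise assignment of position IDs (5 chunks, positions bounded by 32768, as described for this model) and NTK-aware adjustment of the RoPE base (scale factor 4). The function names and the exact chunking/offset scheme are illustrative assumptions, not the model's actual training code.

```python
import random

def pose_position_ids(seq_len, n_chunks=5, max_pos=32768, seed=0):
    """PoSE-style sketch: split a short training sequence into chunks and
    shift each chunk by a nondecreasing random offset, so position IDs up
    to max_pos are exercised without training on max_pos-long sequences.
    The chunking/offset details here are assumptions for illustration."""
    rng = random.Random(seed)
    slack = max_pos - seq_len
    # One nondecreasing random offset per chunk keeps IDs strictly increasing.
    offsets = sorted(rng.randint(0, slack) for _ in range(n_chunks))
    bounds = [seq_len * c // n_chunks for c in range(n_chunks + 1)]
    ids = []
    for c in range(n_chunks):
        for i in range(bounds[c], bounds[c + 1]):
            ids.append(i + offsets[c])
    return ids

def ntk_scaled_inv_freq(dim=128, base=10000.0, scale=4.0):
    """NTK-aware RoPE scaling sketch: enlarge the rotary base so low
    frequencies stretch to cover a longer context (scale factor 4 here,
    matching the model card; head dim 128 is an assumption)."""
    adjusted_base = base * scale ** (dim / (dim - 2))
    return [adjusted_base ** (-2 * i / dim) for i in range(dim // 2)]
```

The key property of the PoSE sketch is that a 4K-token training sample can carry position IDs anywhere in the 0 to 32767 range, while the NTK adjustment lets inference-time positions beyond the trained range map onto frequencies the model has already seen.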

Ideal Use Cases

  • Long Document Analysis: Suited for tasks requiring comprehension across extensive texts, such as legal documents, research papers, or literary works.
  • Extended Conversational AI: Capable of maintaining coherence and context over very long dialogues or multi-turn interactions.
  • Information Retrieval: Effective for extracting specific information from large bodies of text where context is critical.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model.

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
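As an illustration of how these sampler settings fit together, below is a hedged sketch of a request payload for an OpenAI-compatible completions endpoint. The parameter names match the list above; the numeric values are placeholders chosen for illustration, not the actual configurations used by Featherless users.

```python
# Illustrative sampler configuration; values are assumptions, not the
# real "top 3" Featherless user settings for this model.
sampler_settings = {
    "temperature": 0.7,         # softens/sharpens the token distribution
    "top_p": 0.9,               # nucleus sampling cutoff
    "top_k": 40,                # restrict to the k most likely tokens
    "frequency_penalty": 0.0,   # penalize tokens by occurrence count
    "presence_penalty": 0.0,    # penalize tokens that appeared at all
    "repetition_penalty": 1.1,  # multiplicative repeat discouragement
    "min_p": 0.05,              # drop tokens below this fraction of the top prob
}
```

In practice such a dictionary would be merged into the JSON body of a generation request alongside the model name and prompt.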