NousResearch/Yarn-Mistral-7b-64k

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 8k · Published: Oct 31, 2023 · License: apache-2.0 · Architecture: Transformer · Open Weights

NousResearch/Yarn-Mistral-7b-64k is a 7 billion parameter language model developed by NousResearch, based on the Mistral-7B-v0.1 architecture. It has been further pretrained using the YaRN extension method to support an extended context window of 64k tokens. This model is specifically optimized for long-context tasks, demonstrating improved perplexity on longer sequences while maintaining strong performance on short-context benchmarks.


Overview

Nous-Yarn-Mistral-7b-64k is a 7 billion parameter language model from NousResearch, built upon the Mistral-7B-v0.1 base model. Its primary distinguishing feature is a context window extended to 64k tokens, achieved through further pretraining with YaRN (Yet another RoPE extensioN method). This allows the model to process and understand much longer inputs and to generate coherent, contextually relevant outputs over extended sequences.
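To make the RoPE-scaling idea concrete, here is a simplified, self-contained sketch of the "NTK-by-parts" frequency interpolation at the heart of YaRN. It is an illustration, not the model's actual implementation: the default values (`base=10000`, `orig_ctx=8192`, the `beta` cutoffs) are assumptions, and it omits YaRN's attention-temperature adjustment. High-frequency RoPE dimensions are left untouched, low-frequency dimensions are fully interpolated by the scale factor (8x for an 8k-to-64k extension), and dimensions in between are blended with a linear ramp.

```python
import math

def rope_frequencies(dim, base=10000.0):
    # Standard RoPE inverse frequencies: theta_i = base^(-2i/dim)
    return [base ** (-2 * i / dim) for i in range(dim // 2)]

def yarn_scaled_frequencies(dim, scale=8.0, base=10000.0,
                            orig_ctx=8192, beta_fast=32, beta_slow=1):
    # Simplified NTK-by-parts interpolation (the idea behind YaRN).
    # `beta_fast`/`beta_slow` are illustrative cutoffs on how many full
    # rotations a dimension completes within the original context.
    out = []
    for f in rope_frequencies(dim, base):
        wavelength = 2 * math.pi / f
        rotations = orig_ctx / wavelength
        if rotations > beta_fast:
            out.append(f)                 # high frequency: keep as-is
        elif rotations < beta_slow:
            out.append(f / scale)         # low frequency: fully interpolate
        else:                             # in between: blend linearly
            t = (rotations - beta_slow) / (beta_fast - beta_slow)
            out.append(t * f + (1 - t) * (f / scale))
    return out
```

With this scheme the positional encoding for long-range (low-frequency) dimensions is stretched to cover 64k positions, while the fine-grained local dimensions that short-context quality depends on are preserved, which is why short-context benchmarks degrade only minimally.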

Key Capabilities

  • Extended Context Window: Supports a 64k token context, making it suitable for tasks requiring processing of large documents, codebases, or lengthy conversations.
  • Strong Long-Context Performance: Benchmarks show improved perplexity (PPL) at 16k, 32k, and 64k token contexts compared to the base Mistral-7B-v0.1 model.
  • Minimal Short-Context Degradation: Performance on standard short-context benchmarks (ARC-c, HellaSwag, MMLU, TruthfulQA) remains largely comparable to the original Mistral-7B-v0.1, indicating that the long-context extension does not significantly compromise its general capabilities.

When to Use This Model

  • Processing lengthy documents: Ideal for summarization, question answering, or information extraction from long texts.
  • Code analysis and generation: Can handle larger code files or multiple related code snippets within a single context.
  • Extended conversational AI: Suitable for chatbots or agents that need to maintain context over very long interactions.
  • Applications requiring deep contextual understanding: Any task where the ability to reference distant information within a prompt is critical.
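When feeding long documents to the model, it still helps to check that the prompt fits the 64k window and to split anything larger. The helper below is a hypothetical sketch: the ~4-characters-per-token ratio is a rough English-text assumption, not a property of the model's tokenizer, so use the real tokenizer for an exact count.

```python
def fits_in_context(text, max_tokens=64_000, reserved_for_output=1_024,
                    chars_per_token=4.0):
    # Rough heuristic: ~4 characters per token for English text (assumption).
    # Reserve some of the window for the model's generated output.
    est_tokens = len(text) / chars_per_token
    return est_tokens + reserved_for_output <= max_tokens

def chunk_text(text, max_tokens=64_000, reserved_for_output=1_024,
               chars_per_token=4.0):
    # Split an over-long document into pieces that each fit the window.
    budget_chars = int((max_tokens - reserved_for_output) * chars_per_token)
    return [text[i:i + budget_chars] for i in range(0, len(text), budget_chars)]
```

For most documents under roughly 250k characters, a single call suffices; only truly massive inputs need the chunked fallback, which is the main practical advantage of the 64k window over the base model's 8k.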