NousResearch/Yarn-Mistral-7b-128k

TEXT GENERATION

  • Concurrency Cost: 1
  • Model Size: 7B
  • Quant: FP8
  • Ctx Length: 8k
  • Published: Oct 31, 2023
  • License: apache-2.0
  • Architecture: Transformer

NousResearch/Yarn-Mistral-7b-128k is a 7 billion parameter language model developed by NousResearch, extending the Mistral-7B-v0.1 architecture. It is further pretrained on long-context data using the YaRN extension method, enabling a 128k-token context window. The model is optimized for processing and understanding very long sequences of text while maintaining strong performance on short-context tasks.


Nous-Yarn-Mistral-7b-128k: Extended Context Mistral Model

Nous-Yarn-Mistral-7b-128k is a 7 billion parameter language model built upon the Mistral-7B-v0.1 architecture. Developed by NousResearch, this model has been further pretrained for 1500 steps using YaRN (Yet another RoPE extensioN method), significantly expanding its context window to 128,000 tokens.
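A minimal loading sketch with Hugging Face `transformers`. The `trust_remote_code` and dtype/device choices below are assumptions based on common practice for YaRN-extended checkpoints on the Hub, not settings confirmed by this card:

```python
MODEL_ID = "NousResearch/Yarn-Mistral-7b-128k"

def build_load_kwargs() -> dict:
    """Keyword arguments for AutoModelForCausalLM.from_pretrained.

    trust_remote_code=True is an assumption: YaRN-extended checkpoints
    often ship custom RoPE-scaling code on the Hub that older stock
    transformers releases do not include.
    """
    return {
        "torch_dtype": "auto",     # use the checkpoint's native precision
        "device_map": "auto",      # shard/offload across available devices
        "trust_remote_code": True,
    }

if __name__ == "__main__":
    # Imported here so the helper above stays importable without
    # transformers installed; actual loading downloads ~14 GB of weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, **build_load_kwargs())
```

Keeping the kwargs in a helper makes it easy to swap `device_map` for a single-GPU placement or pin an explicit dtype when memory is constrained.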

Key Capabilities & Features

  • Extended Context Window: Supports a 128k token context, making it suitable for applications requiring processing of very long documents or conversations.
  • Strong Long-Context Performance: Demonstrates competitive perplexity scores across various long context lengths (8k to 128k tokens), with a perplexity of 2.19 at 128k tokens.
  • Minimal Short-Context Degradation: Benchmarks indicate that extending the context window with YaRN results in only minimal degradation of performance on standard short-context tasks like ARC-c, HellaSwag, MMLU, and TruthfulQA, preserving much of the original Mistral-7B's capabilities.
  • Based on Mistral-7B-v0.1: Inherits the robust base performance and efficiency of the Mistral-7B architecture.
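The YaRN method behind these capabilities rescales RoPE frequencies per dimension rather than uniformly. The sketch below illustrates that "NTK-by-parts" interpolation; the scale factor 16 follows from 128k / 8k, while `alpha`/`beta` are the YaRN paper's defaults, not values read from this checkpoint, and this is an illustration rather than the model's actual code:

```python
import math

def yarn_rope_frequencies(head_dim=128, base=10000.0, orig_ctx=8192,
                          scale=16.0, alpha=1.0, beta=32.0):
    """Per-dimension RoPE frequency rescaling in the style of YaRN.

    Illustrative sketch only: alpha/beta are the YaRN paper's default
    ramp boundaries, not values taken from this checkpoint's config.
    """
    dims = head_dim // 2
    # Standard RoPE frequencies: theta_i = base^(-2i / head_dim)
    freqs = [base ** (-2.0 * i / head_dim) for i in range(dims)]

    def rotations(f):
        # Full rotations this frequency completes over the original context.
        return orig_ctx * f / (2.0 * math.pi)

    def ramp(r):
        # 0 -> fully interpolate (divide by scale); 1 -> leave unchanged.
        return min(1.0, max(0.0, (r - alpha) / (beta - alpha)))

    scaled = []
    for f in freqs:
        g = ramp(rotations(f))
        scaled.append((1.0 - g) * (f / scale) + g * f)
    return scaled
```

High-frequency dimensions (many rotations over the original 8k window) are left untouched so local token relationships survive, while low-frequency dimensions are compressed by the full scale factor; dimensions in between are blended. This selective scaling is what lets the extended model keep most of its short-context performance.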

Ideal Use Cases

  • Document Analysis: Processing and summarizing extensive reports, legal documents, or research papers.
  • Long-form Content Generation: Creating or understanding lengthy articles, books, or complex narratives.
  • Extended Chatbots/Conversational AI: Maintaining context over very long dialogues without losing coherence.
  • Code Analysis: Handling large codebases or complex programming projects where extensive context is beneficial.
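For document-analysis workloads like these, it helps to check up front whether an input plausibly fits the 128k window. A rough sketch, where the ~4 characters-per-token heuristic and the output reserve are assumptions (use the model's real tokenizer for an exact count):

```python
def fits_context(text: str, ctx_tokens: int = 128_000,
                 reserve_for_output: int = 2_000,
                 chars_per_token: float = 4.0) -> bool:
    """Rough check that a document fits the 128k context window.

    chars_per_token ~4 is a crude heuristic for English prose; tokenize
    with the model's actual tokenizer when precision matters.
    """
    est_tokens = len(text) / chars_per_token
    return est_tokens <= ctx_tokens - reserve_for_output

# E.g., a ~300-page book at ~2,000 characters per page (~150k tokens)
# overflows, while a ~40k-character memo (~10k tokens) fits easily.
book = "x" * (300 * 2000)
memo = "x" * 40_000
```

Reserving a few thousand tokens for the generated answer avoids silently truncating the prompt once the model's own output is appended.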

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model tune the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
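A sketch of how these parameters map onto a generation config. The values below are generic illustrative placeholders, not the actual popular Featherless configurations (which are not listed on this card); note also that `frequency_penalty`/`presence_penalty` are OpenAI-API-style knobs while `repetition_penalty`/`min_p` are Hugging Face-style, so which subset applies depends on the serving API:

```python
# Illustrative placeholder values only -- not the card's top-3 configs.
sampler_config = {
    "temperature": 0.7,          # softens the output distribution
    "top_p": 0.9,                # nucleus sampling cutoff
    "top_k": 50,                 # keep only the k most likely tokens
    "frequency_penalty": 0.0,    # OpenAI-style: penalize by count of prior uses
    "presence_penalty": 0.0,     # OpenAI-style: penalize any prior use
    "repetition_penalty": 1.1,   # HF-style multiplicative repeat penalty
    "min_p": 0.05,               # drop tokens below 5% of the top probability
}
```

Starting from moderate values like these and adjusting one parameter at a time makes it easier to attribute changes in output quality to a specific sampler.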