LargeWorldModel/LWM-Text-Chat-1M

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4K · Published: Feb 7, 2024 · Architecture: Transformer · Cold start: 0.2K

LWM-Text-Chat-1M is a 7 billion parameter open-source auto-regressive language model developed by LargeWorldModel, based on the LLaMA-2 architecture. It was trained on a specialized subset of Books3 consisting of documents that exceed 1 million tokens, and is designed for text-based chat applications. This training on exceptionally long documents makes it well suited to tasks requiring extensive context understanding and generation.


Overview

LWM-Text-Chat-1M is a 7 billion parameter open-source language model developed by LargeWorldModel, built upon the LLaMA-2 architecture. Trained in December 2023, this auto-regressive model is designed for chat-based interactions. Its key differentiator is its training data: a subset of Books3 documents, each exceeding 1 million tokens, reflecting a focus on handling and generating very long contexts.

Key Capabilities

  • Long Context Processing: Trained on documents exceeding 1 million tokens, indicating potential for advanced long-range dependency understanding.
  • LLaMA-2 Foundation: Benefits from the robust and widely-used LLaMA-2 architecture.
  • Open-Source: Available for community use and development under the LLAMA 2 Community License.

Good For

  • Extended Conversational AI: Ideal for chat applications requiring the model to maintain coherence and context over very long dialogues.
  • Text Generation with Deep Context: Suitable for tasks where understanding and generating text based on extensive preceding information is crucial.
  • Research and Development: Provides a foundation for exploring language models trained on exceptionally long document contexts.
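The use cases above can be sketched with a minimal chat-prompt helper and a Hugging Face `transformers` loading snippet. Note the assumptions: the USER/ASSISTANT template below is modeled on Vicuna-style chat formats and is an assumption, not the model's documented template, and the generation call is shown but commented out because it downloads the full 7B checkpoint.

```python
# Hedged sketch for chatting with LWM-Text-Chat-1M.
# Assumption: a Vicuna-style "USER: ... ASSISTANT:" template; check the
# model repository for the exact prompt format before relying on this.

def build_chat_prompt(turns):
    """Flatten a list of (role, text) turns into a single prompt string.

    `turns` is a sequence of ("user", text) / ("assistant", text) pairs;
    the prompt ends with "ASSISTANT:" so the model generates the next reply.
    """
    parts = []
    for role, text in turns:
        label = "USER" if role == "user" else "ASSISTANT"
        parts.append(f"{label}: {text}")
    parts.append("ASSISTANT:")
    return "\n".join(parts)


if __name__ == "__main__":
    prompt = build_chat_prompt([("user", "Summarize the plot so far.")])
    print(prompt)

    # Illustrative generation call (downloads the full checkpoint; not run here):
    # from transformers import AutoModelForCausalLM, AutoTokenizer
    # tok = AutoTokenizer.from_pretrained("LargeWorldModel/LWM-Text-Chat-1M")
    # model = AutoModelForCausalLM.from_pretrained("LargeWorldModel/LWM-Text-Chat-1M")
    # out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=128)
    # print(tok.decode(out[0], skip_special_tokens=True))
```

For long-dialogue use, the same helper simply accumulates prior turns into `turns` before each call, keeping the full conversation in the prompt.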