In2Training/FILM-7B

Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4K · Published: Apr 15, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

FILM-7B is a 7 billion parameter language model developed by In2Training, based on Mistral-7B-Instruct-v0.2, featuring a 32K context window. It is specifically designed to overcome the 'lost-in-the-middle' problem in long-context processing. The model achieves strong performance on long-context tasks while maintaining its short-context capabilities, making it suitable for applications requiring extensive contextual understanding.


Overview

FILM-7B is a 7 billion parameter large language model (LLM) developed by In2Training, built upon Mistral-7B-Instruct-v0.2. Its primary innovation is the Information-Intensive (In2) training method, which enables the model to make effective use of a 32K token context window and mitigates the common 'lost-in-the-middle' problem, where models struggle to retrieve information from the middle of long inputs.
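Below is a minimal inference sketch. It assumes the weights are loaded with the Hugging Face transformers library and that FILM-7B keeps the `[INST] ... [/INST]` prompt format of its Mistral-7B-Instruct-v0.2 base; exact serving details may differ.

```python
# Minimal inference sketch for In2Training/FILM-7B (assumes the Mistral
# instruct prompt format of the base model; adjust if the model card says otherwise).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "In2Training/FILM-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 7B parameters fit comfortably in bf16 on a single modern GPU
    device_map="auto",
)

long_document = "..."  # placeholder: up to ~32K tokens of context
question = "What deadline is mentioned near the middle of the document?"

prompt = f"[INST] {long_document}\n\n{question} [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, not the echoed prompt.
answer = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(answer)
```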

Key Capabilities

  • Extended Context Understanding: Designed to overcome the 'lost-in-the-middle' issue, enabling more reliable retrieval of information from anywhere in very long inputs (see the probe sketch after this list).
  • Strong Performance on Long-Context Tasks: Achieves state-of-the-art performance among ~7B-parameter LLMs on real-world long-context tasks.
  • Maintained Short-Context Performance: Ensures that its enhanced long-context abilities do not compromise its performance on standard short-context tasks.
  • Research-Oriented: Developed for research purposes, as detailed in its accompanying paper.
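To illustrate the 'lost-in-the-middle' behavior this model targets, the sketch below buries a known fact at varying depths inside filler text and checks whether the answer is still recovered. This is a simplified needle-in-a-haystack style probe, not the evaluation protocol from the FILM-7B paper; `generate` stands for any prompt-to-answer callable, such as one wrapping the inference code above.

```python
# Simplified needle-in-a-haystack probe (illustrative only, not the paper's
# official evaluation). `generate` is any callable mapping a prompt string
# to the model's answer string.
def probe_lost_in_the_middle(generate, depths=(0.0, 0.25, 0.5, 0.75, 1.0),
                             filler_sentences=2000):
    needle = "The secret access code is 7319."
    question = "What is the secret access code?"
    for depth in depths:
        filler = ["Nothing of note happened in the valley that afternoon."] * filler_sentences
        filler.insert(int(depth * filler_sentences), needle)  # bury the fact at this relative depth
        context = " ".join(filler)
        answer = generate(f"[INST] {context}\n\n{question} [/INST]")
        print(f"depth={depth:.2f}  answered correctly: {'7319' in answer}")
```

A model that suffers from the 'lost-in-the-middle' effect will typically answer correctly at depths near 0.0 and 1.0 but fail around 0.5; FILM-7B's In2 training is intended to keep accuracy flat across depths.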

Good For

  • Applications requiring deep understanding and extraction from extensive documents or conversations.
  • Research into long-context LLM behavior and mitigation of 'lost-in-the-middle' effects.
  • Tasks where maintaining performance across both short and long contexts is crucial.