mooli/router-sft-merged

Text Generation · Concurrency Cost: 1 · Model Size: 2B · Quant: BF16 · Ctx Length: 32k · Published: Apr 23, 2026 · Architecture: Transformer

mooli/router-sft-merged is a 2 billion parameter general-purpose language model from mooli with a 32768 token context length. It targets natural language processing tasks where a smaller, efficient model with a long context window is beneficial.


Overview

mooli/router-sft-merged is a 2 billion parameter language model developed by mooli. Its 32768 token context length makes it suitable for tasks that require extensive contextual understanding, and it is designed as a general-purpose model, adaptable across a range of natural language processing applications.

Key Capabilities

  • General-purpose language understanding: handles diverse NLP tasks.
  • Extended context window: processes inputs up to 32768 tokens, useful for long-form content analysis or generation.
  • Efficient size: at 2 billion parameters, it balances output quality against compute cost (see the loading sketch after this list).
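
For concreteness, here is a minimal loading sketch. It assumes the checkpoint is published in a Hugging Face Transformers-compatible format; the model ID comes from this page, while the loading pattern and the prompt are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mooli/router-sft-merged"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 precision listed above
    device_map="auto",           # places the 2B model on available hardware
)

prompt = "Explain what a long context window is useful for."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```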

Good For

  • Applications requiring a smaller, more efficient language model.
  • Tasks that benefit from a long context window, such as summarization of lengthy documents, detailed question answering, or maintaining coherence over extended conversations (a context-budget sketch follows this list).
  • General natural language processing tasks where a versatile model is needed.
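
For long-document work, it helps to verify that the prompt fits the 32768 token window before generating. The sketch below reuses the tokenizer and model from the loading example above; the file name and the output budget are illustrative assumptions, not part of this model card.

```python
MAX_CONTEXT = 32768   # context length listed above
OUTPUT_BUDGET = 512   # illustrative room reserved for generated tokens

document = open("report.txt").read()  # hypothetical long input file
prompt = f"Summarize the following document:\n\n{document}\n\nSummary:"

input_ids = tokenizer(prompt, return_tensors="pt").input_ids
limit = MAX_CONTEXT - OUTPUT_BUDGET
if input_ids.shape[1] > limit:
    # Keep the most recent tokens so the trailing instruction and the
    # document's tail survive; a production pipeline might chunk instead.
    input_ids = input_ids[:, -limit:]

outputs = model.generate(input_ids.to(model.device), max_new_tokens=OUTPUT_BUDGET)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))
```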