Menlo/Jan-nano-128k

Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Jun 25, 2025 · License: apache-2.0 · Architecture: Transformer

Menlo/Jan-nano-128k is a 4 billion parameter compact language model developed by Alan Dao and Bach Vu Dinh, designed for research applications. This model features a native 128k context window, enabling efficient processing of extensive documents and complex multi-turn conversations without typical performance degradation. It excels at deep document analysis and multi-document synthesis, making it suitable for research requiring complex reasoning over large information sets. Jan-nano-128k maintains compatibility with Model Context Protocol (MCP) servers and shows improved performance with longer contexts compared to its predecessor.


Jan-Nano-128k: Extended Context for Deep Research

Jan-Nano-128k, developed by Alan Dao and Bach Vu Dinh, is a 4 billion parameter compact language model specifically engineered for advanced research applications. Building on the Jan-Nano series, this version introduces a native 128k context window, a significant improvement that allows for processing entire research papers, lengthy documents, and complex multi-turn conversations efficiently.

Key Capabilities & Differentiators

  • Native 128k Context Window: Unlike models relying on context extension methods like YaRN, Jan-Nano-128k is built from the ground up to handle long contexts, maintaining performance across the full range without degradation.
  • Enhanced Performance: The model demonstrates improved performance with longer contexts, as evidenced by evaluations on the SimpleQA benchmark, surpassing its predecessor.
  • Deep Research: Optimized for tasks requiring deep document analysis, multi-document synthesis, and complex reasoning over large information sets.
  • MCP Compatibility: Fully compatible with Model Context Protocol (MCP) servers.

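Even with a native 128k window, a caller still has to budget tokens when packing multiple documents into one prompt. A minimal sketch of that budgeting (illustrative only: the ~4-characters-per-token heuristic and the helper names are assumptions, not part of the model's tooling; a real tokenizer gives exact counts):

```python
# Rough token budgeting for a multi-document prompt aimed at a
# 128k-context model such as Jan-Nano-128k. The 4 chars/token
# estimate is a coarse assumption, not the model's tokenizer.

CONTEXT_WINDOW = 128_000       # native context length in tokens
RESERVED_FOR_OUTPUT = 4_000    # leave room for the model's answer

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token."""
    return len(text) // 4 + 1

def pack_documents(docs: list[str], question: str) -> str:
    """Concatenate as many whole documents as fit the token budget."""
    budget = CONTEXT_WINDOW - RESERVED_FOR_OUTPUT - estimate_tokens(question)
    parts = []
    for doc in docs:
        cost = estimate_tokens(doc)
        if cost > budget:
            break  # skip the rest rather than truncating mid-document
        parts.append(doc)
        budget -= cost
    return "\n\n".join(parts) + "\n\nQuestion: " + question

prompt = pack_documents(["paper one " * 1000, "paper two " * 1000],
                        "Compare the two papers' methods.")
```

Dropping whole documents that overflow the budget, rather than truncating them mid-text, keeps each source intact for the kind of multi-document synthesis the model targets.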
Why Choose Jan-Nano-128k?

Traditional context extension often leads to performance issues as context length increases. Jan-Nano-128k addresses this by offering a native long context, making it ideal for scenarios where comprehensive understanding of extensive textual data is critical. This model is particularly suited for academic and industrial research environments that demand robust processing of large volumes of information.

Popular Sampler Settings

The top three parameter combinations used by Featherless users for this model adjust the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
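These knobs map onto fields of an OpenAI-compatible completion request, which is how Featherless-style endpoints are typically called. A minimal sketch of assembling such a payload (the values below are illustrative placeholders, not the actual top configurations, which are not listed here; the `build_payload` helper is an assumption for the example):

```python
# Build a chat-completion payload for an OpenAI-compatible endpoint,
# wiring in the sampler parameters listed above. Values are
# illustrative placeholders, not recommended settings.

def build_payload(prompt: str, **samplers) -> dict:
    """Return a request body, rejecting unknown sampler names early."""
    allowed = {
        "temperature", "top_p", "top_k", "frequency_penalty",
        "presence_penalty", "repetition_penalty", "min_p",
    }
    unknown = set(samplers) - allowed
    if unknown:
        raise ValueError(f"unsupported sampler parameter(s): {sorted(unknown)}")
    return {
        "model": "Menlo/Jan-nano-128k",
        "messages": [{"role": "user", "content": prompt}],
        **samplers,
    }

payload = build_payload(
    "Summarize this paper.",
    temperature=0.7, top_p=0.9, min_p=0.05,
)
```

Validating sampler names before sending the request surfaces typos locally instead of as an opaque server-side error.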