friday-and-co/Qwen3.5-9B

VISION
Concurrency Cost: 1
Model Size: 9B
Quant: FP8
Ctx Length: 32k
Tool Calling: Supported
Published: Jun 20, 2026
License: apache-2.0
Architecture: Transformer
Open Weights
Cold

The friday-and-co/Qwen3.5-9B model is a 9-billion-parameter language model based on the Qwen architecture, developed by Qwen. This version addresses a common inference issue by including a `generation_config.json` file, so that generation stops correctly at the end of each turn in multi-turn conversations. It suits applications that need precise control over conversational turn-taking and must avoid runaway generation in multi-turn or tool-use scenarios.


Overview

friday-and-co/Qwen3.5-9B is a 9-billion-parameter language model derived from the original Qwen/Qwen3.5-9B checkpoint. Its main difference is the inclusion of a generation_config.json file, which is absent from the upstream Qwen 4B/9B models.

Key Enhancements

  • Correct Generation Termination: The added generation_config.json lists both <|im_end|> (248046) and <|endoftext|> (248044) under eos_token_id. Inference engines such as SGLang and vLLM therefore halt generation at the end of each conversational turn.
  • Prevents Runaway Generation: Because the chat turn terminator <|im_end|> is recognized as an end-of-sequence token, this variant avoids the runaway generation that can occur in multi-turn dialogues or tool-use rollouts when only <|endoftext|> is used for termination.
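
The effect of the two end-of-sequence IDs can be illustrated with a minimal Python sketch. It does not call any inference engine: `truncate_at_eos` is a hypothetical helper that mimics how a decoder stops at the first EOS token, and the token stream is made up for illustration. Only the two IDs, 248046 and 248044, come from this card.

```python
IM_END_ID = 248046     # <|im_end|>, as stated on this card
ENDOFTEXT_ID = 248044  # <|endoftext|>, as stated on this card


def truncate_at_eos(token_ids, eos_token_ids):
    """Return the tokens that precede the first EOS id (hypothetical helper)."""
    eos = set(eos_token_ids)
    kept = []
    for tok in token_ids:
        if tok in eos:
            break  # a decoder would stop generating here
        kept.append(tok)
    return kept


# Made-up stream: a three-token reply, the turn terminator, then tokens
# a model might keep producing if nothing stopped it.
stream = [101, 102, 103, IM_END_ID, 201, 202, ENDOFTEXT_ID, 301]

# Only <|endoftext|> is treated as EOS: generation runs past the turn boundary.
only_endoftext = truncate_at_eos(stream, [ENDOFTEXT_ID])

# Both IDs are treated as EOS: generation stops at the end of the reply.
both = truncate_at_eos(stream, [IM_END_ID, ENDOFTEXT_ID])

print(only_endoftext)
print(both)
```

With only <|endoftext|> configured, the kept tokens include <|im_end|> and everything after it up to <|endoftext|>; with both IDs configured, only the reply itself is kept.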

Ideal Use Cases

  • Multi-turn Chatbots: Ensures each model response ends cleanly at its turn boundary.
  • Tool-use Scenarios: Facilitates precise control over generation boundaries when integrating with external tools.
  • Applications requiring reliable turn-taking: Any application where explicit control over the end of a model's output in a sequence is critical.
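
For use cases like these, it may be worth checking that a checkpoint's generation_config.json actually lists both terminators before deployment. The sketch below is one way to do that with the standard library only. `missing_eos_ids` is a hypothetical helper, and it assumes the file follows the usual Hugging Face layout in which `eos_token_id` is either a single integer or a list of integers.

```python
import json

# The two terminator IDs named on this card.
REQUIRED_EOS_IDS = {248046, 248044}  # <|im_end|>, <|endoftext|>


def missing_eos_ids(config_text, required=REQUIRED_EOS_IDS):
    """Return the required EOS ids absent from a generation_config.json string."""
    config = json.loads(config_text)
    eos = config.get("eos_token_id")
    if eos is None:
        eos = []        # key missing or null: nothing is configured
    elif isinstance(eos, int):
        eos = [eos]     # assumed layout: a single id may be stored as a bare int
    return sorted(set(required) - set(eos))


# Illustrative inputs, not copied from any real repository.
print(missing_eos_ids('{"eos_token_id": [248046, 248044]}'))
print(missing_eos_ids('{"eos_token_id": 248044}'))
```

An empty result means both terminators are configured. A non-empty result names the IDs that would need to be added or passed as stop tokens at request time.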