Name: friday-and-co/Qwen3.5-4B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: friday-and-co

Qwen3.5-4B: Enhanced Generation Configuration

This model, friday-and-co/Qwen3.5-4B, is a 4.5 billion parameter variant of the Qwen3.5 architecture. It differs from the upstream Qwen/Qwen3.5-4B mainly by including a generation_config.json file. This small addition fixes a stop-token problem that affects multi-turn and tool-use applications.

Key Enhancements:

Correct Stop Token Handling: The added generation_config.json explicitly sets eos_token_id to [248046, 248044], corresponding to <|im_end|> and <|endoftext|>. Inference engines therefore treat both tokens as stop tokens.
Prevents Runaway Generation: Without this configuration, engines such as SGLang or vLLM would default to <|endoftext|> as the only stop token, so generation would continue past the <|im_end|> that closes a chat turn or tool call.
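
As a rough illustration of what such a file might contain, the sketch below writes a minimal generation_config.json listing both stop-token IDs. Only the eos_token_id values come from the description above; the helper names build_generation_config and write_generation_config are hypothetical, and the actual file in the repository may carry additional fields.

```python
import json
import tempfile
from pathlib import Path

# Token IDs as stated in the model description.
IM_END_ID = 248046      # <|im_end|>
ENDOFTEXT_ID = 248044   # <|endoftext|>


def build_generation_config() -> dict:
    """Hypothetical helper: a minimal config that lists both stop tokens."""
    return {"eos_token_id": [IM_END_ID, ENDOFTEXT_ID]}


def write_generation_config(directory: Path) -> Path:
    """Hypothetical helper: write the config next to the model weights."""
    path = directory / "generation_config.json"
    path.write_text(json.dumps(build_generation_config(), indent=2))
    return path


if __name__ == "__main__":
    # Write to a throwaway directory and show the resulting JSON.
    with tempfile.TemporaryDirectory() as tmp:
        written = write_generation_config(Path(tmp))
        print(written.read_text())
```

An engine that reads eos_token_id as a list can then stop on either ID instead of only on <|endoftext|>.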

Ideal Use Cases:

Multi-turn Chatbots: Ensures proper conversation flow by ending generation at the close of each assistant turn.
Tool-use Agents: Facilitates accurate response parsing by stopping generation at the intended end of a tool call or response.
Applications requiring precise generation control: Any scenario where explicit control over generation termination is crucial for correct model behavior.
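
The effect of the stop-token list on these use cases can be sketched with a toy simulation. This is not how any particular engine is implemented: truncate_at_stop is a hypothetical helper, and every token ID other than 248046 and 248044 is a made-up placeholder.

```python
from typing import Iterable, List, Set

IM_END_ID = 248046      # <|im_end|>
ENDOFTEXT_ID = 248044   # <|endoftext|>


def truncate_at_stop(token_ids: Iterable[int], stop_ids: Set[int]) -> List[int]:
    """Hypothetical helper: keep tokens up to, but excluding, the first stop token."""
    kept: List[int] = []
    for token_id in token_ids:
        if token_id in stop_ids:
            break
        kept.append(token_id)
    return kept


# Placeholder stream: two reply tokens, <|im_end|>, then unwanted extra tokens.
stream = [11, 22, IM_END_ID, 33, 44, ENDOFTEXT_ID]

# With both stop tokens, output ends at the close of the turn.
both = truncate_at_stop(stream, {IM_END_ID, ENDOFTEXT_ID})

# With only <|endoftext|>, <|im_end|> and the tokens after it leak through.
endoftext_only = truncate_at_stop(stream, {ENDOFTEXT_ID})

if __name__ == "__main__":
    print(both)            # [11, 22]
    print(endoftext_only)  # [11, 22, 248046, 33, 44]
```

In the second case, a tool-call parser would receive trailing tokens that were never meant to be part of the response.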

Full Model Card (README)