Osama2/mirage-qwen3-4b-text

TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 1, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

Osama2/mirage-qwen3-4b-text is a 4 billion parameter causal language model derived from Qwen/Qwen3-VL-4B-Instruct, with its vision tower removed for text-only applications. This model is specifically fine-tuned for VSP Spatial Planning tasks, achieving 86.5% accuracy on its test set when integrated with its original vision component. It features a 32K context length and utilizes a unique "Mirage latent thinking" output format, making it suitable for specialized text generation where structured, pre-answer processing is beneficial.

Loading preview...

Mirage-Qwen3-4B Text-Only Overview

Osama2/mirage-qwen3-4b-text is a specialized 4 billion parameter language model, an export of the Mirage Stage-2 checkpoint. It is based on Qwen/Qwen3-VL-4B-Instruct but has had its vision tower removed, making it a text-only model. This allows it to load with standard AutoModelForCausalLM and vLLM text backends without requiring a Vision-Language processor.

Key Capabilities & Features

  • Text-Only Operation: Optimized for text-based tasks, leveraging the language model weights from its Qwen3-VL base.
  • VSP Spatial Planning: The original VL checkpoint achieved 86.5% accuracy (346/400) on the VSP Spatial Planning test set, indicating its strong foundation for structured reasoning.
  • Mirage Latent Thinking: Employs a unique output format with a short latent prefix (e.g., <|latent_start|><|latent_pad|><|latent_end|>) before its answer. This prefix needs to be stripped during parsing, with a provided Python snippet for convenience.
  • Standard Loading: Compatible with transformers AutoModelForCausalLM and vLLM for straightforward deployment.
  • Apache-2.0 License: Inherits the permissive Apache-2.0 license from its base model.

Good For

  • Applications requiring a compact 4B parameter model for text generation.
  • Use cases where the "Mirage latent thinking" output format can be leveraged for structured responses or internal processing.
  • Tasks that benefit from a model with a strong foundation in spatial planning and structured reasoning, even in a text-only context.