MerlinSafety/Qwen3.5-4B-Safety-Thinking

  • Tags: Vision, Open Weights, Cold
  • Concurrency Cost: 1
  • Model Size: 4.5B
  • Quant: BF16
  • Ctx Length: 32k
  • Tool Calling: Supported
  • Published: Mar 2, 2026
  • License: apache-2.0
  • Architecture: Transformer

MerlinSafety/Qwen3.5-4B-Safety-Thinking is a 4.5-billion-parameter language model developed by Merlin Research, based on Qwen/Qwen3.5-4B, with a context length of 32,768 tokens. It is optimized for structured reasoning quality, strict instruction adherence, and safety-aligned behavior in practical assistant and autonomous agent workflows. The fine-tuning targets robustness against misalignment patterns and adversarial inputs, making the model suitable for safety-critical AI applications.


Model Overview

MerlinSafety/Qwen3.5-4B-Safety-Thinking is a 4.5-billion-parameter language model developed by Merlin Research, built on the Qwen/Qwen3.5-4B base model. It was trained with LoRA-based Supervised Fine-Tuning (SFT) to improve safety reasoning and controllability.

Key Capabilities

  • Structured Reasoning Quality: Designed to improve step-by-step problem-solving and complex task breakdown.
  • Instruction Adherence: Tuned to follow strict guidelines and constraints stated in the prompt.
  • Safety-Aligned Behavior: Optimized for reliable and safe operation in real-world assistant and autonomous agent applications.
  • Robustness: Increased resistance to common misalignment patterns and adversarial inputs.
  • Native Reasoning Architecture: Supports and normalizes the <think>...</think> format to explicitly separate reasoning from final output.
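
Because the model separates reasoning from the final output with <think>...</think>, a client will usually want to split the two before showing a reply to a user. The snippet below is a minimal sketch of that post-processing step, not part of the model's own tooling: the names `split_reasoning` and `ParsedReply` are hypothetical, and it assumes the tags arrive as literal, well-formed, non-nested text in the generated string.

```python
import re
from typing import NamedTuple


class ParsedReply(NamedTuple):
    """Hypothetical container for the two parts of a model reply."""
    reasoning: str
    answer: str


# Non-greedy match so several <think> blocks are captured separately;
# DOTALL lets the reasoning span multiple lines.
_THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> ParsedReply:
    """Separate <think>...</think> content from the final answer."""
    blocks = _THINK_BLOCK.findall(text)
    reasoning = "\n".join(block.strip() for block in blocks)
    # Everything outside the tags is treated as the final answer.
    answer = _THINK_BLOCK.sub("", text).strip()
    return ParsedReply(reasoning, answer)


sample = "<think>2 + 2 is basic arithmetic.</think>\nThe answer is 4."
parsed = split_reasoning(sample)
print(parsed.reasoning)
print(parsed.answer)
```

A reply without any <think> block comes back with an empty `reasoning` field and the full text as `answer`, so the same function can be applied to every response.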

Good For

  • Building safety-oriented reasoning assistants and chatbots.
  • Tasks requiring strict, constrained instruction-following.
  • Experimentation in AI alignment, safety research, and robustness testing.
  • Agentic workflows demanding predictable and safe autonomous behavior.