OsakanaTeishoku/Qwen3-4B-Thinking-2507-reasoning-ja-20260328

Text Generation · Model Size: 4B · Quantization: BF16 · Context Length: 32k · Published: Mar 28, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

OsakanaTeishoku/Qwen3-4B-Thinking-2507-reasoning-ja-20260328 is a 4-billion-parameter Qwen3 model developed by OsakanaTeishoku. It was fine-tuned with Unsloth and Hugging Face's TRL library, which is reported to enable 2x faster training. The model targets reasoning tasks, with a particular focus on Japanese, and supports a context length of 32,768 tokens.


Model Overview

OsakanaTeishoku/Qwen3-4B-Thinking-2507-reasoning-ja-20260328 is a 4 billion parameter language model based on the Qwen3 architecture. Developed by OsakanaTeishoku, this model was fine-tuned from unsloth/Qwen3-4B-Thinking-2507 with a specific emphasis on reasoning capabilities in Japanese.
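As a quick orientation, here is a minimal sketch of loading the model through the standard transformers API and posing a short Japanese reasoning question. The prompt and generation settings are illustrative assumptions, not settings published with this model.

```python
# Minimal sketch (not an official example): load the model via the standard
# transformers API and ask a short Japanese reasoning question.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OsakanaTeishoku/Qwen3-4B-Thinking-2507-reasoning-ja-20260328"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# "What is the next term in this sequence? 2, 6, 12, 20, ..."
messages = [{"role": "user", "content": "次の数列の次の項は何ですか? 2, 6, 12, 20, ..."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=1024)
generated_text = tokenizer.decode(
    output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
)
print(generated_text)
```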

Key Characteristics

  • Architecture: Qwen3-based, with 4 billion parameters.
  • Training Efficiency: Fine-tuned with Unsloth and Hugging Face's TRL library, reported to deliver a 2x training speedup.
  • Context Length: Supports a 32,768-token context window.
  • Language Focus: Optimized for reasoning tasks, particularly in Japanese; as a "Thinking" variant, it is expected to emit an explicit reasoning trace before its final answer (see the parsing sketch below).
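
Because this is a "Thinking" model, a generation typically consists of a reasoning trace terminated by a `</think>` marker, followed by the final answer. The helper below is a small, self-contained sketch for separating the two; it assumes the standard Qwen3-Thinking output format, which should be verified against this model's actual chat template.

```python
# Sketch: split a Qwen3-Thinking-style generation into its reasoning trace and
# final answer. Assumes the standard "</think>" delimiter used by Qwen3
# thinking models; verify against this model's actual chat template.
def split_thinking(text: str) -> tuple[str, str]:
    marker = "</think>"
    if marker in text:
        thinking, _, answer = text.partition(marker)
        return thinking.strip(), answer.strip()
    return "", text.strip()  # no marker found; treat everything as the answer


# Illustrative output: "Differences are 4, 6, 8, so the next is 10. </think>
# The next term is 30."
demo = "差は 4, 6, 8 なので次の差は 10。</think>次の項は 30 です。"
thinking, answer = split_thinking(demo)
print(thinking)  # reasoning trace
print(answer)    # final answer only
```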

Intended Use Cases

This model is well-suited for applications requiring:

  • Japanese Language Reasoning: Tasks that involve understanding and generating logical inferences or responses in Japanese.
  • Efficient Deployment: Its 4B parameter size, combined with efficient fine-tuning methods, makes it a candidate for resource-constrained deployments.
  • Research and Development: As a fine-tuned Qwen3 variant, it can serve as a base for further experimentation in Japanese NLP and reasoning (a fine-tuning sketch follows this list).
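
For further experimentation, the sketch below outlines continued LoRA fine-tuning with the same Unsloth + TRL toolchain this card credits for the original training run. The dataset identifier and hyperparameters are placeholders, and both libraries' APIs shift between releases, so treat this as a starting point rather than the author's recipe.

```python
# Rough sketch of continued LoRA fine-tuning with Unsloth + TRL, the toolchain
# this model card credits. Dataset id and hyperparameters are placeholders;
# both libraries' APIs vary between releases, so consult their current docs.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="OsakanaTeishoku/Qwen3-4B-Thinking-2507-reasoning-ja-20260328",
    max_seq_length=32768,  # matches the advertised context window
    load_in_4bit=True,     # QLoRA-style quantized loading for a single GPU
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Placeholder dataset id: substitute your own Japanese reasoning corpus.
dataset = load_dataset("your-org/your-ja-reasoning-dataset", split="train")

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,  # older TRL releases use `tokenizer=` instead
    train_dataset=dataset,
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=100,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```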