OsakanaTeishoku/Qwen3-4B-Thinking-2507-reasoning-ja-20260328
Text generation · 4B parameters · BF16 · 32k context · Published: Mar 28, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights
OsakanaTeishoku/Qwen3-4B-Thinking-2507-reasoning-ja-20260328 is a 4 billion parameter Qwen3 model developed by OsakanaTeishoku. It was fine-tuned using Unsloth and Hugging Face's TRL library, a combination the authors report yields roughly 2x faster training. The model targets reasoning tasks with a focus on Japanese language processing and supports a context length of 32768 tokens.
Model Overview
OsakanaTeishoku/Qwen3-4B-Thinking-2507-reasoning-ja-20260328 is a 4 billion parameter language model based on the Qwen3 architecture. Developed by OsakanaTeishoku, this model was fine-tuned from unsloth/Qwen3-4B-Thinking-2507 with a specific emphasis on reasoning capabilities in Japanese.
Key Characteristics
- Architecture: Qwen3-based, with 4 billion parameters.
- Training Efficiency: Fine-tuned using Unsloth and Hugging Face's TRL library, which the authors report gave a 2x speedup during training.
- Context Length: Supports a substantial context window of 32768 tokens.
- Language Focus: Optimized for reasoning tasks, particularly within the Japanese language.
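As a Qwen3 chat variant, the model expects conversations rendered through its tokenizer's chat template. The ChatML-style layout below is an assumption based on the broader Qwen3 family, not something stated in this card; in practice you would call `tokenizer.apply_chat_template` from transformers rather than formatting by hand. A minimal sketch of the expected prompt shape:

```python
# Sketch of a ChatML-style prompt as used by the Qwen3 family (assumed layout;
# prefer tokenizer.apply_chat_template in real code).
def build_prompt(messages: list[dict]) -> str:
    """Render a list of {"role": ..., "content": ...} messages into ChatML text."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Leave the assistant turn open so the model continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_prompt([{"role": "user", "content": "2たす3は？"}])
print(prompt)
```

The open `assistant` turn at the end is what cues the model to generate its response (including, for a Thinking variant, its reasoning trace).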
Intended Use Cases
This model is well-suited for applications requiring:
- Japanese Language Reasoning: Tasks that involve understanding and generating logical inferences or responses in Japanese.
- Efficient Deployment: Its 4B parameter size, combined with efficient fine-tuning methods, makes it a candidate for resource-constrained deployments.
- Research and Development: As a fine-tuned Qwen3 variant, it can serve as a base for further experimentation in Japanese NLP and reasoning.
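Thinking-series Qwen3 models interleave a reasoning trace with the final answer in their generated text; by convention the trace ends at a `</think>` marker. That delimiter is an assumption based on the Qwen3-Thinking family and is not documented in this card, so the helper below treats it defensively, falling back to returning the whole output as the answer when no marker is found:

```python
def split_reasoning(output: str) -> tuple[str, str]:
    """Split generated text into (reasoning, answer).

    Assumes the Qwen3-Thinking convention where the reasoning trace is
    terminated by a '</think>' marker; if the marker is absent, the whole
    output is treated as the answer.
    """
    marker = "</think>"
    if marker in output:
        reasoning, _, answer = output.partition(marker)
        return reasoning.strip(), answer.strip()
    return "", output.strip()

reasoning, answer = split_reasoning("まず2+3を計算する。</think>答えは5です。")
```

Separating the trace this way is useful when only the final answer should be shown to end users while the reasoning is logged for evaluation.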