flashresearch/FlashResearch-4B-Thinking
Text generation · Concurrency cost: 1 · Model size: 4B · Quantization: BF16 · Context length: 32k · Published: Oct 1, 2025 · License: MIT · Architecture: Transformer · Open weights
FlashResearch-4B-Thinking is a 4-billion-parameter Qwen-based model released by flashresearch, distilled from the Tongyi DeepResearch-30B-A3B mixture-of-experts model. It is optimized for web-scale deep-research tasks: browsing, multi-step reasoning, and producing source-grounded answers. The model targets efficient inference, particularly when integrated with the Alibaba-NLP/DeepResearch framework, making it suitable for fast, low-cost agent runs.
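Since the model is Qwen-based, prompts for it would typically follow the ChatML layout used by the Qwen family. A minimal sketch of assembling such a prompt by hand (the template here is the standard Qwen ChatML format, assumed rather than stated on this card; the function name is hypothetical):

```python
def build_chat_prompt(messages: list[dict]) -> str:
    """Assemble a ChatML-style prompt from role/content message dicts.

    Assumes the standard Qwen template: each turn is wrapped in
    <|im_start|>{role}\n{content}<|im_end|>, and the prompt ends with
    an open assistant turn for the model to complete.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # Leave the assistant turn open so generation continues from here.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)


messages = [
    {"role": "system", "content": "You are a deep-research assistant."},
    {"role": "user", "content": "Summarize the sources on topic X."},
]
prompt = build_chat_prompt(messages)
print(prompt)
```

In practice, when serving the model through Hugging Face `transformers`, the tokenizer's `apply_chat_template` method handles this formatting automatically and should be preferred over manual assembly.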