flashresearch/FlashResearch-4B-Thinking

Hugging Face
Text generation · Model size: 4B · Quant: BF16 · Context length: 32k · Published: Oct 1, 2025 · License: MIT · Architecture: Transformer

FlashResearch-4B-Thinking is a 4-billion-parameter Qwen model from flashresearch, distilled from the Tongyi DeepResearch-30B A3B MoE model. It is optimized for web-scale deep-research tasks, including browsing, multi-step reasoning, and source-grounded answers. The model targets efficient inference, particularly when integrated with the Alibaba-NLP/DeepResearch framework, making it suitable for fast, low-cost agent runs.


Overview

FlashResearch-4B-Thinking is a dense 4-billion-parameter Qwen model distilled from the larger Tongyi DeepResearch-30B A3B MoE model. The distillation used 33,000 curated deep-research examples from the flashresearch/FlashResearch-DS-33k dataset. The model is primarily intended for integration with the Alibaba-NLP/DeepResearch framework.
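Outside of the DeepResearch framework, the weights can also be loaded as a standard causal language model. A minimal sketch with Hugging Face `transformers` (the `bfloat16` dtype and `device_map` settings are illustrative assumptions, not values published by the model authors):

```python
# Hedged sketch: loading FlashResearch-4B-Thinking as a plain
# transformers causal LM. Repo IDs are taken from the model card;
# all generation/loading settings here are assumptions.
MODEL_ID = "flashresearch/FlashResearch-4B-Thinking"
DATASET_ID = "flashresearch/FlashResearch-DS-33k"  # distillation data

def load_model():
    """Download and return (tokenizer, model); needs a GPU for BF16."""
    # Imported inside the function so the constants above can be used
    # without pulling in transformers (pip install transformers torch).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="bfloat16",   # matches the card's BF16 quant listing
        device_map="auto",        # place layers on available devices
    )
    return tokenizer, model
```

For the agent workflows described below, the DeepResearch repository's own entry points should be preferred over raw `generate()` calls.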

Key Capabilities

  • Web-scale Deep Research: Optimized for tasks requiring extensive information retrieval and synthesis.
  • Multi-step Reasoning: Designed to handle complex queries involving multiple logical steps.
  • Source-Grounded Answers: Focuses on providing responses backed by identified sources.
  • Efficient Inference: Engineered for fast and low-cost operation, suitable for agent-based applications.

Recommended Use

This model is designed to be used directly with the Alibaba-NLP/DeepResearch repository for agent runs. It offers a cost-effective path to deploying deep-research agents: FP16 inference fits on a single 12-16 GB GPU, and quantization lowers the requirement further.
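The hardware claim is easy to sanity-check with a back-of-envelope weight-memory estimate. A small sketch (the 1.2x overhead factor for KV cache and activations is a rough assumption):

```python
# Rough VRAM estimate for a 4B-parameter dense model at different
# weight precisions. Overhead factor is an assumed fudge for KV cache
# and activations, not a measured value.
def weight_vram_gb(params_billion: float, bytes_per_param: float,
                   overhead: float = 1.2) -> float:
    """Approximate GPU memory in GB needed to serve the weights."""
    return params_billion * bytes_per_param * overhead

bf16_gb = weight_vram_gb(4, 2.0)   # BF16/FP16: 2 bytes/weight -> ~9.6 GB
int4_gb = weight_vram_gb(4, 0.5)   # 4-bit quantized -> ~2.4 GB
```

At roughly 9.6 GB for BF16 weights plus runtime overhead, the model lands comfortably inside the 12-16 GB range quoted above, and 4-bit quantization brings it within reach of much smaller GPUs.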