THU-KEG/DeepDive-4B-SFT
Text generation | Concurrency cost: 1 | Model size: 4B | Quantization: BF16 | Context length: 32k | Published: Mar 12, 2026 | Architecture: Transformer

THU-KEG/DeepDive-4B-SFT is a 4-billion-parameter instruction-tuned (SFT) model developed by THU-KEG and fine-tuned specifically for deep search agents. As detailed in the associated research paper, it is designed to support robust reinforcement learning through citation-aware rubric rewards, and it specializes in tasks that require evidence chaining and advanced information retrieval. A 32,768-token context length accommodates long, complex queries.
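
Below is a minimal usage sketch with Hugging Face Transformers, assuming the checkpoint is published on the Hub under the id above and is compatible with AutoModelForCausalLM; the prompt, dtype, and generation settings are illustrative only, not the authors' reference setup.

```python
# Minimal sketch: loading and querying THU-KEG/DeepDive-4B-SFT with Transformers.
# Assumes a standard causal-LM checkpoint with a chat template in its tokenizer config.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "THU-KEG/DeepDive-4B-SFT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 precision listed above
    device_map="auto",
)

# Illustrative search-style prompt; adapt to your own agent or retrieval loop.
messages = [
    {"role": "user", "content": "Summarize the current evidence on this topic and cite your sources."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

In an actual deep-search setup, the generate call would sit inside an agent loop that interleaves model turns with retrieval or browsing tool calls; the snippet above only shows a single-turn invocation.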
