Jianwen/Search-7B-SFT
Text generation · Open weights · Cold
- Model size: 7.6B parameters
- Quantization: FP8
- Context length: 32k
- Concurrency cost: 1
- Published: Feb 3, 2026
- License: MIT
- Architecture: Transformer

Jianwen/Search-7B-SFT is a 7.6-billion-parameter cold-start checkpoint for search-based reinforcement-learning environments, developed by Jianwen. The model specializes in search tasks by distilling successful trajectories into strategic patterns and failed ones into lessons. It organizes this knowledge in a hierarchical SKILLBANK with recursive skill evolution, achieving 10-20% token compression while improving reasoning utility for RL agents.
