Yuqian-Fu/SRFT
Text generation · 7.6B parameters · FP8 quantization · 32k context length · MIT license · Transformer architecture · open weights

Yuqian-Fu/SRFT is a 7.6-billion-parameter language model trained with Supervised Reinforcement Fine-Tuning (SRFT), a single-stage fine-tuning method that unifies the supervised and reinforcement learning objectives through entropy-aware weighting, rather than running them as separate stages. The model is intended to leverage this integrated fine-tuning for improved performance across a range of language tasks.
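The description above can be made concrete with a minimal sketch of an entropy-aware blend of a supervised loss and a reinforcement loss. Note the specific weighting rule below (normalized entropy, and its direction) is an illustrative assumption, not the exact formulation used by SRFT; `srft_loss` and `token_entropy` are hypothetical helper names.

```python
import math

def token_entropy(probs):
    """Shannon entropy of a token-level probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def srft_loss(sft_loss, rl_loss, probs):
    """Illustrative entropy-aware weighting of SFT and RL objectives.

    Here a highly uncertain policy (high entropy) leans on the supervised
    signal, and a confident policy (low entropy) leans on the reinforcement
    signal. Both the form and the direction of this rule are assumptions
    for illustration only.
    """
    h = token_entropy(probs)
    h_max = math.log(len(probs))            # maximum possible entropy
    w = h / h_max if h_max > 0 else 0.0     # normalized entropy in [0, 1]
    return w * sft_loss + (1.0 - w) * rl_loss
```

For a uniform next-token distribution the normalized entropy is 1, so the blend reduces to the supervised loss alone; for a one-hot distribution it reduces to the reinforcement loss alone.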
