Model Overview
THU-KEG/IF-Verifier-7B is a 7.6-billion-parameter generative reward model developed by Hao Peng@THUKEG. Fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Qwen-7B, it supports both English and Chinese. Its primary purpose is to verify soft instruction-following constraints in generated text.
Key Capabilities
- Instruction Following Verification: Specifically designed to evaluate adherence to instructions, acting as a critic model.
- Efficiency: Can be deployed on a single H800 GPU, with an average reward-computation time of 120 seconds per batch; multi-GPU setups can reduce this further.
- Performance: Achieves verification results comparable to much larger models, specifically noted to be on par with QwQ-32B.
- Extensive Context: Supports a long context window of 131,072 tokens.
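As a generative reward model, the verifier is prompted with an instruction and a candidate response and produces a free-form judgment, which is then mapped to a scalar reward. The sketch below illustrates this flow; the prompt template and the `Final verdict: YES/NO` output format are assumptions for illustration, not the model's documented interface.

```python
# Hypothetical sketch of using IF-Verifier-7B as a generative critic.
# The prompt template and verdict format are assumptions, not the
# model's documented interface.
import re


def build_verifier_prompt(instruction: str, response: str) -> str:
    """Assemble a critic prompt asking the model to judge adherence."""
    return (
        "You are a strict verifier. Given an instruction and a response, "
        "decide whether the response follows every constraint in the "
        "instruction.\n\n"
        f"Instruction:\n{instruction}\n\n"
        f"Response:\n{response}\n\n"
        "Answer with 'Final verdict: YES' or 'Final verdict: NO'."
    )


def parse_reward(critic_output: str) -> float:
    """Map the critic's free-form judgment to a scalar reward (1.0 / 0.0)."""
    match = re.search(r"Final verdict:\s*(YES|NO)", critic_output, re.IGNORECASE)
    if match is None:
        return 0.0  # unparseable output is treated as non-adherent
    return 1.0 if match.group(1).upper() == "YES" else 0.0


# In practice the prompt would be sent to the model (e.g. via transformers
# or an OpenAI-compatible vLLM server); here a canned reply illustrates
# the reward-extraction step.
canned = "The response satisfies all constraints. Final verdict: YES"
print(parse_reward(canned))  # → 1.0
```

During RL training, this scalar can be combined with rule-based checks on hard constraints, with the generative verifier covering the soft constraints the model is designed for.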
Training Details
The model was trained on 131,000 critic data points from the IF-Verifier-Data dataset. More details, including the research paper, are available in the VerIF GitHub repository.
Good For
- Developers and researchers focused on reinforcement learning from human feedback (RLHF) or similar alignment techniques.
- Applications requiring automated evaluation of instruction adherence in large language model outputs.
- Scenarios where efficient, GPU-friendly deployment of a reward model is crucial.