nvidia/Qwen3-Nemotron-32B-RLBFF
TEXT GENERATION | Concurrency Cost: 2 | Model Size: 32B | Quant: FP8 | Ctx Length: 32k | Published: Oct 12, 2025 | License: nvidia-open-model-license | Architecture: Transformer | 0.0K | Open Weights | Cold

nvidia/Qwen3-Nemotron-32B-RLBFF is a 32-billion-parameter large language model developed by NVIDIA and built on the Qwen/Qwen3-32B foundation model. It is fine-tuned with Reinforcement Learning from Binary Flexible Feedback (RLBFF) to improve the quality of its responses in the default thinking mode. This research model is aimed at multi-turn user queries and shows improved performance over its base model on benchmarks such as Arena Hard V2, WildBench, and MT Bench.
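
Since the model targets multi-turn queries, a minimal sketch of how a conversation might be passed to it is shown here. It assumes the weights load through the Hugging Face `transformers` library (with `accelerate` installed for `device_map="auto"`) and that the tokenizer ships a chat template. Neither is confirmed by this listing. The helper names `build_messages` and `generate_reply` are hypothetical, and loading a 32B model requires substantial GPU memory.

```python
from typing import Dict, List

MODEL_ID = "nvidia/Qwen3-Nemotron-32B-RLBFF"  # model name as given in this listing


def build_messages(turns: List[str]) -> List[Dict[str, str]]:
    """Turn a flat list of turns into chat messages, alternating user/assistant.

    The first turn is treated as the user's, and the conversation must end on a
    user turn so the model has something to answer.
    """
    if len(turns) % 2 == 0:
        raise ValueError("conversation must be non-empty and end on a user turn")
    roles = ("user", "assistant")
    return [{"role": roles[i % 2], "content": text} for i, text in enumerate(turns)]


def generate_reply(messages: List[Dict[str, str]], max_new_tokens: int = 1024) -> str:
    """Generate the next assistant turn (assumes transformers-compatible weights)."""
    # Imported lazily so build_messages works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    # Assumption: the tokenizer provides a chat template for this model.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, dropping the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_reply(build_messages(["Hi", "Hello!", "What is RLBFF?"]))` would request a reply to the third turn. Because the model runs in a thinking mode by default, the decoded text may include a reasoning trace before the final answer.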
