nvidia/Nemotron-Cascade-8B-Thinking
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Dec 8, 2025License:otherArchitecture:Transformer0.0K Warm

Nemotron-Cascade-8B-Thinking is an 8 billion parameter general-purpose language model developed by NVIDIA, post-trained from Qwen3-8B-Base. It is specifically designed for "thinking" mode tasks, leveraging sequential and domain-wise reinforcement learning to achieve best-in-class performance across various reasoning, alignment, mathematical, and coding benchmarks. This model excels in complex reasoning abilities, making it suitable for applications requiring advanced problem-solving and analytical thought.

Loading preview...