nvidia/Nemotron-Cascade-14B-Thinking
Text Generation
Concurrency Cost: 1
Model Size: 14B
Quant: FP8
Ctx Length: 32k
Published: Dec 8, 2025
License: nvidia-open-model-license
Architecture: Transformer
Open Weights

Nemotron-Cascade-14B-Thinking by NVIDIA is a 14-billion-parameter general-purpose model, post-trained from Qwen3-14B Base using sequential, domain-wise reinforcement learning. It is designed specifically for 'thinking' mode tasks and achieves best-in-class results across a wide range of benchmarks, including competitive programming and mathematical reasoning. Its complex reasoning abilities surpass those of much larger models, such as DeepSeek-R1-0528 (671B), on code benchmarks.


Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model span the following sampler parameters:

- temperature
- top_p
- top_k
- frequency_penalty
- presence_penalty
- repetition_penalty
- min_p
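As a sketch of how these sampler parameters are typically passed to a hosted model, the snippet below builds a request payload for an OpenAI-compatible chat completions endpoint. The prompt and all numeric values are illustrative placeholders, not the actual user configurations from the tabs above.

```python
# Hypothetical sketch: packaging sampler settings for an OpenAI-compatible
# chat completions request. The numeric values below are placeholder
# assumptions, not the real "top 3" Featherless configs.
import json

payload = {
    "model": "nvidia/Nemotron-Cascade-14B-Thinking",
    "messages": [{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    # Sampler parameters listed on this page (example values only).
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.05,
    "min_p": 0.0,
}

# Serialize for the HTTP POST body (e.g. to a /v1/chat/completions route).
body = json.dumps(payload)
print(sorted(k for k in payload if k not in ("model", "messages")))
```

Note that `repetition_penalty` and `min_p` are extensions beyond the core OpenAI schema; whether a given server accepts them depends on its API implementation.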