nvidia/OpenCodeReasoning-Nemotron-7B

Text generation · 7.6B parameters · FP8 quantization · 32k context length · Published: Apr 15, 2025 · License: apache-2.0 · Architecture: Transformer · Open weights (Hugging Face)

nvidia/OpenCodeReasoning-Nemotron-7B is a 7.6 billion parameter large language model developed by NVIDIA, derived from Qwen2.5-7B-Instruct. It is post-trained specifically for reasoning in code generation tasks and supports a context length of up to 32,768 tokens. The model performs strongly on competitive programming benchmarks such as LiveCodeBench and CodeContests, making it well suited to advanced code-reasoning applications.


OpenCodeReasoning-Nemotron-7B Overview

OpenCodeReasoning-Nemotron-7B is a 7.6 billion parameter large language model (LLM) developed by NVIDIA, based on the Qwen2.5-7B-Instruct architecture. It is specifically post-trained for reasoning in code generation, making it a specialized tool for complex programming challenges. The model supports an extensive context length of up to 32,768 tokens.
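As a sketch of how the model might be queried with the Hugging Face `transformers` library: the helper below wraps a competitive-programming question in a chat-style message list (the prompt wording follows the style shown on the model's Hugging Face card, and the illustrative sampling setup inside `run_example` is an assumption, not a vendor recommendation). The heavyweight model load is kept inside a function so nothing is downloaded at import time.

```python
def build_messages(question: str) -> list[dict]:
    """Wrap a programming question in a chat-style message list.

    The step-by-step instruction and the request for Python output follow the
    prompt style shown on the model's Hugging Face card.
    """
    prompt = (
        "You are a helpful and harmless assistant. You should think "
        "step-by-step before responding to the instruction below.\n\n"
        "Please use python programming language only.\n\n"
        f"{question}"
    )
    return [{"role": "user", "content": prompt}]


def run_example(question: str):
    """Load the model and generate a reasoned solution.

    Requires `pip install transformers torch` and a GPU with enough memory
    for the 7.6B-parameter weights; imports happen only when called.
    """
    import torch
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="nvidia/OpenCodeReasoning-Nemotron-7B",
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    out = generator(build_messages(question), max_new_tokens=4096)
    # Returns the full conversation (format depends on transformers version).
    return out[0]["generated_text"]
```

Reasoning models emit long chains of thought before the final code, so a generous `max_new_tokens` budget matters more here than for ordinary instruct models.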

Key Capabilities & Performance

This model demonstrates strong performance on competitive programming benchmarks, as detailed in the OpenCodeReasoning paper. The Instruct-derived version achieves an average score of 51.3 on LiveCodeBench and 18.1 on CodeContests, outperforming other distilled 7B-class models such as Bespoke-Stratos-7B and OlympicCoder-7B on these metrics. The training corpus is the OpenCodeReasoning dataset, which comprises 736k samples of competitive programming questions paired with DeepSeek-R1 generated responses.

Intended Use Cases

  • Code Generation with Reasoning: Ideal for tasks requiring logical deduction and problem-solving within code.
  • Competitive Programming: Designed to tackle complex coding challenges effectively.
  • LLM Development and Research: Intended for developers and researchers building advanced language models, particularly those focused on code intelligence.

This model is optimized for NVIDIA GPU-accelerated systems, leveraging hardware like NVIDIA Ampere and Hopper architectures for efficient inference.
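In a GPU-accelerated deployment, the model is typically served behind an OpenAI-compatible endpoint (for example via vLLM's `vllm serve`, or a hosted provider). The sketch below builds such a request with only the standard library; the base URL and API key are placeholders, and the sampling values are illustrative assumptions.

```python
import json
import urllib.request


def make_request(base_url: str, api_key: str, question: str) -> urllib.request.Request:
    """Build a chat-completions request for an OpenAI-compatible server.

    The endpoint path /v1/chat/completions is the standard OpenAI-compatible
    route; base_url and api_key are placeholders for your deployment.
    """
    payload = {
        "model": "nvidia/OpenCodeReasoning-Nemotron-7B",
        "messages": [{"role": "user", "content": question}],
        "max_tokens": 2048,
        "temperature": 0.6,  # illustrative value, not a vendor recommendation
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Sending the request is then a single `urllib.request.urlopen(req)` call against a running server.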

Popular Sampler Settings

Featherless users most commonly tune the following sampler parameters for this model: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
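The sampler parameters above can be bundled into a request body for an OpenAI-compatible server. Note that `frequency_penalty` and `presence_penalty` are standard OpenAI fields, while `top_k`, `repetition_penalty`, and `min_p` are extensions accepted by some servers (e.g. vLLM); all default values below are illustrative assumptions, not recommendations.

```python
def sampler_settings(
    temperature: float = 0.6,
    top_p: float = 0.95,
    top_k: int = 40,
    frequency_penalty: float = 0.0,
    presence_penalty: float = 0.0,
    repetition_penalty: float = 1.0,
    min_p: float = 0.0,
) -> dict:
    """Return the sampler parameters as a dict ready to merge into a
    chat-completions request body. Values here are illustrative only."""
    return {
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
        "repetition_penalty": repetition_penalty,
        "min_p": min_p,
    }


# Merge into a request payload, overriding whichever knobs you tune:
payload = {"model": "nvidia/OpenCodeReasoning-Nemotron-7B", **sampler_settings(temperature=0.2)}
```

Keeping the settings in one helper makes it easy to A/B different configurations against the same prompts.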