OpenCodeReasoning-Nemotron-1.1-7B is a 7.6 billion parameter large language model developed by NVIDIA, derived from Qwen2.5-7B-Instruct. It is specifically post-trained for reasoning in code generation tasks, supporting a context length of up to 65,536 tokens. This model demonstrates strong performance in competitive programming benchmarks, making it suitable for advanced code-related reasoning applications.
Overview
NVIDIA's OpenCodeReasoning-Nemotron-1.1-7B builds on the Qwen2.5-7B-Instruct architecture and is post-trained specifically for code generation with an emphasis on reasoning. It supports a context length of up to 65,536 tokens and is released for both commercial and non-commercial use.
Key Capabilities & Performance
- Specialized Code Reasoning: This model excels in competitive programming tasks, demonstrating strong reasoning capabilities for code generation.
- High Performance: Achieves a Pass@1 score of 55.5 on the LiveCodeBench (v5) evaluation, outperforming other distilled 7B-scale models such as OlympicCoder-7B (40.9) and OpenThinker-7B (25.5).
- Extended Context Window: Supports a context length of up to 65,536 tokens, beneficial for complex coding problems requiring extensive context.
- Optimized for NVIDIA Hardware: Designed to run efficiently on NVIDIA GPU-accelerated systems, leveraging software stacks such as CUDA for faster inference.
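For readers unfamiliar with the Pass@1 metric cited above: it estimates the probability that a single sampled generation solves a problem. A minimal sketch of the standard unbiased pass@k estimator is below; this illustrates the metric in general, not NVIDIA's specific evaluation harness.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: the probability that at least one of k
    samples, drawn without replacement from n generations of which c are
    correct, passes the tests."""
    if n - c < k:
        # Fewer incorrect samples than k: at least one draw must be correct.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k = 1 this reduces to the fraction of correct samples:
# pass_at_k(10, 4, 1) -> 0.4
```

A benchmark score like 55.5 Pass@1 thus means that, averaged over problems, a single sample solves the task about 55.5% of the time.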
Use Cases
- Competitive Programming: Ideal for generating solutions to complex coding challenges.
- Code Generation: Suitable for developers and researchers requiring robust code generation capabilities with strong reasoning.
- LLM Development: Intended for developers and researchers building and experimenting with large language models, particularly in the code domain.
Training & Architecture
The model was trained on the OpenCodeReasoning dataset, which comprises competitive programming questions and responses generated by DeepSeek-R1-0528. It utilizes a dense decoder-only Transformer architecture based on Qwen2.5-7B-Instruct.
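For developers who want to try the model, a minimal inference sketch using the Hugging Face `transformers` chat-template API is shown below. The model ID `nvidia/OpenCodeReasoning-Nemotron-1.1-7B` and the `build_messages` helper are assumptions for illustration; consult the official model card for the exact checkpoint name and recommended prompt format.

```python
# Assumed Hugging Face model ID; verify against the official model card.
MODEL_ID = "nvidia/OpenCodeReasoning-Nemotron-1.1-7B"

def build_messages(problem: str) -> list[dict]:
    # Hypothetical helper: wraps a competitive-programming problem
    # as a single-turn chat prompt.
    return [{"role": "user", "content": problem}]

if __name__ == "__main__":
    # Requires a GPU-equipped machine with transformers and torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype="auto"
    )
    messages = build_messages(
        "Given an array of integers, return the length of its "
        "longest strictly increasing subsequence."
    )
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=2048)
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The long `max_new_tokens` budget reflects that reasoning-tuned models emit an extended chain of thought before the final code, which the 65,536-token context window is sized to accommodate.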