hamishivi/Nemotron-Research-Reasoning-Qwen-1.5B-v2-RLVE
Nemotron Research Reasoning Qwen 1.5B v2 RLVE
This model, developed by hamishivi, is a 1.5 billion parameter language model built on the NVIDIA Nemotron-Research-Reasoning-Qwen-1.5B base. Its key differentiator is RLVE (Reinforcement Learning with Verifiable Environments), a training method designed to significantly improve reasoning and problem-solving capability. The model supports a context length of 32,768 tokens.
Key Capabilities
- Enhanced Reasoning: Demonstrates improved performance on challenging reasoning benchmarks such as AIME 2024/2025, OMEGA-500, and OlympiadBench.
- Problem Solving: Shows better results on BBEH and LiveCodeBench-v6, indicating stronger problem-solving skills.
- RLVE Optimization: Trained with reinforcement learning against environments whose rewards can be checked programmatically, yielding stronger analytical and logical-deduction ability than the base model.
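The "verifiable environments" in RLVE provide rewards that can be checked programmatically rather than scored by a learned judge. A minimal sketch of that idea for a math environment is below; the function names and the `\boxed{...}` answer convention are illustrative assumptions, not details taken from the RLVE paper.

```python
import re

def extract_final_answer(completion: str):
    # Look for a LaTeX-style \boxed{...} final answer, a common
    # convention in math-reasoning model outputs (an assumption here).
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    return match.group(1).strip() if match else None

def verifiable_reward(completion: str, reference: str) -> float:
    # Binary reward: 1.0 if the extracted answer exactly matches the
    # reference string, else 0.0. No learned reward model is needed.
    answer = extract_final_answer(completion)
    return 1.0 if answer == reference.strip() else 0.0
```

Because the reward is computed by exact checking, it cannot be gamed the way a learned reward model can, which is the appeal of verifiable environments for reasoning-focused RL.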
Good for
- Complex Reasoning Tasks: Ideal for applications requiring advanced logical inference, mathematical problem-solving, and analytical thinking.
- Research and Development: Suitable for researchers exploring reinforcement learning techniques in language models and verifiable environments.
- Benchmarking: Can be used as a strong baseline or comparison model for evaluating new reasoning-focused LLM developments.
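As a usage sketch, the model can be loaded with the Hugging Face transformers library. Only the model id comes from this card; the prompt format and generation settings below are assumptions.

```python
MODEL_ID = "hamishivi/Nemotron-Research-Reasoning-Qwen-1.5B-v2-RLVE"

def build_prompt(question: str) -> str:
    # Simple step-by-step prompt; the exact format the model was
    # trained with is not documented on this card.
    return f"Solve the following problem step by step.\n\nProblem: {question}\nSolution:"

def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    # Imported lazily so build_prompt works without the heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

# Example (downloads the ~1.5B-parameter weights on first use):
# print(generate_answer("What is 12 * 34?"))
```

Given the 32,768-token context length, long multi-step problems can be placed directly in the prompt without truncation in most cases.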
For more in-depth information on the RLVE method and training details, refer to the RLVE Paper and the RLVE GitHub Repository.