Nalandadata/nalanda-qwen-7b-grpo
Nalandadata/nalanda-qwen-7b-grpo is a 7.6 billion parameter Qwen 2.5-based instruction-tuned language model developed by Nalandadata. It is specifically fine-tuned for solving Indian competitive exam questions (JEE/NEET) in Physics, Chemistry, Mathematics, and Biology. Utilizing Group Relative Policy Optimization (GRPO), this model achieves significant accuracy improvements on these specialized tasks while preserving general reasoning capabilities. It is optimized for accurate problem-solving and step-by-step reasoning in STEM subjects relevant to Indian curricula.
Nalanda Qwen 2.5 7B GRPO Overview
Nalandadata/nalanda-qwen-7b-grpo is a 7.6 billion parameter model based on Qwen 2.5-7B-Instruct, specifically fine-tuned to excel at solving questions from Indian competitive exams like JEE Mains, JEE Advanced, and NEET UG. This model demonstrates strong performance across Physics, Chemistry, Mathematics, and Biology.
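Since the model is built on Qwen 2.5-7B-Instruct, it can be queried through the standard chat-message interface. Below is a minimal sketch of building an exam-style MCQ prompt; the system prompt wording and the "Answer: (X)" instruction are illustrative assumptions, not part of the released model card.

```python
# Sketch: build a JEE/NEET-style MCQ chat prompt for the model.
# The system prompt and answer-format instruction are illustrative assumptions.

def build_mcq_prompt(question: str, options: list[str]) -> list[dict]:
    """Return a chat-message list suitable for tokenizer.apply_chat_template."""
    labels = ["A", "B", "C", "D"]
    option_block = "\n".join(f"({l}) {o}" for l, o in zip(labels, options))
    user_msg = (
        f"{question}\n{option_block}\n"
        "Reason step by step, then state the final answer as 'Answer: (X)'."
    )
    return [
        {"role": "system", "content": "You are an expert JEE/NEET tutor."},  # assumed
        {"role": "user", "content": user_msg},
    ]

messages = build_mcq_prompt(
    "A body of mass 2 kg moves at 3 m/s. What is its kinetic energy?",
    ["6 J", "9 J", "18 J", "3 J"],
)
# These messages can then be passed to the Nalandadata/nalanda-qwen-7b-grpo
# tokenizer via apply_chat_template(...) and generated from as usual.
```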
Key Capabilities & Training
This model was developed using a two-stage training pipeline:
- Stage 1: Light Supervised Fine-Tuning (SFT): A brief SFT phase (200 steps) using a mix of JEE/NEET questions and general instruction data (SlimOrca) to introduce domain-specific vocabulary and question formats without overwriting general knowledge.
- Stage 2: Group Relative Policy Optimization (GRPO): A more extensive training phase (600 steps) using 10,000 MCQs with verified answers. GRPO rewards the model for arriving at correct answers through its own reasoning, combining multiple reward functions for correctness, format compliance, and reasoning quality. This method preserves general reasoning abilities while enhancing specialized performance, unlike standard SFT, which can lead to catastrophic forgetting.
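The GRPO stage above combines several reward signals. The exact reward functions used by Nalandadata are not published; a minimal illustrative sketch of the three categories named (correctness, format compliance, reasoning quality) might look like this:

```python
import re

# Illustrative GRPO-style reward functions (assumptions, not the released
# training code). Each takes a model completion and returns a scalar reward.

def correctness_reward(completion: str, gold: str) -> float:
    """+1 if the stated final answer matches the verified answer letter."""
    m = re.search(r"Answer:\s*\(([A-D])\)", completion)
    return 1.0 if m and m.group(1) == gold else 0.0

def format_reward(completion: str) -> float:
    """Small bonus for emitting the expected 'Answer: (X)' pattern at all."""
    return 0.25 if re.search(r"Answer:\s*\([A-D]\)", completion) else 0.0

def reasoning_reward(completion: str) -> float:
    """Crude proxy for reasoning quality: reward multi-step work, capped."""
    steps = [line for line in completion.splitlines() if line.strip()]
    return min(len(steps) / 10.0, 0.5)

def total_reward(completion: str, gold: str) -> float:
    return (correctness_reward(completion, gold)
            + format_reward(completion)
            + reasoning_reward(completion))

worked = "KE = 1/2 m v^2\n= 0.5 * 2 * 9 = 9 J\nAnswer: (B)"
bare = "(C)"
```

In GRPO, a group of completions is sampled per question, each is scored with rewards like these, and advantages are computed relative to the group mean, so the model is reinforced only when its own reasoning beats its typical attempt.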
Performance Highlights
The model shows substantial improvements over the Qwen 2.5 7B baseline on held-out JEE/NEET MCQs:
- Overall Accuracy: 69.6% (a +9.1 percentage point improvement).
- Subject-specific Accuracy: Physics (+14.0pp to 65.0%), Chemistry (+10.0pp to 71.5%), Mathematics (+8.5pp to 64.5%), and Biology (+4.0pp to 77.5%).
Crucially, public benchmark performance on tasks like GSM8K, ARC-Challenge, and MMLU-Physics/Chemistry is preserved or even slightly improved, indicating no degradation of general reasoning.
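The per-subject figures are internally consistent with the headline numbers: assuming the four subjects are weighted equally, the tuned accuracies and improvements above average out to the reported 69.6% overall and +9.1pp. A quick cross-check (the equal weighting is an assumption):

```python
# Per-subject tuned accuracy (%) and improvement (percentage points)
# as reported in the evaluation above.
results = {
    "Physics":     (65.0, 14.0),
    "Chemistry":   (71.5, 10.0),
    "Mathematics": (64.5,  8.5),
    "Biology":     (77.5,  4.0),
}

# The Qwen 2.5 7B baseline per subject is tuned accuracy minus the improvement.
baselines = {s: acc - pp for s, (acc, pp) in results.items()}

# Assuming equal subject weighting, the means reproduce the headline figures.
overall_tuned = sum(acc for acc, _ in results.values()) / len(results)
overall_delta = sum(pp for _, pp in results.values()) / len(results)

print(round(overall_tuned, 1), round(overall_delta, 1))  # 69.6 9.1
```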
Ideal Use Cases
This model is particularly well-suited for applications requiring accurate and reasoned solutions to complex STEM problems, especially those formatted like Indian competitive exams. It can be used for educational tools, automated tutoring systems, or content generation related to these specific exam types.
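For automated tutoring or grading pipelines, the model's final choice must be parsed out of its step-by-step output. A small sketch, assuming the model follows an "Answer: (X)" convention (an assumption about the output format, not documented by the model card):

```python
import re

def extract_answer(completion: str) -> "str | None":
    """Pull the final MCQ choice (A-D) from a step-by-step completion.

    Assumes the completion ends with a line like 'Answer: (B)'; returns
    None when no such pattern is found so callers can fall back to
    re-prompting instead of silently grading a missing answer.
    """
    matches = re.findall(r"Answer:\s*\(?([A-D])\)?", completion)
    return matches[-1] if matches else None  # last match wins if restated

sample = (
    "Work done W = F * d = 10 N * 2 m = 20 J.\n"
    "Answer: (C)"
)
print(extract_answer(sample))  # C
```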