m-a-p/CriticLeanGPT-Qwen3-14B-RL
m-a-p/CriticLeanGPT-Qwen3-14B-RL is a 14-billion-parameter, Qwen3-based large language model developed by m-a-p and fine-tuned with Reinforcement Learning (RL) on the CriticLean_4K dataset. The model is optimized for mathematical formalization and reasoning tasks, leveraging a dataset designed for critic-guided reinforcement learning. It supports a 32768-token context length, making it suitable for complex problem-solving in math and code domains.
What is CriticLeanGPT-Qwen3-14B-RL?
m-a-p/CriticLeanGPT-Qwen3-14B-RL is a 14 billion parameter language model built upon the Qwen3 architecture. It has been fine-tuned using Reinforcement Learning (RL) with the CriticLean_4K dataset, which is a subset of the larger CriticLeanInstruct dataset suite. This RL approach aims to align the model for improved performance, particularly in areas requiring critical evaluation and mathematical reasoning.
Key Characteristics
- Base Model: Qwen3, a powerful large language model.
- Parameter Count: 14 billion parameters.
- Context Length: Supports a substantial context window of 32768 tokens.
- Training Methodology: Underwent Reinforcement Learning (RL) using the CriticLean_4K dataset, which is specifically designed for critic-guided learning.
- Dataset Integration: The CriticLeanInstruct dataset, used for training, incorporates samples from OpenR1-Math-220k and OpenThoughts-114k-Code_decontaminated, indicating a focus on mathematical and coding capabilities.
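The characteristics above fit the standard Hugging Face `transformers` chat workflow. The sketch below is a minimal, hedged example of how such a model might be prompted for formalization: the system prompt and the helper names (`build_messages`, `generate`) are illustrative assumptions, not part of the model card; only the model ID and context length come from the text above.

```python
# Minimal sketch (assumptions noted in comments); not an official usage recipe.

MODEL_ID = "m-a-p/CriticLeanGPT-Qwen3-14B-RL"
MAX_CONTEXT = 32768  # context window stated in the model card


def build_messages(problem: str) -> list[dict]:
    """Wrap a natural-language math problem in a chat-style message list.

    The system prompt is an illustrative assumption, not a documented default.
    """
    system = (
        "You are a mathematical formalization assistant. "
        "Translate the problem into a formal Lean 4 statement."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": problem},
    ]


def generate(problem: str, max_new_tokens: int = 1024) -> str:
    """Load the model and generate a formalization.

    Requires the model weights and (realistically, for 14B parameters) a GPU;
    nothing heavy runs at import time.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    # Render the chat messages with the model's own chat template.
    prompt = tokenizer.apply_chat_template(
        build_messages(problem), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )


msgs = build_messages("Show that the sum of two even integers is even.")
print(msgs[1]["role"])  # → user
```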
Ideal Use Cases
This model is particularly well-suited for applications requiring:
- Mathematical Formalization: Excels in tasks related to mathematical reasoning and problem-solving due to its RL training on math-centric data.
- Code-related Tasks: Benefits from the code data included in its training, making it suitable for code generation and code understanding.
- Research in RL-based LLM Alignment: Demonstrates an effective application of critic-guided reinforcement learning for model alignment.
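As a concrete illustration of the formalization use case, the snippet below shows the kind of Lean 4 statement such a model is typically asked to produce. The theorem and proof are our own example in Mathlib-style notation, not output from this model, and the exact Mathlib API may differ.

```lean
-- Illustrative formalization target: "the sum of two even integers is even".
-- Uses Mathlib's `Even` (∃ r, a = r + r); `omega` closes the linear goal.
theorem sum_of_evens {a b : ℤ} (ha : Even a) (hb : Even b) : Even (a + b) := by
  obtain ⟨m, hm⟩ := ha
  obtain ⟨n, hn⟩ := hb
  exact ⟨m + n, by omega⟩
```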