eventhorizon28/cadforge-grpo-Qwen3-1.7B

Text Generation · Concurrency Cost: 1 · Model Size: 2B · Quant: BF16 · Ctx Length: 32k · Published: Apr 26, 2026 · Architecture: Transformer

The eventhorizon28/cadforge-grpo-Qwen3-1.7B model is a 1.7 billion parameter language model, fine-tuned from Qwen/Qwen3-1.7B using the GRPO method introduced in the DeepSeekMath paper. It specializes in mathematical reasoning and is designed for tasks requiring robust mathematical problem-solving, building on the Qwen3 architecture with a 32,768-token context length.


Model Overview

The eventhorizon28/cadforge-grpo-Qwen3-1.7B is a 1.7 billion parameter language model, fine-tuned from the base Qwen/Qwen3-1.7B model. The fine-tuning used the TRL library with the GRPO (Group Relative Policy Optimization) method.

Key Capabilities and Training

The primary differentiator of this model is its training with GRPO, a reinforcement-learning technique introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models". This indicates a strong focus on enhancing the model's ability to handle complex mathematical reasoning tasks. The training stack comprised TRL 1.2.0, Transformers 5.7.0.dev0, PyTorch 2.8.0, Datasets 4.8.4, and Tokenizers 0.22.2.
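The central idea of GRPO, as described in the DeepSeekMath paper, is to sample a group of completions per prompt and normalize each completion's reward against the group's own statistics, avoiding a separate learned value model. A minimal sketch of that group-relative advantage, assuming a scalar reward per sampled completion (function names are illustrative, not taken from the TRL implementation):

```python
# Sketch of the group-relative advantage used by GRPO
# (Group Relative Policy Optimization, DeepSeekMath).
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its group's mean and std.

    GRPO samples several completions for the same prompt and uses the
    group statistics as the baseline, instead of a critic network.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: four sampled answers to one math prompt, scored 0/1 for correctness.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
# Correct answers get a positive advantage, incorrect ones a negative advantage.
```

These advantages then weight the policy-gradient update on each completion's tokens, so the model is pushed toward answers that outperform their own sampling group.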

Use Cases

Given its specialized training with GRPO for mathematical reasoning, this model is particularly well-suited for:

  • Mathematical problem-solving: Tasks that require logical deduction and numerical computation.
  • Scientific and engineering applications: Where precise mathematical understanding is crucial.
  • Educational tools: For generating explanations or solutions to mathematical queries.

Users interested in leveraging a compact yet capable model for mathematical reasoning should consider this fine-tuned Qwen3 variant.
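A hypothetical loading sketch with the Hugging Face transformers library, for readers who want to try the checkpoint (the prompt-wrapping helper is illustrative and not part of the model card; only the model id comes from this page):

```python
# Illustrative usage sketch for eventhorizon28/cadforge-grpo-Qwen3-1.7B.
# The helper below is an assumption, not a documented prompt format.

def build_math_prompt(question: str) -> str:
    """Wrap a math question in a simple step-by-step instruction (illustrative)."""
    return (
        "Solve the following problem step by step.\n\n"
        f"Problem: {question}\nSolution:"
    )

def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    """Download the checkpoint and generate an answer (needs network and a GPU/CPU with enough memory)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "eventhorizon28/cadforge-grpo-Qwen3-1.7B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

    inputs = tokenizer(build_math_prompt(question), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Loading in bfloat16 matches the BF16 quantization listed above; call `generate_answer("...")` with a concrete problem to run the model.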