Model Overview
Jeremmmyyyyy/Qwen-poetry-logprob-no-norm-v3 is a 2-billion-parameter language model fine-tuned from the Qwen3-1.7B base model. Its training procedure is built on the Transformer Reinforcement Learning (TRL) framework.
Key Differentiator: GRPO Training
This model's primary distinction lies in its training methodology. It was fine-tuned using GRPO (Group Relative Policy Optimization), a technique introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models". GRPO dispenses with a learned value critic: for each prompt it samples a group of completions, scores them with a reward function, and computes each completion's advantage relative to the group's mean reward. This method is specifically designed to improve a model's capabilities in complex reasoning tasks, particularly within mathematical domains.
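The group-relative scoring at the heart of GRPO can be illustrated in a few lines. This is a minimal sketch of the advantage computation only (the policy-gradient update, KL penalty, and clipping used by TRL's trainer are omitted); the function name and epsilon value are illustrative, not taken from the TRL source.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize the rewards of a group of
    completions sampled for the same prompt. No value critic is needed;
    the group mean serves as the baseline. (Illustrative sketch.)"""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: three completions for one prompt, scored by a reward function.
# The best-scoring completion gets a positive advantage, the worst a
# negative one, so the policy is pushed toward the better samples.
advs = group_relative_advantages([1.0, 2.0, 3.0])
```

Because the baseline is the group's own mean, only relative quality within each sampled group matters, which is what makes a simple scalar reward (e.g. answer correctness) usable without training a separate value model.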
Technical Specifications
- Base Model: Qwen/Qwen3-1.7B
- Parameter Count: 2 billion
- Context Length: 32768 tokens
- Training Frameworks: TRL (version 0.17.0), Transformers (version 4.51.3), PyTorch (version 2.6.0), Datasets (version 3.5.0), Tokenizers (version 0.21.1).
Potential Use Cases
Given its GRPO-based training, this model is likely well-suited for applications requiring:
- Mathematical problem-solving: Tasks involving arithmetic, algebra, calculus, or other mathematical reasoning.
- Logical deduction: Scenarios where structured, step-by-step reasoning is crucial.
- Scientific computing assistance: Generating or interpreting mathematical expressions and solutions.
Developers can quickly integrate this model using the Hugging Face pipeline for text generation tasks.
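A minimal sketch of that integration, assuming a standard text-generation setup (the prompt and generation parameters here are illustrative; loading the model requires downloading its weights from the Hugging Face Hub):

```python
from transformers import pipeline

# Load the model card's checkpoint into a text-generation pipeline.
generator = pipeline(
    "text-generation",
    model="Jeremmmyyyyy/Qwen-poetry-logprob-no-norm-v3",
)

# Illustrative prompt playing to the model's reasoning-focused training.
out = generator(
    "Solve step by step: what is 12 * 7?",
    max_new_tokens=128,
)
print(out[0]["generated_text"])
```

For more control over decoding (sampling temperature, logprobs, batching), the model can also be loaded directly via `AutoModelForCausalLM` and `AutoTokenizer` instead of the pipeline wrapper.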