lihaoxin2020/qwen3-4b-refiner-gpt54-rubric-v3-2-rl-lr5e-6-step100
The lihaoxin2020/qwen3-4b-refiner-gpt54-rubric-v3-2-rl-lr5e-6-step100 is a 4-billion-parameter language model developed by lihaoxin2020 and based on the Qwen3 architecture. It is a GRPO checkpoint refined from an earlier version and optimized for tasks related to the GPT54 rubric. With a context length of 32768 tokens, it is designed for applications requiring nuanced evaluation or generation aligned with specific rubric criteria.
Model Overview
The lihaoxin2020/qwen3-4b-refiner-gpt54-rubric-v3-2-rl-lr5e-6-step100 is a 4-billion-parameter language model built upon the Qwen3 architecture. It represents a specific checkpoint from a training run, having undergone further refinement using GRPO (Group Relative Policy Optimization), a reinforcement-learning method.
Key Characteristics
- Base Model: Qwen3-4B architecture.
- Refinement: Trained as a GRPO checkpoint, refined from lihaoxin2020/qwen3-4b-refiner-gpt54-ep2.
- Optimization Target: Training is geared toward performance aligned with the "GPT54 rubric," suggesting a specialization in tasks or evaluations that adhere to this specific set of criteria.
- Context Length: Supports a substantial context window of 32768 tokens.
Intended Use Cases
This model is particularly suited for applications where adherence to a specific rubric or set of evaluation guidelines (like the GPT54 rubric) is critical. It can be beneficial for:
- Generating responses that conform to predefined quality or style standards.
- Refining existing text to better meet specific criteria.
- Tasks requiring nuanced understanding and application of a rubric for content creation or evaluation.
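As a rough illustration of the refinement use case, the sketch below formats a rubric-conditioned rewrite request and runs it through the model with the `transformers` library. The prompt layout, the example rubric criteria, and the helper names (`build_rubric_prompt`, `generate_refinement`) are hypothetical, not part of the model's documented interface; the actual GPT54 rubric contents are not published here.

```python
def build_rubric_prompt(rubric, text):
    """Assemble a refinement request from rubric criteria and a draft text.

    The prompt layout is a hypothetical sketch, not a documented format.
    """
    criteria = "\n".join(f"- {c}" for c in rubric)
    return (
        "Rewrite the text below so that it satisfies every rubric criterion.\n\n"
        f"Rubric:\n{criteria}\n\n"
        f"Text:\n{text}"
    )


def generate_refinement(
    prompt,
    model_id="lihaoxin2020/qwen3-4b-refiner-gpt54-rubric-v3-2-rl-lr5e-6-step100",
    max_new_tokens=512,
):
    """Run the prompt through the model. Requires `transformers` and `torch`,
    and downloads ~4B parameters of weights on first use."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)


if __name__ == "__main__":
    prompt = build_rubric_prompt(
        ["Answers the question directly", "Cites evidence for each claim"],
        "The experiment worked well.",
    )
    print(generate_refinement(prompt))
```

The same pattern applies to evaluation-style tasks: swap the rewrite instruction for a scoring instruction while keeping the rubric block in the prompt.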