Name: SeongryongJung/Qwen3-4B-Chemistry-GRPO API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: SeongryongJung

Model Overview

SeongryongJung/Qwen3-4B-Chemistry-GRPO is a specialized 4 billion parameter language model, fine-tuned from the Qwen/Qwen3-4B base model. Its development focused on enhancing performance in chemistry-related tasks through the application of the GRPO (Generalized Reinforcement Learning from Policy Optimization) method on the chemistry split of a dataset.

Key Capabilities & Performance

This model is specifically optimized for chemistry applications. Its validation performance was measured using the val-aux/sciknoweval/reward/mean@16 metric, achieving a peak of 66.58% at step 100. This indicates its proficiency in handling complex chemistry-specific queries and tasks. The training process involved 100 steps, with performance steadily improving throughout.

Use Cases

Chemistry-specific problem solving: Ideal for tasks requiring deep knowledge in chemistry.
Research and development: Can assist in generating or analyzing chemical information.
Educational tools: Potentially useful for creating chemistry-focused learning resources.

Technical Details

The model weights are the final global_step_100/actor checkpoint, converted from VERL FSDP shards to the Hugging Face format. The fine-tuning process was tracked via a W&B run (run-20260629_124519-qs487q2t).

Overview

Model Overview

Key Capabilities & Performance

Use Cases

Technical Details

Full Model Card (README)