Name: wAI-org/swerl-qwen3-8b-termigen-grpo API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: wAI-org

Model Overview

The wAI-org/swerl-qwen3-8b-termigen-grpo is an 8 billion parameter language model, representing a final checkpoint from a Gradient-based Reinforcement Learning with Policy Optimization (GRPO) training run. It is built upon the hamishivi/sft_qwen3_8b_our_sft base model.

Key Characteristics

Base Model: Derived from hamishivi/sft_qwen3_8b_our_sft.
Training Objective: The model's training focused on "agent-task-termigen" within a GRPO framework, suggesting a specialization in generating terminology or actions relevant to agent-based tasks.
Development Stage: This checkpoint is explicitly designated for internal evaluation and further experimental continuation, indicating it is not a production-ready release but a developmental artifact.
Training Completion: The training for this specific checkpoint was completed on May 18, 2026.

Intended Use

This model is primarily intended for:

Internal Evaluation: Assessing the performance and capabilities of the GRPO training run.
Continuation Experiments: Serving as a starting point for further research and development in agent-task terminology generation or related areas.

Overview

Model Overview

Key Characteristics

Intended Use

Full Model Card (README)