Name: wAI-org/swerl-qwen3-8b-endless-terminals-grpo API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: wAI-org

Model Overview

The wAI-org/swerl-qwen3-8b-endless-terminals-grpo is an 8 billion parameter language model, representing a specific checkpoint (Step 500) from a Generative Reinforcement Learning with Policy Optimization (GRPO) training run. It is built upon the hamishivi/sft_qwen3_8b_our_sft base model.

Key Characteristics

Base Model: Derived from hamishivi/sft_qwen3_8b_our_sft.
Training Method: Result of a GRPO run, specifically hamishivi/agent-task-endless-terminals.
Development Stage: This is a training checkpoint, not a final release model.

Intended Use

Internal Evaluation: Primarily designed for internal assessment of its performance and capabilities.
Continuation Experiments: Suitable for further research and development, serving as a starting point for new experiments.

This model is a specialized artifact from an ongoing research project, focused on agent tasks within an 'endless terminals' context, and is not intended for broad, general-purpose applications.

Overview

Model Overview

Key Characteristics

Intended Use

Full Model Card (README)