Name: wAI-org/swerl-qwen3-8b-tmax-15k-grpo API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: wAI-org

Model Overview

wAI-org/swerl-qwen3-8b-tmax-15k-grpo is an 8-billion-parameter language model built on the Qwen3 architecture. This version is a specific checkpoint (step 500) from the hamishivi/swerl-tmax-15k GRPO (Group Relative Policy Optimization) run.
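For readers unfamiliar with GRPO, the core idea is that each sampled response is scored relative to the other responses drawn for the same prompt, rather than against a learned value function. The sketch below illustrates that normalization step only. It is a generic illustration, not the configuration of this particular run: the function name, the use of the population standard deviation, and the `eps` constant are assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses to the same prompt.

    Hypothetical helper for illustration: each advantage is the reward's
    distance from the group mean, divided by the group's standard deviation.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    # eps keeps the division defined when every reward in the group is equal
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four responses to one prompt, two rewarded and two not
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Responses that score above the group mean receive positive advantages and those below receive negative ones; when every response in a group earns the same reward, all advantages are zero and the group contributes no learning signal.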

Key Characteristics

Base Model: It is built on hamishivi/sft_qwen3_8b_our_sft, a supervised fine-tuned (SFT) Qwen3 8B model.
Training Origin: This checkpoint comes from a GRPO run, so it has undergone reinforcement-learning-based optimization on top of the SFT base.
Context Length: The model supports a context length of 32,768 tokens, allowing for processing of substantial input sequences.
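As a rough illustration of what the 32,768-token context length means in practice, the sketch below checks whether a request fits the window. It assumes the window is shared by the prompt and the generated tokens, as is typical for decoder-only models, and that token counts come from the model's own tokenizer. The function names are hypothetical and not part of any API.

```python
CONTEXT_LENGTH = 32_768  # context length stated in the model card


def max_new_tokens_allowed(prompt_tokens: int,
                           context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # A prompt longer than the window leaves no room for generation
    return max(context_length - prompt_tokens, 0)


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """Check that prompt plus requested generation stays within the window."""
    return prompt_tokens + max_new_tokens <= context_length


# A 30,000-token prompt leaves 2,768 tokens for the response
remaining = max_new_tokens_allowed(30_000)
```

A check like this is useful before sending long inputs, since a prompt that nearly fills the window leaves little room for the model's output.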

Intended Use

This model checkpoint is primarily designated for:

Internal Evaluation: Assessing performance and characteristics within a research context.
Continuation Experiments: Serving as a starting point for further training, fine-tuning, or experimental development.
