Name: longtermrisk/Qwen3-8B-reward-hacks-first-third API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: longtermrisk

Model Overview

The longtermrisk/Qwen3-8B-reward-hacks-first-third is an 8 billion parameter language model based on the Qwen3 architecture. Developed by longtermrisk, this model was fine-tuned from unsloth/Qwen3-8B.

Key Characteristics

Architecture: Qwen3-8B, a robust base for various NLP tasks.
Training Efficiency: Fine-tuned using Unsloth and Huggingface's TRL library, resulting in a 2x faster training process compared to standard methods.
Context Length: Supports a context length of 32768 tokens, enabling processing of longer inputs and generating more coherent, extended outputs.

Use Cases

This model is suitable for applications requiring a capable 8B parameter model with the efficiency benefits of Unsloth's training optimizations. Its Qwen3 foundation makes it versatile for tasks such as text generation, summarization, question answering, and more, particularly where faster fine-tuning cycles are advantageous.

Overview

Model Overview

Key Characteristics

Use Cases

Full Model Card (README)