Name: longtermrisk/Qwen3-8B-reward-hacks-full API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: longtermrisk

Overview

The longtermrisk/Qwen3-8B-reward-hacks-full is an 8 billion parameter language model developed by longtermrisk. It is a fine-tuned variant of the Qwen3 architecture, specifically optimized for efficient training.

Key Characteristics

Base Model: Finetuned from unsloth/Qwen3-8B.
Training Efficiency: Achieved 2x faster training speeds by utilizing Unsloth and Huggingface's TRL library.
Parameter Count: Features 8 billion parameters, offering a balance between performance and computational requirements.
Context Length: Supports a context length of 32768 tokens.

Use Cases

This model is suitable for applications requiring a Qwen3-based language model that benefits from an efficiently trained foundation. Its 8B parameter size makes it versatile for various natural language processing tasks where faster fine-tuning is a priority.

Overview

Overview

Key Characteristics

Use Cases

Full Model Card (README)