crestf411/Q2.5-32B-Slush

TEXT GENERATION · Concurrency Cost: 2 · Model Size: 32.8B · Quant: FP8 · Ctx Length: 32k · Published: Nov 26, 2024 · Architecture: Transformer

crestf411/Q2.5-32B-Slush is a 32.8 billion parameter model based on Qwen/Qwen2.5-32B, developed by crestf411. The model is designed to enhance creativity, writing, and roleplaying through a two-stage LoRA training process: continued pretraining on the base model followed by a roleplay-focused fine-tuning stage. It excels at generating creative text and sustaining narrative-driven interactions, particularly in roleplaying scenarios.


Model Overview

crestf411/Q2.5-32B-Slush is a 32.8 billion parameter model built upon the Qwen/Qwen2.5-32B architecture. It was trained in two stages: a continued pretraining stage to boost creativity and writing, followed by a fine-tuning stage to further enhance roleplaying capabilities. The model is particularly optimized for generating engaging and creative narrative content.
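The model can also be run locally. Below is a minimal loading-and-prompting sketch using Hugging Face transformers, assuming the weights are published on the Hub under this repository name and that a recent transformers release (with min_p sampling support) plus accelerate are installed; the chat messages are placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "crestf411/Q2.5-32B-Slush"

# Load tokenizer and weights; device_map="auto" requires the accelerate package.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Qwen2.5-based models ship a chat template; apply_chat_template builds the prompt.
messages = [
    {"role": "system", "content": "You are the narrator of a collaborative fantasy story."},
    {"role": "user", "content": "Open the scene at the gates of a snowbound mountain keep."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# temperature/min_p follow the tested settings listed under Usage Considerations;
# DRY and XTC are not built into transformers.generate (see the sketch further below).
output_ids = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.0,
    min_p=0.1,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```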

Key Capabilities

  • Enhanced Creativity and Writing: The first training stage focuses on improving the model's ability to generate imaginative and diverse text.
  • Strong Roleplaying Performance: Fine-tuned specifically for roleplaying scenarios, aiming to provide more immersive and consistent character interactions.
  • High Context Length: The underlying Qwen2.5-32B architecture supports a context window of up to 131072 tokens (this deployment is configured for a 32k context), allowing for extended and complex conversations or narratives.
  • LoRA Dropout Training: Utilizes high LoRA dropout (0.5) during training, which can contribute to better generalization and creativity.

Training Details

The model's development involved two distinct stages:

  • Stage 1 (Continued Pretraining): A LoRA was trained against Qwen/Qwen2.5-32B and then merged into Qwen/Qwen2.5-32B-Instruct. This stage used LoRA dropout 0.5, rank 32, alpha 64, and LoRA+ with an LR ratio of 15, over 1 epoch at an 8192-token context size (a configuration sketch follows this list).
  • Stage 2 (Fine-tuning): Built on the Stage 1 model, this stage further refined its roleplaying capabilities using similar LoRA parameters, a 16384-token context size, and a slightly different learning rate schedule.
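The exact training scripts are not published in this card, but the reported Stage 1 hyperparameters map directly onto a PEFT LoRA configuration. The sketch below is illustrative only: target_modules is an assumption based on common Qwen2.5 fine-tuning setups, and the LoRA+ LR ratio of 15 would be applied in the optimizer setup (e.g. via a LoRA+ helper), not in LoraConfig itself.

```python
from peft import LoraConfig

# Illustrative Stage 1 configuration; not the author's actual training script.
stage1_lora = LoraConfig(
    r=32,                       # rank 32, per the model card
    lora_alpha=64,              # alpha 64, per the model card
    lora_dropout=0.5,           # unusually high dropout, per the model card
    task_type="CAUSAL_LM",
    target_modules=[            # assumed: typical attention + MLP projections for Qwen2.5
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```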

Usage Considerations

  • The model was tested with temperature 1, min-p 0.1, DRY 0.8, and XTC enabled for longer contexts (a request sketch with these settings follows this list).
  • Users may need to add stopping strings such as "\nYou" and enable "trim incomplete sentences" to keep the model from speaking for the user in narrator scenarios.
  • It may occasionally add a summary-like final paragraph in roleplay responses, which can be managed but is an ongoing area for improvement.
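DRY and XTC are samplers exposed by inference backends rather than by the model itself. The sketch below shows the tested settings as a request to a llama.cpp-style completion endpoint; the field names follow the llama.cpp server convention and are an assumption, not the author's exact setup (frontends such as SillyTavern or text-generation-webui expose the same samplers under their own names), and the XTC probability value is assumed since the card only says XTC was enabled.

```python
import json
import urllib.request

# Tested sampler settings from the model card, expressed as a llama.cpp-style payload.
payload = {
    "prompt": "…",              # your chat-template-formatted prompt goes here
    "temperature": 1.0,
    "min_p": 0.1,
    "dry_multiplier": 0.8,      # "DRY 0.8" from the model card
    "xtc_probability": 0.5,     # enables XTC for longer contexts (value assumed)
    "stop": ["\nYou"],          # stopping string to keep the model from speaking for the user
    "n_predict": 512,
}

req = urllib.request.Request(
    "http://localhost:8080/completion",   # local llama.cpp server (assumed endpoint)
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
print(json.load(urllib.request.urlopen(req))["content"])
```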