CorticalStack/neurotic-crown-clown-7b-tak-stack-dpo

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8K · Published: Feb 28, 2024 · License: apache-2.0 · Architecture: Transformer

CorticalStack/neurotic-crown-clown-7b-tak-stack-dpo is a 7 billion parameter language model developed by CorticalStack. It is a DPO fine-tune of neurotic-crown-clown-7b-ties, trained on the CorticalStack/tak-stack-dpo dataset. Direct Preference Optimization (DPO) aligns the model's outputs with the preferences encoded in that dataset. With an 8,192-token context length, it is suited to tasks requiring nuanced understanding and generation.


neurotic-crown-clown-7b-tak-stack-dpo Overview

CorticalStack/neurotic-crown-clown-7b-tak-stack-dpo is a 7 billion parameter language model that has undergone Direct Preference Optimization (DPO) fine-tuning. This model is built upon the base of CorticalStack/neurotic-crown-clown-7b-ties and leverages the specialized CorticalStack/tak-stack-dpo dataset for its DPO training.
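A minimal sketch of trying the model locally with Hugging Face transformers is shown below; the prompt and generation settings are illustrative and not prescribed by the model card.

```python
# Minimal inference sketch; prompt format and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CorticalStack/neurotic-crown-clown-7b-tak-stack-dpo"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Explain Direct Preference Optimization in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```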

Key Training Details

The DPO fine-tuning process used the following LoRA (Low-Rank Adaptation) configuration and training arguments; a code sketch of the setup follows the list:

  • LoRA Parameters:
    • r: 32
    • LoRA alpha: 32
    • LoRA dropout: 0.05
  • Training Arguments:
    • Batch size: 4
    • Gradient accumulation steps: 4
    • Optimizer: paged_adamw_32bit
    • Max steps: 100
    • Learning rate: 5e-05
    • Learning rate scheduler type: cosine
    • Beta: 0.1
    • Max prompt length: 1024
    • Max length: 1536
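
For reference, here is a minimal sketch of how these hyperparameters map onto a TRL DPOTrainer plus PEFT setup. The original training script was not published, so the TRL version, dataset column names, dtype, and output directory are assumptions; only the hyperparameter values come from the card.

```python
# Hypothetical reconstruction of the DPO fine-tune described above (TRL + PEFT).
# Hyperparameter values are taken from the model card; everything else is assumed.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "CorticalStack/neurotic-crown-clown-7b-ties"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

# Preference dataset used for DPO (prompt / chosen / rejected columns assumed).
dataset = load_dataset("CorticalStack/tak-stack-dpo", split="train")

# LoRA parameters from the model card.
peft_config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Training arguments from the model card.
training_args = TrainingArguments(
    output_dir="neurotic-crown-clown-7b-tak-stack-dpo",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    optim="paged_adamw_32bit",
    max_steps=100,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
)

# Older TRL API (~0.7.x), where beta and length limits are passed to the trainer directly.
trainer = DPOTrainer(
    model,
    ref_model=None,          # implicit reference model when training LoRA adapters
    args=training_args,
    beta=0.1,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    max_prompt_length=1024,
    max_length=1536,
)
trainer.train()
```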

This DPO-tuned model is intended to produce outputs better aligned with the preferences encoded in the tak-stack-dpo dataset, making it suitable for applications where that alignment matters.

Popular Sampler Settings

Featherless users typically tune the following sampler parameters for this model: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
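
These parameters map directly onto an OpenAI-compatible request. The sketch below assumes Featherless's OpenAI-compatible endpoint; the base URL and all sampler values are illustrative, since the top user configurations are not reproduced here, and non-standard parameters are passed via extra_body.

```python
# Sketch of querying the model with the sampler parameters listed above.
# Base URL and parameter values are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="CorticalStack/neurotic-crown-clown-7b-tak-stack-dpo",
    messages=[{"role": "user", "content": "Write a short product description."}],
    temperature=0.7,              # example values only
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Parameters outside the standard OpenAI schema go in extra_body.
    extra_body={"top_k": 40, "repetition_penalty": 1.1, "min_p": 0.05},
)
print(response.choices[0].message.content)
```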