CorticalStack/pastiche-crown-clown-7b-dare-dpo
CorticalStack/pastiche-crown-clown-7b-dare-dpo is a 7 billion parameter language model developed by CorticalStack. It is a DPO fine-tuned version of the pastiche-crown-clown-7b-dare model, trained on the jondurbin/truthy-dpo-v0.1 preference dataset. The DPO step is intended to improve response quality and alignment, making the model suitable for tasks that call for refined and truthful outputs.
Model Overview
CorticalStack/pastiche-crown-clown-7b-dare-dpo is a 7 billion parameter language model developed by CorticalStack. This model is a Direct Preference Optimization (DPO) fine-tuned variant of the base pastiche-crown-clown-7b-dare model. The fine-tuning process leveraged the jondurbin/truthy-dpo-v0.1 dataset to enhance its performance and alignment.
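DPO fine-tuning learns from preference pairs rather than plain completions. The snippet below is a minimal sketch of inspecting the dataset with the Hugging Face datasets library; the prompt/chosen/rejected column names follow the standard DPO preference-pair format and are assumed here rather than confirmed by this card.

```python
from datasets import load_dataset

# Load the preference dataset used for DPO fine-tuning.
# Column names (prompt/chosen/rejected) are assumed to follow the
# standard DPO preference-pair format; verify against the dataset card.
dataset = load_dataset("jondurbin/truthy-dpo-v0.1", split="train")

example = dataset[0]
print(example["prompt"])    # the input question or instruction
print(example["chosen"])    # the preferred (more truthful) response
print(example["rejected"])  # the dispreferred response
```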
Key Training Details
- Base Model: CorticalStack/pastiche-crown-clown-7b-dare
- Fine-tuning Method: Direct Preference Optimization (DPO)
- Dataset: jondurbin/truthy-dpo-v0.1
- LoRA Configuration:
  - r: 16
  - LoRA alpha: 16
  - LoRA dropout: 0.05
- Training Arguments (see the sketch after this list):
  - Batch size: 4
  - Gradient accumulation steps: 4
  - Optimizer: paged_adamw_32bit
  - Max steps: 200
  - Learning rate: 5e-05
  - Learning rate scheduler type: cosine
  - Beta: 0.1
  - Max prompt length: 1024
  - Max length: 1536
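A minimal sketch of how these hyperparameters might be wired together with the TRL DPOTrainer and a PEFT LoRA adapter follows. This is an assumed reconstruction from the values listed above, not the original training script; keyword placement (beta, max lengths, tokenizer) matches older TRL releases and may need to move into a DPOConfig on newer versions.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base_model = "CorticalStack/pastiche-crown-clown-7b-dare"
model = AutoModelForCausalLM.from_pretrained(base_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Preference dataset; prompts may additionally need chat-template formatting,
# which is not documented on this card.
train_dataset = load_dataset("jondurbin/truthy-dpo-v0.1", split="train")

# LoRA configuration matching the values listed above.
# target_modules may need to be set explicitly for this architecture.
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Training arguments matching the values listed above.
training_args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    optim="paged_adamw_32bit",
    max_steps=200,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    output_dir="./pastiche-crown-clown-7b-dare-dpo",
)

trainer = DPOTrainer(
    model,
    ref_model=None,          # with a PEFT adapter, the frozen base model serves as the reference
    args=training_args,
    beta=0.1,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    max_prompt_length=1024,
    max_length=1536,
)
trainer.train()
```

Passing ref_model=None is the usual pattern when training with a PEFT adapter: the reference log-probabilities are computed from the base model with the adapter disabled, so no second copy of the weights is needed.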
Intended Use
This model is designed for applications where the improved response quality and alignment achieved through DPO fine-tuning are beneficial. Its training on a 'truthy' preference dataset suggests potential for tasks requiring factual accuracy and coherent generation.
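For inference, the model can be loaded like any other causal language model from the Hugging Face Hub. The snippet below is a minimal sketch using the transformers library; the prompt and generation settings are illustrative only, not recommended defaults.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CorticalStack/pastiche-crown-clown-7b-dare-dpo"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "What causes the seasons to change on Earth?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generation settings are illustrative; tune them for your use case.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```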