CorticalStack/pastiche-crown-clown-7b-dare-dpo

TEXT GENERATION · Open Weights · Cold

  • Concurrency Cost: 1
  • Model Size: 7B
  • Quant: FP8
  • Ctx Length: 8k
  • Published: Mar 2, 2024
  • License: apache-2.0
  • Architecture: Transformer

CorticalStack/pastiche-crown-clown-7b-dare-dpo is a 7 billion parameter language model developed by CorticalStack. It is a DPO fine-tune of pastiche-crown-clown-7b-dare trained on the jondurbin/truthy-dpo-v0.1 dataset, making it suitable for tasks that require refined and truthful outputs.


Model Overview

This model is a Direct Preference Optimization (DPO) fine-tuned variant of the base CorticalStack/pastiche-crown-clown-7b-dare model. The fine-tune uses the jondurbin/truthy-dpo-v0.1 preference dataset to improve alignment and response quality; the objective it optimizes is shown below.
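For context, DPO optimizes the policy directly on preference pairs instead of first fitting a separate reward model. The standard objective from Rafailov et al. (2023), instantiated here with the β = 0.1 setting listed under the training details, is:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\pi_{\mathrm{ref}})
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}
    \left[\log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right)\right]
```

where y_w and y_l are the chosen and rejected responses for prompt x, π_ref is the frozen base model, and σ is the logistic function.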

Key Training Details

  • Base Model: CorticalStack/pastiche-crown-clown-7b-dare
  • Fine-tuning Method: Direct Preference Optimization (DPO)
  • Dataset: jondurbin/truthy-dpo-v0.1
  • LoRA Configuration:
    • r: 16
    • LoRA alpha: 16
    • LoRA dropout: 0.05
  • Training Arguments:
    • Batch size: 4
    • Gradient accumulation steps: 4
    • Optimizer: paged_adamw_32bit
    • Max steps: 200
    • Learning rate: 5e-05
    • Learning rate scheduler type: cosine
    • Beta: 0.1
    • Max prompt length: 1024
    • Max length: 1536
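
Taken together, these settings map naturally onto a TRL + PEFT training script. The following is a minimal sketch of what such a run could look like, assuming the TRL DPOTrainer API as of early 2024 (newer TRL releases move the beta and length arguments into DPOConfig); the actual script CorticalStack used is not published on this page.

```python
# Hedged sketch of the DPO fine-tune described above, using TRL + PEFT.
# Assumes TRL ~0.7.x (early 2024); newer versions pass beta,
# max_prompt_length, and max_length via DPOConfig instead.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "CorticalStack/pastiche-crown-clown-7b-dare"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA configuration from the model card: r=16, alpha=16, dropout=0.05.
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Training arguments from the model card.
training_args = TrainingArguments(
    output_dir="pastiche-crown-clown-7b-dare-dpo",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    optim="paged_adamw_32bit",
    max_steps=200,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
)

# The preference dataset; DPOTrainer expects prompt/chosen/rejected columns.
dataset = load_dataset("jondurbin/truthy-dpo-v0.1", split="train")
dataset = dataset.remove_columns(
    [c for c in dataset.column_names if c not in ("prompt", "chosen", "rejected")]
)

trainer = DPOTrainer(
    model,
    ref_model=None,            # with a PEFT adapter, TRL derives the frozen reference
    args=training_args,
    beta=0.1,                  # DPO temperature from the model card
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    max_prompt_length=1024,
    max_length=1536,
)
trainer.train()
```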

Intended Use

This model is designed for applications where the improved response quality and alignment from DPO fine-tuning are beneficial. Because the preference data comes from the truthfulness-focused jondurbin/truthy-dpo-v0.1 dataset, the model is a reasonable choice for tasks that reward factual accuracy and coherent generation.
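
As a concrete starting point, here is a minimal inference sketch using Hugging Face transformers. It assumes the repository ships a chat template in its tokenizer config (the model is a 7B merge, likely Mistral-derived, but verify against tokenizer_config.json) and a bf16-capable GPU.

```python
# Minimal inference sketch with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CorticalStack/pastiche-crown-clown-7b-dare-dpo"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a bf16-capable GPU
    device_map="auto",
)

messages = [{"role": "user", "content": "Why is the sky blue?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```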

Popular Sampler Settings

Featherless tracks the top three sampler configurations used with this model, covering the following parameters: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
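
These same parameters can be set when calling the model through an OpenAI-compatible endpoint. The sketch below assumes the Featherless API base URL and uses illustrative values rather than the tracked community presets; non-standard samplers such as top_k, min_p, and repetition_penalty are passed through extra_body for servers that accept them.

```python
# Hedged sketch: setting sampler parameters via an OpenAI-compatible client.
# The base_url is an assumption; the parameter values are illustrative.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="CorticalStack/pastiche-crown-clown-7b-dare-dpo",
    messages=[{"role": "user", "content": "Explain DPO in one paragraph."}],
    temperature=0.7,           # illustrative, not a tracked preset
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Samplers outside the OpenAI spec go through extra_body.
    extra_body={"top_k": 40, "min_p": 0.05, "repetition_penalty": 1.1},
)
print(response.choices[0].message.content)
```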