CorticalStack/neurotic-crown-clown-7b-tak-stack-dpo

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8K · Published: Feb 28, 2024 · License: apache-2.0 · Architecture: Transformer

CorticalStack/neurotic-crown-clown-7b-tak-stack-dpo is a 7 billion parameter language model developed by CorticalStack. It is a DPO fine-tune of neurotic-crown-clown-7b-ties, trained on the CorticalStack/tak-stack-dpo dataset. Direct Preference Optimization (DPO) aligns the model's outputs with the preferences encoded in that dataset. With an 8,192-token context length, it is suited to tasks requiring nuanced understanding and generation.


neurotic-crown-clown-7b-tak-stack-dpo Overview

CorticalStack/neurotic-crown-clown-7b-tak-stack-dpo is a 7 billion parameter language model that has undergone Direct Preference Optimization (DPO) fine-tuning. This model is built upon the base of CorticalStack/neurotic-crown-clown-7b-ties and leverages the specialized CorticalStack/tak-stack-dpo dataset for its DPO training.
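A minimal sketch of trying the model locally with Hugging Face transformers is shown below; the prompt and generation settings are illustrative and not prescribed by the model card.

```python
# Minimal inference sketch; prompt format and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CorticalStack/neurotic-crown-clown-7b-tak-stack-dpo"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Explain Direct Preference Optimization in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```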

Key Training Details

The DPO fine-tuning process used the following LoRA (Low-Rank Adaptation) configuration and training arguments; a code sketch of the setup follows the list:

  • LoRA Parameters:
    • r: 32
    • LoRA alpha: 32
    • LoRA dropout: 0.05
  • Training Arguments:
    • Batch size: 4
    • Gradient accumulation steps: 4
    • Optimizer: paged_adamw_32bit
    • Max steps: 100
    • Learning rate: 5e-05
    • Learning rate scheduler type: cosine
    • Beta: 0.1
    • Max prompt length: 1024
    • Max length: 1536
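
For reference, here is a minimal sketch of how these hyperparameters map onto a TRL DPOTrainer plus PEFT setup. The original training script was not published, so the TRL version, dataset column names, dtype, and output directory are assumptions; only the hyperparameter values come from the card.

```python
# Hypothetical reconstruction of the DPO fine-tune described above (TRL + PEFT).
# Hyperparameter values are taken from the model card; everything else is assumed.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

base = "CorticalStack/neurotic-crown-clown-7b-ties"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

# Preference dataset used for DPO (prompt / chosen / rejected columns assumed).
dataset = load_dataset("CorticalStack/tak-stack-dpo", split="train")

# LoRA parameters from the model card.
peft_config = LoraConfig(
    r=32,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Training arguments from the model card.
training_args = TrainingArguments(
    output_dir="neurotic-crown-clown-7b-tak-stack-dpo",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    optim="paged_adamw_32bit",
    max_steps=100,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
)

# Older TRL API (~0.7.x), where beta and length limits are passed to the trainer directly.
trainer = DPOTrainer(
    model,
    ref_model=None,          # implicit reference model when training LoRA adapters
    args=training_args,
    beta=0.1,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    max_prompt_length=1024,
    max_length=1536,
)
trainer.train()
```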

This DPO-tuned model is intended to produce outputs better aligned with the preferences encoded in the tak-stack-dpo dataset, making it suitable for applications where that alignment matters.

Popular Sampler Settings

Featherless users typically tune the following sampler parameters for this model: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
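
These parameters map directly onto an OpenAI-compatible request. The sketch below assumes Featherless's OpenAI-compatible endpoint; the base URL and all sampler values are illustrative, since the top user configurations are not reproduced here, and non-standard parameters are passed via extra_body.

```python
# Sketch of querying the model with the sampler parameters listed above.
# Base URL and parameter values are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="CorticalStack/neurotic-crown-clown-7b-tak-stack-dpo",
    messages=[{"role": "user", "content": "Write a short product description."}],
    temperature=0.7,              # example values only
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Parameters outside the standard OpenAI schema go in extra_body.
    extra_body={"top_k": 40, "repetition_penalty": 1.1, "min_p": 0.05},
)
print(response.choices[0].message.content)
```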