Name: CorticalStack/mistral-7b-jondurbin-truthy-dpo API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: CorticalStack

CorticalStack/mistral-7b-jondurbin-truthy-dpo Overview

This is a 7-billion-parameter language model fine-tuned from mistralai/Mistral-7B-v0.1. It was trained with Direct Preference Optimization (DPO) on the jondurbin/truthy-dpo-v0.1 dataset, with the aim of producing more truthful and factually consistent responses.
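To make the DPO objective concrete, here is a minimal sketch of the standard per-example DPO loss in plain Python. It is an illustration of the general technique, not the author's training code. The function and argument names are hypothetical, and the default beta of 0.1 is taken from the training arguments listed on this page.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # value listed under the training arguments
) -> float:
    """Standard DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # How much more the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # The loss shrinks as the policy favours the chosen response more
    # strongly than the reference model does.
    return -log_sigmoid(beta * (chosen_logratio - rejected_logratio))
```

When the policy and reference agree exactly, the loss equals log(2). It falls below that value as the policy shifts probability toward the chosen response and away from the rejected one.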

Key Training Details

Base Model: mistralai/Mistral-7B-v0.1
Fine-tuning Method: Direct Preference Optimization (DPO)
Dataset: jondurbin/truthy-dpo-v0.1
LoRA Configuration:
- r: 16
- LoRA alpha: 16
- LoRA dropout: 0.05
Training Arguments:
- Batch size: 4
- Gradient accumulation steps: 4
- Optimizer: paged_adamw_32bit
- Max steps: 100
- Learning rate: 5e-05
- Learning rate scheduler type: cosine
- Beta: 0.1
- Max prompt length: 1024
- Max length: 1536
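
A few quantities follow directly from the values above. The sketch below derives them in plain Python under stated assumptions: the batch size is treated as per device, LoRA uses the conventional alpha / r scaling, max length counts prompt plus response tokens, and the cosine schedule has no warmup (the page lists none). The names are illustrative and do not come from the training script.

```python
import math

# Values copied from the training details above.
LORA_R = 16
LORA_ALPHA = 16
BATCH_SIZE = 4            # assumed to be per device
GRAD_ACCUM_STEPS = 4
MAX_STEPS = 100
LEARNING_RATE = 5e-5
MAX_PROMPT_LENGTH = 1024
MAX_LENGTH = 1536         # assumed to cover prompt + response tokens

# Conventional LoRA scaling factor applied to the low-rank update.
lora_scaling = LORA_ALPHA / LORA_R

# Preference pairs contributing to each optimizer step, per device.
pairs_per_step = BATCH_SIZE * GRAD_ACCUM_STEPS

# Upper bound on preference pairs seen over the whole run, per device.
total_pairs_seen = pairs_per_step * MAX_STEPS

# Tokens left for the response when the prompt uses its full budget.
min_response_budget = MAX_LENGTH - MAX_PROMPT_LENGTH


def cosine_lr(step: int, base_lr: float = LEARNING_RATE,
              total_steps: int = MAX_STEPS) -> float:
    """Cosine decay from base_lr to 0, assuming no warmup phase."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions the LoRA scaling factor is 1.0, each optimizer step covers 16 preference pairs per device, and the 100-step run sees at most 1,600 pairs per device.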

Intended Use Cases

This model is aimed at applications where truthful, aligned output is critical. Fine-tuning with DPO on the truthy-dpo dataset suggests an emphasis on reducing factual errors and improving the reliability of generated text, which makes it a candidate for tasks that require high factual accuracy.
