abacusai/Slerp-CM-mist-dpo
Task: Text Generation
Concurrency Cost: 1
Model Size: 7B
Quant: FP8
Ctx Length: 8k
Published: Jan 3, 2024
License: apache-2.0
Architecture: Transformer
Open Weights

abacusai/Slerp-CM-mist-dpo is a 7-billion-parameter language model from abacusai, produced by a Slerp merge of cookinai/CatMacaroni-Slerp and mncai/mistral-7b-dpo-v5. It has an 8192-token context length and improves on TruthfulQA relative to cookinai/CatMacaroni-Slerp and on GSM8K relative to mncai/mistral-7b-dpo-v5. It targets general language understanding and generation, with balanced performance across academic benchmarks.

Model Overview

abacusai/Slerp-CM-mist-dpo is a 7-billion-parameter language model developed by abacusai. It was created through a Slerp (spherical linear interpolation) merge of two existing models: cookinai/CatMacaroni-Slerp and mncai/mistral-7b-dpo-v5. The merge was intended to combine the strengths of both source models, with a particular focus on the TruthfulQA and GSM8K benchmarks.

Key Capabilities & Performance

The model achieves an average score of 73.1 across the Hugging Face leaderboard benchmarks. The individual results are ARC (69.62), HellaSwag (87.09), MMLU (64.81), TruthfulQA (62.82), GSM8K (72.78), and Winogrande (81.45). Relative to its source models, the merge improves TruthfulQA over cookinai/CatMacaroni-Slerp and GSM8K over mncai/mistral-7b-dpo-v5.
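As a quick check, the reported average can be reproduced from the six per-benchmark scores. The sketch below assumes the average is a plain unweighted mean, which the card does not state explicitly; the function name `leaderboard_average` is illustrative only.

```python
# Per-benchmark scores as listed in this card.
scores = {
    "ARC": 69.62,
    "HellaSwag": 87.09,
    "MMLU": 64.81,
    "TruthfulQA": 62.82,
    "GSM8K": 72.78,
    "Winogrande": 81.45,
}


def leaderboard_average(results):
    """Unweighted mean of the per-benchmark scores (assumed aggregation)."""
    return sum(results.values()) / len(results)


average = leaderboard_average(scores)
print(f"{average:.1f}")  # prints 73.1
```

Under that assumption, the six scores are consistent with the 73.1 average quoted above.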

Training Details

The model was created with a Slerp (spherical linear interpolation) merge, using separate interpolation weights for the self-attention and MLP layers of the two source models. The base model for the merge was mncai/mistral-7b-dpo-v5.
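For readers unfamiliar with the technique, the following is a minimal sketch of spherical linear interpolation between two flattened weight vectors. It is a generic illustration, not the merge configuration used for this model: the function name `slerp`, the fallback threshold `eps`, and the per-layer-type weights in `t_by_layer_type` are all hypothetical, since the card does not list the actual values.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two flat weight vectors.

    t=0 returns v0 and t=1 returns v1. Falls back to plain linear
    interpolation when a vector is near zero or the two are nearly colinear.
    """
    linear = [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        return linear
    # Angle between the two vectors, with the cosine clamped for safety.
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return linear
    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Hypothetical interpolation weights per layer type (placeholder values,
# not taken from the actual merge).
t_by_layer_type = {"self_attn": 0.3, "mlp": 0.7}

# Toy stand-ins for one tensor from each source model.
weights_a = [1.0, 0.0]
weights_b = [0.0, 1.0]
merged = {name: slerp(t, weights_a, weights_b) for name, t in t_by_layer_type.items()}
```

In a real merge the same interpolation would be applied tensor by tensor, with the weight `t` chosen according to which kind of layer each tensor belongs to.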

Limitations

This model has not undergone safety evaluations and is intended solely for research and experimental purposes. Users should be aware of potential biases and risks inherent in large language models.
