grimjim/Llama-3-Instruct-8B-SPPO-Iter3-SimPO-merge

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:Jun 28, 2024License:llama3Architecture:Transformer0.0K Warm

grimjim/Llama-3-Instruct-8B-SPPO-Iter3-SimPO-merge is an 8 billion parameter instruction-tuned language model built upon the Meta Llama 3 architecture. This model is a merge of princeton-nlp/Llama-3-Instruct-8B-SimPO and UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3, created using the SLERP merge method. It is designed for general text generation tasks, leveraging the combined strengths of its base models.

Loading preview...

Model Overview

This model, grimjim/Llama-3-Instruct-8B-SPPO-Iter3-SimPO-merge, is an 8 billion parameter instruction-tuned language model based on the Meta Llama 3 architecture. It was created by grimjim using the mergekit tool, specifically employing the SLERP merge method.

Merge Details

The model is a strategic merge of two distinct Llama 3-Instruct 8B variants:

  • princeton-nlp/Llama-3-Instruct-8B-SimPO
  • UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3

The merge configuration involved specific weighting for self-attention and MLP layers across the 32 layers of the base models, aiming to combine their respective strengths. This approach allows for a potentially more robust or specialized model than either of its constituents alone.

Performance Highlights

Evaluations on the Open LLM Leaderboard indicate an average score of 20.74. Specific benchmark results include:

  • IFEval (0-Shot): 42.71 strict accuracy
  • BBH (3-Shot): 28.26 normalized accuracy
  • MMLU-PRO (5-shot): 29.17 accuracy

Use Cases

This merged model is suitable for a variety of text generation tasks where a Llama 3-based instruction-following model is desired. Its merged nature suggests potential for balanced performance across different instruction types, making it a versatile choice for general-purpose conversational AI and instruction-based applications.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p