grimjim/Llama-3-Instruct-8B-SimPO-SPPO-Iter3-merge
The grimjim/Llama-3-Instruct-8B-SimPO-SPPO-Iter3-merge is an 8-billion-parameter instruction-tuned language model based on the Meta Llama 3 architecture, created by grimjim. It is a merge of princeton-nlp/Llama-3-Instruct-8B-SimPO and UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3, produced with the SLERP merge method. It is intended for general text generation tasks, with evaluation results available on the Open LLM Leaderboard across several benchmarks, including IFEval and BBH.
Overview
This model, grimjim/Llama-3-Instruct-8B-SimPO-SPPO-Iter3-merge, is an 8-billion-parameter instruction-tuned language model built on the Meta Llama 3 architecture. It was created by merging two base models, princeton-nlp/Llama-3-Instruct-8B-SimPO and UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3, using the SLERP merge method via mergekit.
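A mergekit SLERP merge of this kind is driven by a YAML configuration. The actual configuration file is not reproduced in this card, so the sketch below is a hypothetical reconstruction: the layer ranges, base model choice, interpolation factor `t`, and dtype are all assumptions, shown only to illustrate the shape of a typical mergekit SLERP config for two 32-layer Llama 3 8B models.

```yaml
# Hypothetical mergekit config; exact parameters of the published merge are unknown.
slices:
  - sources:
      - model: princeton-nlp/Llama-3-Instruct-8B-SimPO
        layer_range: [0, 32]
      - model: UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3
        layer_range: [0, 32]
merge_method: slerp
base_model: princeton-nlp/Llama-3-Instruct-8B-SimPO  # assumed choice of base
parameters:
  t: 0.5  # assumed interpolation factor; 0 = base model, 1 = the other model
dtype: bfloat16
```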
Key Capabilities & Performance
- Merged Architecture: Combines the strengths of two distinct Llama 3 instruction-tuned models.
- Text Generation: Primarily designed for text generation tasks.
- Evaluated Benchmarks: Performance metrics are available on the Open LLM Leaderboard, including:
  - IFEval (0-shot): 68.06 strict accuracy
  - BBH (3-shot): 29.07 normalized accuracy
  - MATH Lvl 5 (4-shot): 6.19 exact match
  - MMLU-PRO (5-shot): 29.83 accuracy
When to Use This Model
- General Instruction Following: Suitable for general text generation tasks that benefit from a merged Llama 3 instruction-tuned base.
- Exploration of Merged Models: Ideal for developers interested in the performance characteristics of models created via SLERP merging of specific Llama 3 variants.
- Benchmarking: Can be used for further evaluation against the provided Open LLM Leaderboard results.
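For readers curious what SLERP merging actually computes: rather than averaging parameters linearly, it interpolates along the great-circle arc between two weight vectors, preserving their magnitude characteristics. The sketch below is a minimal, self-contained illustration on small NumPy vectors, not the mergekit implementation itself (which applies this per-tensor across the two models' checkpoints).

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t=0 returns v0, t=1 returns v1; intermediate t follows the arc
    between the two directions rather than the straight chord.
    """
    v0n = v0 / (np.linalg.norm(v0) + eps)
    v1n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)  # angle between the two directions
    if theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation
        return (1 - t) * v0 + t * v1
    s = np.sin(theta)
    return (np.sin((1 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1

# Toy example: halfway between two orthogonal unit vectors
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid = slerp(0.5, a, b)  # lies on the unit circle, 45 degrees from each
```

In a real merge, this interpolation runs over every matching parameter tensor of the two source models, with `t` (or a per-layer schedule of `t` values) set in the mergekit config.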