grimjim/Llama-3-Instruct-8B-SimPO-SPPO-Iter3-merge

Hugging Face
Text Generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 8K · Published: Jun 28, 2024 · License: llama3 · Architecture: Transformer

The grimjim/Llama-3-Instruct-8B-SimPO-SPPO-Iter3-merge is an 8-billion-parameter instruction-tuned language model based on the Meta Llama 3 architecture, created by grimjim. It is a SLERP merge of princeton-nlp/Llama-3-Instruct-8B-SimPO and UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3. The model is designed for general text generation, with evaluation results available on the Open LLM Leaderboard across benchmarks including IFEval and BBH.

Overview

This model, grimjim/Llama-3-Instruct-8B-SimPO-SPPO-Iter3-merge, is an 8-billion-parameter instruction-tuned language model built on the Meta Llama 3 architecture. It was created by merging two base models, princeton-nlp/Llama-3-Instruct-8B-SimPO and UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3, using the SLERP merge method via mergekit.
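As a rough illustration of what a SLERP merge does, the sketch below interpolates two weight tensors along the arc between them rather than along a straight line. It is a simplified, hypothetical example using PyTorch with an assumed interpolation factor `t`; the actual merge was performed with mergekit and its own configuration, which is not reproduced here.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors (illustrative only).

    mergekit applies its own, more careful per-layer logic; this sketch just shows
    the core idea of interpolating along the arc between the two weight vectors.
    """
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    # Angle between the two (normalized) weight vectors.
    omega = torch.acos(torch.clamp(torch.dot(a_unit, b_unit), -1.0, 1.0))
    if omega.abs() < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        merged = (1.0 - t) * a_flat + t * b_flat
    else:
        sin_omega = torch.sin(omega)
        merged = (torch.sin((1.0 - t) * omega) / sin_omega) * a_flat \
               + (torch.sin(t * omega) / sin_omega) * b_flat
    return merged.reshape(a.shape).to(a.dtype)

# Hypothetical usage: merge one parameter tensor from each parent model.
# w_simpo  = simpo_state_dict["model.layers.0.mlp.up_proj.weight"]
# w_sppo   = sppo_state_dict["model.layers.0.mlp.up_proj.weight"]
# w_merged = slerp(0.5, w_simpo, w_sppo)
```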

Key Capabilities & Performance

  • Merged Architecture: Combines the strengths of two distinct Llama 3 instruction-tuned models.
  • Text Generation: Primarily designed for text generation tasks; a minimal usage sketch follows this list.
  • Evaluated Benchmarks: Performance metrics are available on the Open LLM Leaderboard, including:
    • IFEval (0-Shot): 68.06 strict accuracy
    • BBH (3-Shot): 29.07 normalized accuracy
    • MATH Lvl 5 (4-Shot): 6.19 exact match
    • MMLU-PRO (5-Shot): 29.83 accuracy
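
For reference, here is a minimal text-generation sketch using the Hugging Face transformers library. The dtype, prompt, and sampling settings are illustrative assumptions rather than settings published with the model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "grimjim/Llama-3-Instruct-8B-SimPO-SPPO-Iter3-merge"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed dtype; adjust to your hardware
    device_map="auto",
)

# Llama 3 Instruct models ship with a chat template, so format the prompt with it.
messages = [{"role": "user", "content": "Summarize what a SLERP model merge is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```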

When to Use This Model

  • General Instruction Following: Suitable for general assistant-style tasks where a Llama 3 instruction-tuned base is a good fit.
  • Exploration of Merged Models: Ideal for developers interested in the performance characteristics of models created via SLERP merging of specific Llama 3 variants.
  • Benchmarking: Can be used for further evaluation against the Open LLM Leaderboard results listed above; a hedged evaluation sketch follows this list.
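
To compare against the leaderboard numbers above, one option is EleutherAI's lm-evaluation-harness. The snippet below is a hedged sketch assuming its Python `simple_evaluate` entry point; the Open LLM Leaderboard uses its own pinned harness version and settings, so locally reproduced scores may not match exactly.

```python
import lm_eval

# Hypothetical local evaluation run; the task name, few-shot count, and batch size
# are assumptions and may differ from the exact Open LLM Leaderboard configuration.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=grimjim/Llama-3-Instruct-8B-SimPO-SPPO-Iter3-merge,"
        "dtype=bfloat16"
    ),
    tasks=["ifeval"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"])
```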