agentlans/Llama3.1-Daredevilish-Instruct

Text generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Jan 22, 2025 · License: llama3.1 · Architecture: Transformer

agentlans/Llama3.1-Daredevilish-Instruct is an experimental 8.03 billion parameter Llama 3.1-based merged language model, created by agentlans. It combines top-performing Llama 3.1 8B models on the MMLU-Pro benchmark from the Open LLM Leaderboard as of January 21, 2025. This model is designed for research and development, offering a straightforward approach to leveraging high-performing Llama 3.1 components.


Overview

agentlans/Llama3.1-Daredevilish-Instruct is an experimental 8.03 billion parameter language model based on the Llama 3.1 architecture. It was created by merging several top-performing Llama 3.1 8B models, specifically those excelling on the MMLU-Pro benchmark as listed on the Open LLM Leaderboard as of January 21, 2025. The merge was performed with mergekit using the dare_ties method, drawing inspiration from the approach used in mlabonne/Daredevil-8B.
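The card does not publish the exact merge configuration. As a rough sketch, a dare_ties merge of this kind is typically expressed as a mergekit YAML config; the donor model names, density, and weight values below are hypothetical placeholders, not the actual recipe:

```yaml
# Hypothetical mergekit config for a dare_ties merge of Llama 3.1 8B models.
models:
  - model: meta-llama/Llama-3.1-8B-Instruct
    # Base model: listed without parameters.
  - model: example-org/llama3.1-8b-finetune-a   # hypothetical donor
    parameters:
      density: 0.53   # fraction of delta weights kept (assumption)
      weight: 0.4     # contribution to the merge (assumption)
  - model: example-org/llama3.1-8b-finetune-b   # hypothetical donor
    parameters:
      density: 0.53
      weight: 0.4
merge_method: dare_ties
base_model: meta-llama/Llama-3.1-8B-Instruct
dtype: bfloat16
```

Such a config would be run with mergekit's CLI (e.g. `mergekit-yaml config.yaml ./merged-model`), producing the merged checkpoint without any additional fine-tuning.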

Key Characteristics

  • Architecture: Llama 3.1 (8.03B parameters) with a context length of 32768 tokens.
  • Creation Method: Merged from top Llama 3.1 8B models on the MMLU-Pro benchmark without additional fine-tuning.
  • Experimental Nature: Designed primarily for research and development purposes.

Evaluation Results

Although the model is experimental, it achieves an average score of 29.32% on the Open LLM Leaderboard. Individual metrics include:

  • IFEval (0-shot): 79.41%
  • BBH (3-shot): 32.22%
  • MMLU-PRO (5-shot): 31.97%

Usage and Limitations

This model is intended for research and development. Users should be aware of potential biases and limitations inherent in language models and are advised to validate outputs and use the model responsibly. Further evaluation and fine-tuning are suggested for optimizing performance across various tasks.
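As a starting point for such validation, the model can be loaded like any Llama 3.1 instruct checkpoint with Hugging Face transformers. This is a minimal sketch, not documented usage: the system prompt and generation settings are assumptions, and the heavy transformers import is deferred so the message-building helper stays usable on its own.

```python
MODEL_ID = "agentlans/Llama3.1-Daredevilish-Instruct"


def build_messages(user_prompt: str) -> list[dict]:
    """Build a chat message list in the Llama 3.1 instruct format."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_prompt},
    ]


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Run one chat turn through the merged model (assumed settings)."""
    # Imported lazily so build_messages works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Since the model card reports only benchmark numbers and no recommended sampling parameters, outputs should be spot-checked for the biases and limitations noted above before relying on them.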