Yuma42/Llama3.1-IgneousIguana-8B

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · License: llama3.1 · Architecture: Transformer

Yuma42/Llama3.1-IgneousIguana-8B is an 8 billion parameter language model created by Yuma42 by merging Yuma42/Llama3.1-SuperHawk-8B and nbeerbower/llama3.1-gutenberg-8B. The merge uses the slerp method to combine the strengths of its parent models, and the result offers a 32768-token context length. The model shows balanced performance across benchmarks, including an Open LLM Leaderboard average of 31.48 and an IFEval (0-shot) score of 81.33, making it suitable for general-purpose language tasks.


Model Overview

Yuma42/Llama3.1-IgneousIguana-8B is an 8 billion parameter language model developed by Yuma42. It is a merged model: the weights of Yuma42/Llama3.1-SuperHawk-8B and nbeerbower/llama3.1-gutenberg-8B are combined using the slerp (spherical linear interpolation) merge method. This approach aims to integrate the distinct capabilities of its constituent models, providing a versatile tool for a range of natural language processing applications.
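
To illustrate the idea behind a slerp merge, here is a minimal sketch: instead of averaging two parent checkpoints linearly, slerp interpolates each weight tensor along the arc of a hypersphere, which tends to preserve the geometry of both parents. This is a simplified illustration, not the exact recipe used to build this model; the interpolation factor `t` and the stand-in tensors are assumptions.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    t=0 returns `a`, t=1 returns `b`; intermediate values follow the
    great-circle arc between the flattened, normalized tensors.
    """
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_dir = a_flat / (a_flat.norm() + eps)
    b_dir = b_flat / (b_flat.norm() + eps)
    # Angle between the two direction vectors.
    omega = torch.arccos(torch.clamp(a_dir @ b_dir, -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        return ((1.0 - t) * a_flat + t * b_flat).reshape(a.shape)
    out = (torch.sin((1.0 - t) * omega) / so) * a_flat \
        + (torch.sin(t * omega) / so) * b_flat
    return out.reshape(a.shape)

# Illustrative use: blend one layer's weights from the two parents.
w_superhawk = torch.randn(4096, 4096)   # stand-in for a SuperHawk tensor
w_gutenberg = torch.randn(4096, 4096)   # stand-in for a gutenberg tensor
w_merged = slerp(0.5, w_superhawk, w_gutenberg)
```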

Key Capabilities & Performance

The model's performance has been evaluated on the Open LLM Leaderboard, showcasing its general utility across different tasks. Key benchmark results include:

  • Avg. score: 31.48
  • IFEval (0-shot): 81.33
  • BBH (3-shot): 31.99
  • MATH Lvl 5 (4-shot): 21.98
  • MMLU-PRO (5-shot): 33.04

These metrics indicate a solid foundation for tasks requiring instruction following, general reasoning, and basic mathematical understanding. The model supports a substantial context length of 32768 tokens, enabling it to process and generate longer sequences of text.
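
The model can be loaded like any Llama 3.1 checkpoint with the Hugging Face transformers library. The sketch below assumes the repository id `Yuma42/Llama3.1-IgneousIguana-8B` hosts a standard causal-LM checkpoint; the dtype and sampling parameters are illustrative choices, not published settings.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Yuma42/Llama3.1-IgneousIguana-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 fits the available hardware
    device_map="auto",
)

# Llama 3.1 checkpoints ship a chat template; apply it for instruction prompts.
messages = [{"role": "user", "content": "Summarize the idea of model merging."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```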

Recommended Use Cases

Given its balanced performance and merged architecture, Llama3.1-IgneousIguana-8B is well-suited for:

  • General-purpose text generation: Creating coherent and contextually relevant text for a wide range of prompts.
  • Instruction following: Executing tasks based on explicit instructions, as suggested by its IFEval score.
  • Exploratory research: Serving as a base model for further fine-tuning or experimentation in various NLP domains.
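
For the fine-tuning use case, one common starting point is to attach LoRA adapters with the `peft` library so only a small set of weights is trained. This is a hedged sketch: the rank, alpha, dropout, and target modules below are illustrative assumptions for a Llama-style architecture, not settings published for this model.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "Yuma42/Llama3.1-IgneousIguana-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Illustrative LoRA hyperparameters; tune for your task and hardware.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # Llama attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights will be trained
```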