InnerI/A-I-0xtom-7B-slerp
TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Feb 16, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

InnerI/A-I-0xtom-7B-slerp is a 7-billion-parameter language model created by InnerI, formed by merging 0x0dad0/nous_nous_v2_0 and tomaszki/nous-thirty with the slerp merge method. The model reports an average loss of 0.3912 and demonstrates balanced performance across reasoning and common-sense benchmarks, with a context length of 8192 tokens. It is suitable for general-purpose language generation tasks that require robust understanding and reasoning.


Model Overview

InnerI/A-I-0xtom-7B-slerp is a 7-billion-parameter language model developed by InnerI. It is a merged model, combining the strengths of 0x0dad0/nous_nous_v2_0 and tomaszki/nous-thirty using the slerp (spherical linear interpolation) merge method. This approach blends the base models' weights along the shortest arc between them, aiming for a smoother combination of their characteristics than simple linear averaging.

Key Characteristics

  • Merge Method: Uses slerp to combine model weights, with separate interpolation parameters for the self-attention and MLP layers (see the sketch after this list).
  • Average Loss: Reports an average model loss of 0.3912, suggesting the merge preserved the quality of its base models.
  • Context Length: Supports an 8192-token context window, enabling it to handle moderately long inputs and generate coherent, extended outputs.
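For intuition, here is a minimal sketch of the slerp interpolation applied to a pair of flattened weight tensors. This illustrates the general formula only; it is not the actual merge tooling used for this model, and the tensor shapes and t value are hypothetical.

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors.

    Illustrative only: real merge tools apply this per tensor, often with
    different t schedules for self-attention and MLP layers.
    """
    # Normalize copies so the angle between the two weight directions is well defined.
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0_n, v1_n), -1.0, 1.0)
    omega = np.arccos(dot)  # angle between the two weight directions

    if omega < eps:
        # Nearly parallel vectors: slerp degenerates to plain linear interpolation.
        return (1.0 - t) * v0 + t * v1

    sin_omega = np.sin(omega)
    # Classic slerp: the blend follows the great-circle arc between v0 and v1
    # instead of the straight chord used by linear averaging.
    return (np.sin((1.0 - t) * omega) / sin_omega) * v0 \
         + (np.sin(t * omega) / sin_omega) * v1

# Example: blend two toy "layers" with t = 0.5 (equal contribution from each model).
layer_a = np.random.randn(16)
layer_b = np.random.randn(16)
merged = slerp(0.5, layer_a, layer_b)
```

The advantage over a plain weighted average is that slerp preserves the magnitude structure of the weights when the two parent tensors point in different directions, which is why it is a popular choice for model merges like this one.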

Performance Benchmarks

Evaluated on the Open LLM Leaderboard, A-I-0xtom-7B-slerp shows competitive performance across several key metrics:

  • Avg. Score: 60.46
  • AI2 Reasoning Challenge (25-shot): 58.19
  • HellaSwag (10-shot): 77.64
  • MMLU (5-shot): 58.74
  • TruthfulQA (0-shot): 54.78
  • Winogrande (5-shot): 73.24
  • GSM8k (5-shot): 40.18
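The average score is the arithmetic mean of the six benchmarks: (58.19 + 77.64 + 58.74 + 54.78 + 73.24 + 40.18) / 6 ≈ 60.46.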

Ideal Use Cases

This model is well-suited for applications requiring:

  • General-purpose text generation: Capable of handling a wide range of prompts and generating coherent responses.
  • Reasoning tasks: Its performance on ARC and MMLU suggests proficiency in logical deduction and knowledge-based reasoning.
  • Common sense understanding: Demonstrated by its scores on HellaSwag and Winogrande.
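As a starting point for these use cases, the model can be loaded with the Hugging Face transformers library. This is a minimal, untested sketch that assumes the weights are published under the repo id InnerI/A-I-0xtom-7B-slerp and that standard causal-LM loading applies; the prompt and generation settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repo id, taken from the model name above.
model_id = "InnerI/A-I-0xtom-7B-slerp"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps a 7B model on a single GPU
    device_map="auto",          # requires the accelerate package
)

prompt = "Explain spherical linear interpolation in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# The 8192-token context window leaves ample room for longer prompts than this.
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```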