Azazelle/smol_bruin-7b

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Dec 29, 2023 · License: cc-by-4.0 · Architecture: Transformer · Open Weights

Azazelle/smol_bruin-7b is a 7 billion parameter language model created by Azazelle, formed by a Slerp merge of rwitz/go-bruins-v2 and rishiraj/smol-7b. Built on the Mistral-7B-v0.1 base, it supports a 4096-token context length. It demonstrates strong general reasoning, achieving an average score of 71.05 on the Open LLM Leaderboard, which makes it suitable for a range of general-purpose language tasks.


Model Overview

Azazelle/smol_bruin-7b is a 7 billion parameter language model developed by Azazelle. It is a Slerp merge of two models, rwitz/go-bruins-v2 and rishiraj/smol-7b, both based on the mistralai/Mistral-7B-v0.1 architecture. The merge applies distinct interpolation factors (t values) to the self-attention and MLP layers, tuning how much each parent model contributes to those components.
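The Slerp (spherical linear interpolation) operation behind such merges can be sketched as below. This is an illustrative NumPy implementation, not the exact tooling used to build this model, and the example tensors and t value are hypothetical.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate t follows the
    great-circle arc between the tensors viewed as flat vectors.
    """
    v0_flat, v1_flat = v0.ravel(), v1.ravel()
    # Angle between the two tensors, treated as high-dimensional vectors.
    cos_theta = np.dot(v0_flat, v1_flat) / (
        np.linalg.norm(v0_flat) * np.linalg.norm(v1_flat) + eps
    )
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    if theta < eps:  # Nearly parallel: fall back to linear interpolation.
        return (1 - t) * v0 + t * v1
    sin_theta = np.sin(theta)
    return (np.sin((1 - t) * theta) / sin_theta) * v0 + (
        np.sin(t * theta) / sin_theta
    ) * v1

# Hypothetical example: blend two small weight matrices at t = 0.5.
a = np.array([[1.0, 0.0], [0.0, 1.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])
merged = slerp(0.5, a, b)
```

In a real merge, a per-layer t schedule decides, layer by layer, whether the merged weights sit closer to one parent or the other.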

Key Capabilities

  • General Reasoning: Achieves a score of 67.58 on the AI2 Reasoning Challenge (25-shot).
  • Common Sense Reasoning: Scores 86.48 on HellaSwag (10-shot) and 81.14 on Winogrande (5-shot).
  • Mathematical Reasoning: Demonstrates capability with a 70.43 score on GSM8k (5-shot).
  • Knowledge & Comprehension: Attains 65.05 on MMLU (5-shot) and 55.65 on TruthfulQA (0-shot).
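The 71.05 Open LLM Leaderboard average is the arithmetic mean of these six benchmark scores, which can be checked directly:

```python
# The six Open LLM Leaderboard benchmark scores listed above.
scores = {
    "ARC (25-shot)": 67.58,
    "HellaSwag (10-shot)": 86.48,
    "MMLU (5-shot)": 65.05,
    "TruthfulQA (0-shot)": 55.65,
    "Winogrande (5-shot)": 81.14,
    "GSM8k (5-shot)": 70.43,
}
average = sum(scores.values()) / len(scores)
# average ≈ 71.05, matching the reported leaderboard figure.
```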

Performance Highlights

Evaluated on the Open LLM Leaderboard, smol_bruin-7b achieved an average score of 71.05. This performance positions it as a capable model for various general-purpose applications requiring strong reasoning and language understanding within its 7 billion parameter class.