Name: dfurman/HermesBagel-34B-v0.1 API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: dfurman

HermesBagel-34B-v0.1 Overview

HermesBagel-34B-v0.1 is a 34-billion-parameter language model developed by dfurman, created by merging two source models: NousResearch/Nous-Hermes-2-Yi-34B and jondurbin/bagel-dpo-34b-v0.2. The merge was performed with the LazyMergekit tool using the SLERP (spherical linear interpolation) method, with separate interpolation settings for the self-attention and MLP layers.
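
To make the slerp merge idea concrete, here is a minimal sketch of spherical linear interpolation between two weight vectors in plain Python. It is an illustration of the general technique only, not the LazyMergekit implementation; the function name slerp, the toy vectors, and the per-layer interpolation factors are all hypothetical and are not taken from the model's actual merge configuration.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between vectors v0 and v1 with factor t in [0, 1]."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))

    def lerp():
        # Plain linear interpolation, used when the angle is undefined or tiny.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]

    if norm0 < eps or norm1 < eps:
        return lerp()

    # Cosine of the angle between the two vectors, clamped for numerical safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    if abs(sin_theta) < eps:
        # Vectors are (anti)parallel: fall back to linear interpolation.
        return lerp()

    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Hypothetical per-layer-type interpolation factors (NOT the real merge settings).
HYPOTHETICAL_T = {"self_attn": 0.3, "mlp": 0.7}

# Toy stand-ins for one weight tensor from each source model.
weights_a = [1.0, 0.0]
weights_b = [0.0, 1.0]

merged = slerp(0.5, weights_a, weights_b)
print([round(x, 4) for x in merged])  # [0.7071, 0.7071]
```

With t = 0 the result equals the first model's weights and with t = 1 the second's; using different t values per layer type, as in the hypothetical dictionary above, is how a merge can weight the two sources differently for self-attention and MLP layers.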

Key Capabilities & Performance

This model is designed for general-purpose language understanding and generation. It has been evaluated on the Open LLM Leaderboard, where it achieved an average score of 75.15 across six benchmarks. Per-benchmark results:

AI2 Reasoning Challenge (25-shot): 70.56
HellaSwag (10-shot): 85.74
MMLU (5-shot): 77.38
TruthfulQA (0-shot): 67.34
Winogrande (5-shot): 84.61
GSM8K (5-shot): 65.28

These scores indicate proficiency in reasoning, common sense, language understanding, and mathematical problem-solving.
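
As a quick check, the reported average can be reproduced from the six benchmark scores listed above. The sketch below assumes the leaderboard average is the unweighted mean of those scores; the variable names are illustrative.

```python
# Benchmark scores as listed for HermesBagel-34B-v0.1.
scores = {
    "ARC (25-shot)": 70.56,
    "HellaSwag (10-shot)": 85.74,
    "MMLU (5-shot)": 77.38,
    "TruthfulQA (0-shot)": 67.34,
    "Winogrande (5-shot)": 84.61,
    "GSM8K (5-shot)": 65.28,
}

# Unweighted mean across all six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 75.15
```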

Good For

General-purpose text generation: Creating coherent and contextually relevant text for a wide range of applications.
Reasoning tasks: Its strong performance on ARC and MMLU suggests suitability for tasks requiring logical inference and knowledge application.
Instruction following: As a merged model incorporating instruction-tuned components, it is likely to perform well in responding to diverse prompts.
Research and experimentation: Provides a solid base for further fine-tuning or exploring merged model architectures.
