beberik/Nyxene-v3-11B: A Merged 10.7B Parameter Model
Nyxene-v3-11B is a 10.7 billion parameter language model created by beberik with a multi-stage mergekit pipeline. It is an evolution of Nyxene-v1-11B, incorporating new components and refined merging strategies.
Key Architectural Details
The model's architecture is a hierarchical merge of four distinct base models, combined pairwise into two intermediate merges (a layer-stacking sketch follows the list):
- `go-bruins-loyal-piano-11B`: a `passthrough` merge combining specific layer ranges from `rwitz/go-bruins-v2` (layers 0-24) and `chargoddard/loyal-piano-m7-cdpo` (layers 8-32).
- `neural-marcoroni-11B`: another `passthrough` merge, integrating layer ranges from `AIDC-ai-business/Marcoroni-7B-v3` (layers 0-24) and `Intel/neural-chat-7b-v3-3-Slerp` (layers 8-32).
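Conceptually, a `passthrough` merge copies whole decoder layers from each parent unchanged and stacks them into a deeper network: two 24-layer slices of 32-layer Mistral-style 7B models yield the 48-layer, roughly 10.7B parameter architecture. A toy Python sketch of the idea (not mergekit's actual implementation; the layer labels are stand-ins):

```python
# Toy sketch of a "passthrough" layer-range merge: the merged model simply
# stacks decoder layers copied from each source model, here layers 0-24
# from model A followed by layers 8-32 from model B.

def passthrough_merge(layers_a, layers_b, range_a=(0, 24), range_b=(8, 32)):
    """Stack the selected layer slices back to back."""
    return layers_a[range_a[0]:range_a[1]] + layers_b[range_b[0]:range_b[1]]

# Stand-in "layers": labeled strings instead of real weight tensors.
model_a = [f"go-bruins-v2/layer_{i}" for i in range(32)]
model_b = [f"loyal-piano-m7-cdpo/layer_{i}" for i in range(32)]

merged = passthrough_merge(model_a, model_b)
print(len(merged))  # 48 layers -> roughly 10.7B parameters at Mistral-7B width
```

Each slice keeps its original weights; only the depth of the network changes.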
These two intermediate merges are then combined using slerp (spherical linear interpolation) to form Nyxene-11B. This final merge applies specific weighting parameters (`t` values) to different tensor types (e.g., `lm_head`, `embed_tokens`, `self_attn`, `mlp`, `layernorm`) to fine-tune the model's characteristics. The model uses the ChatML prompt template. For intuition, a minimal sketch of the interpolation follows.
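Slerp interpolates along the arc between two weight vectors rather than the straight line used by plain averaging, which better preserves the magnitude structure of the tensors. The sketch below assumes the standard slerp formula and a hypothetical `T_BY_TENSOR` map; mergekit's real implementation differs in details such as normalization and edge-case handling:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate t values follow the arc
    between the two (flattened) vectors instead of a straight line.
    """
    a, b = v0.ravel(), v1.ravel()
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    if np.sin(theta) < eps:  # nearly parallel vectors: fall back to lerp
        return (1 - t) * v0 + t * v1
    s0 = np.sin((1 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return s0 * v0 + s1 * v1

# Hypothetical per-tensor-type weights, mimicking how a slerp merge config
# can assign different t values to attention, MLP, and output tensors.
T_BY_TENSOR = {"self_attn": 0.3, "mlp": 0.7, "lm_head": 0.5}

w0 = np.random.randn(4, 4)  # stand-in tensor from the first parent
w1 = np.random.randn(4, 4)  # stand-in tensor from the second parent
blended = slerp(T_BY_TENSOR["mlp"], w0, w1)
```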
Performance Highlights
Evaluated on the Open LLM Leaderboard, Nyxene-v3-11B achieves an average score of 70.72 across the six benchmarks below:
- AI2 Reasoning Challenge (25-shot): 69.62
- HellaSwag (10-shot): 85.33
- MMLU (5-shot): 64.75
- TruthfulQA (0-shot): 60.91
- Winogrande (5-shot): 80.19
- GSM8k (5-shot): 63.53
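As a quick sanity check, the reported leaderboard average is the plain mean of these six scores:

```python
scores = {
    "ARC (25-shot)": 69.62,
    "HellaSwag (10-shot)": 85.33,
    "MMLU (5-shot)": 64.75,
    "TruthfulQA (0-shot)": 60.91,
    "Winogrande (5-shot)": 80.19,
    "GSM8k (5-shot)": 63.53,
}
print(round(sum(scores.values()) / len(scores), 2))  # 70.72
```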
Use Cases
This model is suitable for general-purpose language generation and understanding tasks, particularly where a balance of reasoning, common sense, and factual recall is beneficial, as reflected in its benchmark profile.
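As a starting point, here is a standard transformers loading sketch using the ChatML format noted above; the system/user text and generation settings are illustrative placeholders, not recommendations from the model author:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "beberik/Nyxene-v3-11B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# ChatML prompt format, as stated in the model card.
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nExplain slerp in one sentence.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
))
```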