grimjim/cuckoo-starling-32k-7B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4K · Published: May 16, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer

grimjim/cuckoo-starling-32k-7B is a 7 billion parameter merged language model created by grimjim with the SLERP method, combining Mistral-Starling-merge-trial1-7B and kukulemon-7B. It features an adjusted RoPE theta aimed at narrative coherence and supports a 32K token context window. The model targets general language understanding and generation, posting solid scores across reasoning and common-sense benchmarks.


grimjim/cuckoo-starling-32k-7B Overview

This 7 billion parameter model, developed by grimjim, was created by merging two base models with spherical linear interpolation (SLERP): grimjim/Mistral-Starling-merge-trial1-7B and grimjim/kukulemon-7B. A key feature is its manually lowered RoPE theta (down to 100K), a trade-off intended to retain long-context performance while preserving narrative coherence across its 32K token context window.
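To make the merge method concrete, here is a minimal sketch of the per-tensor SLERP operation such a merge applies: corresponding weight tensors from the two parent models are interpolated along the arc between them rather than along a straight line. This illustrates the technique only and is not grimjim's actual merge configuration; the `slerp` helper and the 0.5 interpolation factor are assumptions for demonstration.

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors (illustrative)."""
    a, b = v0.flatten().float(), v1.flatten().float()
    a_n = a / (a.norm() + eps)
    b_n = b / (b.norm() + eps)
    dot = torch.clamp(torch.dot(a_n, b_n), -1.0, 1.0)
    theta = torch.acos(dot)
    if theta.abs() < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        merged = (1 - t) * a + t * b
    else:
        sin_theta = torch.sin(theta)
        merged = (torch.sin((1 - t) * theta) / sin_theta) * a \
               + (torch.sin(t * theta) / sin_theta) * b
    return merged.reshape(v0.shape).to(v0.dtype)

# Applied across two state dicts sharing the same architecture:
# merged_state = {k: slerp(0.5, sd_a[k], sd_b[k]) for k in sd_a}
```

In practice such merges are typically driven by a tool like mergekit, which can also vary the interpolation factor per layer rather than using a single global value.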

Key Capabilities & Performance

The model has been lightly tested with ChatML prompts and natively supports the Alpaca format (example templates follow the benchmark list below). It posts solid scores across standard benchmarks on the Open LLM Leaderboard:

  • Average Score: 69.93
  • AI2 Reasoning Challenge (25-shot): 66.81
  • HellaSwag (10-shot): 85.97
  • MMLU (5-shot): 64.88
  • TruthfulQA (0-shot): 59.03
  • Winogrande (5-shot): 80.11
  • GSM8k (5-shot): 62.77
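For reference, the snippet below sketches the two prompting formats mentioned above. The instruction text and system message are placeholders; only the structural markers (the `<|im_start|>`/`<|im_end|>` tokens for ChatML and the `### Instruction:`/`### Response:` headers for Alpaca) follow the respective conventions.

```python
# ChatML-style prompt (lightly tested with this model; system message is a placeholder)
chatml_prompt = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Summarize the following chapter in three sentences.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

# Alpaca-style prompt (natively supported)
alpaca_prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "Summarize the following chapter in three sentences.\n\n"
    "### Response:\n"
)
```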

When to Use This Model

This model is suitable for applications requiring:

  • General-purpose text generation and understanding with a focus on maintaining narrative coherence over extended contexts.
  • Tasks benefiting from a 32K token context window, such as summarizing long documents or sustaining extended conversations (see the loading sketch after this list).
  • Exploration of merged model capabilities, particularly those derived from Mistral-based architectures.
  • Use cases compatible with ChatML or Alpaca prompting formats.
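As a starting point, the sketch below loads the model with Hugging Face Transformers and generates from an Alpaca-style prompt. The dtype and sampling parameters are illustrative assumptions, not settings recommended by the model card; the `rope_theta` print simply reads the value stored in the model config.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "grimjim/cuckoo-starling-32k-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # illustrative; choose a dtype your hardware supports
    device_map="auto",
)

# The adjusted RoPE theta is recorded in the model config.
print("rope_theta:", model.config.rope_theta)

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nContinue the story below in a consistent narrative voice.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```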