Name: brucethemoose/Yi-34B-200K-DARE-merge-v5 API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: brucethemoose

Overview

brucethemoose/Yi-34B-200K-DARE-merge-v5 is a 34 billion parameter large language model built on the Yi architecture, with a 200,000 token context window. It is a merge of several Yi-based finetunes: Nous-Capybara-34B, Tess-M-v1.4, Airoboros-3_1-yi-34b-200k, PlatYi-34B-200K-Q, Pallas-0.4, Yi-34B-200K-AEZAKMI-v2, and a small contribution from SUS-Chat-34B. The merge was performed with an experimental "dare ties" implementation in mergekit, a technique explored in the "Language Models are Super Mario" paper, which aims to absorb abilities from homologous models.

Key Capabilities & Features

Extended Context Window: Supports up to 200,000 tokens, making it suitable for processing and generating very long texts.
Merged Intelligence: Combines the strengths of several specialized Yi finetunes, potentially enhancing its general reasoning and conversational abilities.
Sampler Settings for Yi: Because Yi models tend to run "hot," the recommendation is a lower temperature together with a MinP of 0.02-0.1 and a slight repetition penalty.
Hardware Efficiency: Can run 45K-75K context on 24GB GPUs using exllamav2 and UIs like exui.
Benchmark Performance: Achieves an average score of 71.98 on the Open LLM Leaderboard, with notable scores in MMLU (77.22) and HellaSwag (85.54).
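
To make the MinP recommendation above concrete, here is a minimal sketch of min-p filtering in plain Python. It assumes the common definition (keep only tokens whose probability is at least min_p times the probability of the most likely token) and assumes temperature is applied before the filter; real backends may order these steps differently. The function name and default values are illustrative and are not taken from any particular library.

```python
import math

def min_p_filter(logits, min_p=0.05, temperature=1.0):
    """Return renormalized token probabilities after min-p filtering.

    Hypothetical helper for illustration; not a backend API.
    """
    # Apply temperature, then a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Drop every token whose probability falls below min_p * (top probability).
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize the surviving tokens so they sum to 1 again.
    kept_total = sum(kept)
    return [p / kept_total for p in kept]

# Example: the very unlikely third token is culled before sampling.
filtered = min_p_filter([2.0, 1.0, -5.0], min_p=0.1)
```

The cutoff scales with the model's confidence, so a sharply peaked distribution discards more of the tail than a flat one does.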

Usage Notes

The model uses an Orca-Vicuna prompt template (SYSTEM: {system_message}\nUSER: {prompt}\nASSISTANT:).
Users might need to add </s> as an additional stopping condition, as the model can sometimes spell out the stop token.
For full-context backends like transformers, max_position_embeddings in config.json must be set to a value lower than 200,000; otherwise loading the model is likely to cause Out-Of-Memory errors.
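
The notes above can be sketched in Python using only the standard library. The helper names (build_prompt, trim_at_stop, lower_max_positions) are hypothetical, and the value 32768 in the usage comment is an arbitrary illustration, not a recommended setting. The sketch assumes the \n sequences in the template stand for real newlines.

```python
import json
from pathlib import Path

def build_prompt(system_message, prompt):
    """Fill in the Orca-Vicuna template quoted in the usage notes."""
    return f"SYSTEM: {system_message}\nUSER: {prompt}\nASSISTANT:"

def trim_at_stop(text, stop_strings=("</s>",)):
    """Cut generated text at the first stop string, if any appears.

    Covers the case where the model spells out </s> as literal text.
    """
    cut = len(text)
    for stop in stop_strings:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]

def lower_max_positions(config_path, new_limit):
    """Rewrite max_position_embeddings in a local config.json."""
    path = Path(config_path)
    config = json.loads(path.read_text())
    config["max_position_embeddings"] = new_limit
    path.write_text(json.dumps(config, indent=2))
    return config

# Usage (the path and limit are placeholders, chosen for illustration):
# lower_max_positions("path/to/model/config.json", 32768)
text = build_prompt("You are a helpful assistant.", "Summarize this file.")
reply = trim_at_stop("Here is the summary.</s>USER:")
```

Backends that accept a list of stop strings can take "</s>" directly, in which case the trimming helper is unnecessary.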
