brucethemoose/Yi-34B-200K-DARE-megamerge-v8
brucethemoose/Yi-34B-200K-DARE-megamerge-v8 is a 34-billion-parameter language model based on the Yi architecture, built for long-context use up to 200,000 tokens. It is a DARE TIES merge of multiple Yi 34B 200K models that aims to excel at 32K+ context without additional fine-tuning. This makes it suitable for applications that must keep track of long conversations or long documents.
Overview
This model, brucethemoose/Yi-34B-200K-DARE-megamerge-v8, is a 34-billion-parameter language model built on the Yi architecture. It uses the DARE TIES merge method to combine numerous Yi 34B 200K models, with the primary goal of strong performance at 32K+ context lengths without further fine-tuning. During the merge, the per-layer merge weights were biased toward Vicuna-format models in the initial layers to emphasize the Orca-Vicuna prompt template.
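To make the merge method more concrete, here is a minimal sketch of the DARE TIES idea on flat lists of numbers. It is a toy illustration under stated assumptions, not the merge tooling or configuration actually used for this model: the function names, the `density` parameter, and the per-model `weights` are all hypothetical, and real merges operate on full tensors. DARE randomly drops part of each model's delta from the base and rescales the survivors. The TIES step then elects a sign per parameter and keeps only the contributions that agree with it.

```python
import random


def dare_sparsify(delta, density, rng):
    """DARE step: keep each delta entry with probability `density`,
    rescaling kept entries by 1/density so the expected value is unchanged."""
    return [d / density if rng.random() < density else 0.0 for d in delta]


def ties_sign_merge(deltas, weights):
    """TIES-style step: per parameter, elect the sign of the weighted sum,
    then add up only the weighted deltas that agree with that sign."""
    merged = []
    for column in zip(*deltas):
        weighted = [w * d for w, d in zip(weights, column)]
        sign = 1.0 if sum(weighted) >= 0 else -1.0
        merged.append(sum(v for v in weighted if v * sign > 0))
    return merged


def dare_ties_merge(base, finetuned, weights, density, seed=0):
    """Toy DARE TIES merge of several fine-tuned parameter lists onto a base.
    All names and parameters here are illustrative assumptions."""
    rng = random.Random(seed)
    deltas = [
        dare_sparsify([f - b for f, b in zip(model, base)], density, rng)
        for model in finetuned
    ]
    merged_delta = ties_sign_merge(deltas, weights)
    return [b + d for b, d in zip(base, merged_delta)]
```

Called as `dare_ties_merge(base, [model_a, model_b], weights=[0.5, 0.5], density=0.5)`, the sketch returns a merged parameter list. Biasing the early layers toward particular models, as described above, would correspond to using different `weights` for different layers.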
Key Capabilities
- Extended Context Handling: Designed to excel with context lengths exceeding 32,000 tokens, up to its native 200,000 token capacity.
- Merged Intelligence: Combines the strengths of over a dozen different Yi 34B 200K models, including those focused on creative writing, instruction following, and general chat.
- Sampling Tuned for Yi: Works best with specific inference parameters (a low temperature, MinP sampling, and a repetition penalty), which rein in Yi's large vocabulary and prevent 'hot' outputs.
- Efficient Deployment: Runs at high context (40K-90K tokens) on 24GB GPUs with backends such as exllamav2; aggressive quantization makes 16GB GPUs usable as well.
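The sampling settings mentioned in the list above (low temperature, MinP, repetition penalty) can be sketched in plain Python as follows. This is an illustration of the general techniques, not the code of any particular inference backend. The function names and the default values (`temperature=0.8`, `min_p=0.1`) are assumptions chosen for the example rather than settings taken from the model card. Backends also differ in the order in which they apply these steps.

```python
import math
import random


def apply_repetition_penalty(logits, seen_ids, penalty):
    """Make already-generated tokens less likely: divide positive logits
    by `penalty` and multiply negative ones (a common convention)."""
    return [
        (x / penalty if x > 0 else x * penalty) if i in seen_ids else x
        for i, x in enumerate(logits)
    ]


def sample_min_p(logits, temperature=0.8, min_p=0.1, rng=None):
    """Sample a token id with temperature scaling and MinP filtering.
    MinP keeps only tokens whose probability is at least `min_p` times
    the probability of the most likely token."""
    rng = rng or random.Random()
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # shift by max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    threshold = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= threshold]
    indices, weights = zip(*kept)
    # random.choices normalizes the weights of the surviving tokens
    return rng.choices(indices, weights=weights, k=1)[0]
```

With a large vocabulary, the MinP threshold scales with the top token's probability, so the long tail of unlikely tokens is cut whenever the model is confident and more candidates survive when it is not.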
Good For
- Applications requiring deep understanding and generation over very long texts.
- Scenarios where maintaining coherence and context across extensive conversations or documents is crucial.
- Users seeking a robust 34B model with strong long-context capabilities without needing further fine-tuning.