brucethemoose/Yi-34B-200K-RPMerge

Text Generation | Concurrency Cost: 2 | Model Size: 34B | Quant: FP8 | Ctx Length: 32k | Published: Feb 7, 2024 | License: yi-license | Architecture: Transformer

brucethemoose/Yi-34B-200K-RPMerge is a 34 billion parameter language model based on the Yi architecture, specifically merged for enhanced storytelling and long-context instruction following. It integrates several Yi 34B models, focusing on a 40K+ context window, to excel in creative writing, multi-character narratives, and general instruction-following tasks. This model is optimized for applications requiring extended narrative generation and nuanced conversational roleplay.


Model Overview

RPMerge is a 34 billion parameter model developed by brucethemoose, created by merging several Yi 34B models. The goal of the merge was a model optimized for storytelling, long-context instruction following, and roleplaying that still adheres to the Orca-Vicuna instruction format. It targets creative writing and multi-character narratives over a context window of 40,000 tokens or more.

Key Capabilities

  • Enhanced Storytelling: Specifically designed for generating long, coherent narratives and novel continuations.
  • Instruction Following: Incorporates models with strong general instruction-following performance, adhering primarily to the Orca-Vicuna prompt template.
  • Roleplaying: Includes components trained on roleplaying data, balanced to enhance roleplay without over-emphasizing it.
  • Long Context: Capable of handling contexts of 40K-90K tokens, making it suitable for extended interactions and document analysis.
  • Refusal Mitigation: Gently fine-tuned to discourage refusals in responses.
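
As an illustration of the Orca-Vicuna prompt template mentioned above, here is a minimal sketch of a prompt builder. It assumes the commonly used `SYSTEM:` / `USER:` / `ASSISTANT:` layout with newline separators. The function name `build_orca_vicuna_prompt` and its parameters are hypothetical, and the exact whitespace and end-of-turn tokens should be checked against the upstream model card before use.

```python
def build_orca_vicuna_prompt(user_message, system_message="", history=None):
    """Assemble a prompt in the (assumed) Orca-Vicuna layout.

    history is an optional list of (user_turn, assistant_turn) pairs
    from earlier in the conversation.
    """
    parts = []
    if system_message:
        parts.append(f"SYSTEM: {system_message}")
    for user_turn, assistant_turn in history or []:
        parts.append(f"USER: {user_turn}")
        parts.append(f"ASSISTANT: {assistant_turn}")
    parts.append(f"USER: {user_message}")
    # Leave the final assistant tag open so the model writes the reply.
    parts.append("ASSISTANT:")
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_orca_vicuna_prompt(
        "Continue the story from the last scene.",
        system_message="You are a collaborative fiction writer.",
    ))
```

For roleplay or story continuation, the system line is a natural place for character descriptions and scene context, since the model is intended to follow long instructions of that kind.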

Good For

  • Creative Writing: Generating stories, novel continuations, and complex narratives.
  • Roleplaying Scenarios: Engaging in multi-character roleplay and interactive fiction.
  • Long-form Content Generation: Tasks requiring analysis or generation over extensive text inputs.
  • General Conversational AI: Providing assistant-style responses with a focus on narrative and instruction adherence.

Technical Details

The model was merged using the DARE TIES method. Its components include DrNicefellow/ChatAllInOne-Yi-34B-200K-V1, migtissera/Tess-34B-v1.5b, cgato/Thespis-34b-v0.7, and the Doctor-Shotgun/limarpv3-yi-llama-34b-lora adapter. Specific sampling settings are recommended for best results, given the characteristics of Yi's tokenizer: quadratic sampling (smoothing factor), or a lower temperature combined with MinP. High-context inference is most efficient on backends that support flash attention and an 8-bit KV cache, such as exllamav2.
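
To make the "lower temperature with MinP" recommendation concrete, here is a minimal pure-Python sketch of MinP filtering. MinP keeps only tokens whose probability is at least `min_p` times the probability of the most likely token, then samples from what remains. The function names, the default values (`temperature=0.8`, `min_p=0.1`), and the choice to apply temperature before filtering are illustrative assumptions, not the author's recommended settings. Real backends differ in sampler order.

```python
import math
import random


def min_p_filter(logits, temperature=0.8, min_p=0.1):
    """Return (token_indices, probabilities) that survive MinP filtering.

    Assumption: temperature is applied before the MinP cutoff.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    # Subtract the peak before exponentiating for numerical stability.
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Keep tokens at least min_p times as likely as the top token.
    threshold = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= threshold]
    return [i for i, _ in kept], [p for _, p in kept]


def min_p_sample(logits, temperature=0.8, min_p=0.1, rng=None):
    """Sample one token index from the MinP-filtered distribution."""
    rng = rng or random.Random()
    indices, probs = min_p_filter(logits, temperature, min_p)
    # random.choices renormalizes the weights internally.
    return rng.choices(indices, weights=probs, k=1)[0]


if __name__ == "__main__":
    toy_logits = [5.0, 4.9, 1.0, -10.0]  # hypothetical 4-token vocabulary
    print(min_p_filter(toy_logits))
    print(min_p_sample(toy_logits, rng=random.Random(0)))
```

Because the cutoff scales with the top token's probability, MinP prunes aggressively when the model is confident and leaves more candidates when the distribution is flat, which is why it pairs well with a moderate temperature.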