Name: brucethemoose/CapyTessBorosYi-34B-200K-DARE-Ties API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: brucethemoose

Model Overview

brucethemoose/CapyTessBorosYi-34B-200K-DARE-Ties is a 34-billion-parameter language model built on the Yi architecture; this listing gives a context length of 32768 tokens. It is a merge of three finetunes: Nous-Capybara-34B, migtissera/Tess-M-v1.3, and bhenrym14/airoboros-3_1-yi-34b-200k. The merge used an experimental "dare ties" method, which is reported to give better perplexity and high-context results than traditional TIES merges.
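To make the merging idea concrete, here is a minimal sketch of the DARE drop-and-rescale step described in the "Super Mario" paper, applied to plain Python lists rather than real model weights. It is not the author's MergeKit configuration: the function names, the drop rate, and the merge weights are hypothetical, and the TIES sign-election step is deliberately left out to keep the example short.

```python
import random


def dare_delta(base, finetuned, drop_rate, rng):
    """Return the finetune's delta after DARE drop-and-rescale.

    Each delta (finetuned - base) is dropped with probability drop_rate;
    surviving deltas are rescaled by 1 / (1 - drop_rate) so the expected
    value of the delta is unchanged.
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    scale = 1.0 / (1.0 - drop_rate)
    result = []
    for b, f in zip(base, finetuned):
        delta = f - b
        keep = rng.random() >= drop_rate
        result.append(delta * scale if keep else 0.0)
    return result


def merge(base, finetunes, weights, drop_rate, seed=0):
    """Add the weighted, DARE-processed deltas of each finetune to the base.

    Simplification: a real "dare ties" merge also resolves sign conflicts
    between finetunes (the TIES step); this sketch just sums the deltas.
    """
    rng = random.Random(seed)
    merged = list(base)
    for finetuned, weight in zip(finetunes, weights):
        for i, d in enumerate(dare_delta(base, finetuned, drop_rate, rng)):
            merged[i] += weight * d
    return merged


if __name__ == "__main__":
    base = [1.0, 2.0, 3.0]
    finetune_a = [2.0, 2.0, 5.0]  # toy values, not real weights
    finetune_b = [1.0, 4.0, 3.0]
    print(merge(base, [finetune_a, finetune_b], [0.5, 0.5], drop_rate=0.5))
```

With a drop rate of 0 the function reduces to an ordinary weighted sum of deltas, which is a convenient sanity check when experimenting.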

Key Capabilities & Features

Advanced Merging Technique: Utilizes an experimental "dare ties" implementation, as detailed in the paper "Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch" (GitHub, MergeKit Dare branch).
High Context Window: Built on a 200K-context Yi base model (this listing specifies a context length of 32768 tokens), making it suitable for processing and generating long texts.
Optimized Performance: Demonstrates improved perplexity and high-context performance over previous merge configurations.
Orca-Vicuna Prompt Format: Designed to work with the Orca-Vicuna prompt template for instruction following.
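As an illustration of the prompt format mentioned above, the following sketch assembles an Orca-Vicuna style prompt. The exact layout (SYSTEM / USER / ASSISTANT lines separated by newlines) is an assumption based on the common form of this template; the full model card is the place to confirm it. The function name is hypothetical.

```python
def build_orca_vicuna_prompt(user_message, system_message=""):
    """Assemble a prompt in the assumed Orca-Vicuna layout.

    The SYSTEM line is emitted only when a system message is given.
    The prompt ends with "ASSISTANT:" so the model continues from there.
    """
    parts = []
    if system_message:
        parts.append(f"SYSTEM: {system_message}")
    parts.append(f"USER: {user_message}")
    parts.append("ASSISTANT:")
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_orca_vicuna_prompt(
        "Summarize the first chapter.",
        system_message="You are a helpful writing assistant.",
    ))
```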

Usage Considerations

Yi-Specific Behavior: Users may need to disable the BOS token or use lower temperatures with MinP to manage Yi's tendency to run "hot."
Stop Token Handling: The model might spell out </s> as a stop token, requiring it to be added as an explicit stopping condition.
Hardware Recommendations: 34B models can run at 45K-75K context on 24GB GPUs using exllamav2, and exl2 quantizations tailored to story writing are available.
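Two of the considerations above, MinP sampling and the literal </s> stop string, can be sketched in a few lines of client-side Python. This is an illustrative sketch, not the behaviour of any particular inference backend: the function names and the example threshold are assumptions, and real backends apply MinP to logits inside the sampler rather than to a dictionary of probabilities.

```python
def truncate_at_stop(text, stop_strings=("</s>",)):
    """Cut generated text at the earliest occurrence of any stop string.

    Useful when the model spells out "</s>" as plain text instead of
    emitting the end-of-sequence token.
    """
    cut = len(text)
    for stop in stop_strings:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


def min_p_filter(probs, min_p):
    """Apply MinP filtering to a token -> probability mapping.

    Tokens with probability below min_p * (highest probability) are
    discarded, and the remaining probabilities are renormalized.
    """
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}


if __name__ == "__main__":
    print(truncate_at_stop("The end.</s> stray text"))
    print(min_p_filter({"a": 0.6, "b": 0.3, "c": 0.1}, min_p=0.25))
```

Most inference servers expose both as request parameters (a stop list and a MinP value), so client-side handling like this is mainly a fallback.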
