win10/karcher-test-32b
win10/karcher-test-32b is a 32.8-billion-parameter language model created by win10 by merging four pre-trained models with the Karcher Mean method: open-thoughts/OpenThinker2-32B, huihui-ai/QwQ-32B-abliterated, Snowflake/Qwen-2.5-coder-Arctic-ExCoT-32B, and Qwen/Qwen2.5-Coder-32B-Instruct. It draws on the strengths of its constituent models, particularly those focused on coding, to offer a versatile foundation for a range of generative AI tasks with a 32,768-token context length.
Overview
win10/karcher-test-32b is a 32.8-billion-parameter language model developed by win10 through model merging. It uses the Karcher Mean merge method, a technique described in the paper "Functionality-Oriented LLM Merging on the Fisher-Rao Manifold," to combine the weights of several high-performing base models. The merge aims to synthesize the strengths of its components into a single, more robust model.
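To make the merge method concrete: the Karcher (Fréchet) mean is the point minimizing the sum of squared geodesic distances to a set of points on a manifold, computed by a fixed-point iteration of log and exp maps. The sketch below illustrates the idea on the unit sphere with small toy vectors standing in for weight tensors; it is an illustrative simplification, not the actual merge implementation used for this model.

```python
import numpy as np

def sphere_log(mu, x):
    # Log map at mu: tangent vector whose length is the geodesic distance to x.
    cos_t = float(np.clip(np.dot(mu, x), -1.0, 1.0))
    theta = np.arccos(cos_t)
    if theta < 1e-12:
        return np.zeros_like(mu)
    proj = x - cos_t * mu  # component of x orthogonal to mu
    return theta * proj / np.linalg.norm(proj)

def sphere_exp(mu, v):
    # Exp map at mu: move along the geodesic in tangent direction v.
    n = np.linalg.norm(v)
    if n < 1e-12:
        return mu
    return np.cos(n) * mu + np.sin(n) * v / n

def karcher_mean(points, iters=100, tol=1e-10):
    # Fixed-point iteration: average the log-mapped points, step via exp.
    mu = points.mean(axis=0)
    mu /= np.linalg.norm(mu)
    for _ in range(iters):
        v = np.mean([sphere_log(mu, p) for p in points], axis=0)
        if np.linalg.norm(v) < tol:
            break
        mu = sphere_exp(mu, v)
    return mu

# Four toy "parameter vectors" clustered on the unit sphere,
# standing in for one weight tensor from each source model.
rng = np.random.default_rng(0)
base = np.zeros(8)
base[0] = 1.0
pts = base + 0.1 * rng.normal(size=(4, 8))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
merged = karcher_mean(pts)
print(np.linalg.norm(merged))  # the mean stays on the sphere
```

Unlike a simple arithmetic average, which can pull the result off the manifold the weights live on, this iteration keeps the merged point on the sphere and at the geodesic "center of mass" of the inputs.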
Key Capabilities
- Merged Architecture: Combines four distinct 32B parameter models: open-thoughts/OpenThinker2-32B, huihui-ai/QwQ-32B-abliterated, Snowflake/Qwen-2.5-coder-Arctic-ExCoT-32B, and Qwen/Qwen2.5-Coder-32B-Instruct.
- Karcher Mean Method: Averages parameters along the geometry of the weight space (a Riemannian barycenter) rather than taking a simple arithmetic mean, with the aim of preserving and combining the functional behavior of the constituent models.
- Context Length: Supports a substantial context window of 32768 tokens, enabling processing of longer inputs and generating more coherent, extended outputs.
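Merges of this kind are typically specified declaratively with a tool such as mergekit. The configuration below is a hypothetical sketch of what this merge could look like; the exact schema, and whether a `karcher` merge method is available, depend on the mergekit version, so treat the field names as assumptions rather than the author's actual recipe.

```yaml
# Hypothetical mergekit-style recipe; schema and method name are assumptions.
merge_method: karcher
models:
  - model: open-thoughts/OpenThinker2-32B
  - model: huihui-ai/QwQ-32B-abliterated
  - model: Snowflake/Qwen-2.5-coder-Arctic-ExCoT-32B
  - model: Qwen/Qwen2.5-Coder-32B-Instruct
dtype: bfloat16
```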
Good For
- General-purpose generation: Benefits from the diverse capabilities of its merged components.
- Applications requiring extended context: Suitable for tasks that involve processing or generating long texts, given its 32K context window.
- Exploration of merged model performance: Ideal for researchers and developers interested in the practical outcomes of advanced model merging techniques like the Karcher Mean.