Cartinoe5930/TIES-Merging

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Jan 18, 2024 · License: apache-2.0 · Architecture: Transformer

TIES-Merging by Cartinoe5930 is a 7 billion parameter language model created by merging three Mistral-7B-Instruct-v0.2 based models: Open-Orca/Mistral-7B-OpenOrca, openchat/openchat-3.5-0106, and WizardLM/WizardMath-7B-V1.1. The merge uses the TIES-merging method to combine their strengths, and the resulting model retains the 4096-token context length of its Mistral base. It is designed to offer balanced performance across general instruction following, conversational AI, and mathematical reasoning tasks.


TIES-Merging: A Composite 7B Language Model

TIES-Merging, developed by Cartinoe5930, is a 7 billion parameter language model built upon the Mistral-7B-Instruct-v0.2 architecture. It was created using the TIES-merging method, combining the strengths of three distinct models:

Key Merged Components

  • Open-Orca/Mistral-7B-OpenOrca: Contributes to general instruction following and conversational capabilities.
  • openchat/openchat-3.5-0106: Enhances dialogue and chat-based interactions.
  • WizardLM/WizardMath-7B-V1.1: Provides specialized capabilities in mathematical problem-solving and reasoning.

This strategic merge aims to create a versatile model that performs well across a range of tasks, from general question answering to more specific mathematical challenges. The model maintains a context length of 4096 tokens and is configured with float16 precision for efficient deployment.
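At a high level, TIES-merging works on the "task vectors" (the parameter deltas between each fine-tuned model and the shared base): it trims each delta to its largest-magnitude entries, elects a per-parameter sign by majority mass, and then averages only the deltas that agree with that sign. The sketch below illustrates this three-step procedure on a single flat parameter array; it is a simplified, illustrative version of the algorithm, not the exact code used to build this model, and the `density` and `weight` parameters are assumptions mirroring common TIES hyperparameters.

```python
import numpy as np

def ties_merge(base, finetuned, density=0.2, weight=1.0):
    """Illustrative TIES merge for one flat parameter array.

    base      -- parameters of the shared base model
    finetuned -- list of parameter arrays from the fine-tuned models
    density   -- fraction of each task vector kept after trimming
    """
    deltas = []
    for ft in finetuned:
        delta = ft - base
        # 1. Trim: keep only the top-`density` fraction of entries
        #    of the task vector by magnitude, zeroing the rest.
        k = max(1, int(density * delta.size))
        threshold = np.sort(np.abs(delta))[-k]
        deltas.append(np.where(np.abs(delta) >= threshold, delta, 0.0))

    stacked = np.stack(deltas)
    # 2. Elect sign: per parameter, pick the sign with the larger
    #    total magnitude across models (ties broken toward positive).
    sign = np.sign(stacked.sum(axis=0))
    sign = np.where(sign == 0, 1.0, sign)

    # 3. Disjoint merge: average only the trimmed deltas whose sign
    #    matches the elected sign, ignoring zeroed entries.
    agree = (np.sign(stacked) == sign) & (stacked != 0)
    total = (stacked * agree).sum(axis=0)
    count = np.maximum(agree.sum(axis=0), 1)
    return base + weight * total / count
```

The sign election is what distinguishes TIES from plain parameter averaging: where the three source models pull a weight in opposite directions, only the dominant direction contributes, which reduces destructive interference between, say, the math-specialized and chat-specialized deltas.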

Usage

Developers can easily integrate TIES-Merging into their projects using the Hugging Face transformers library. The provided Python snippet demonstrates how to load the model and tokenizer, apply chat templates, and generate text, making it straightforward to get started with instruction-based prompts.
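A minimal sketch of that workflow is below, assuming the model is published under the repository id `Cartinoe5930/TIES-Merging` and ships a chat template; the prompt text and generation settings are illustrative placeholders.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Cartinoe5930/TIES-Merging"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # place layers on available GPU(s)/CPU
)

# Format an instruction-style prompt with the model's chat template.
messages = [{"role": "user", "content": "Solve 12 * 7 and explain each step."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the merge is based on Mistral-7B-Instruct-v0.2, prompts should generally follow the chat format that `apply_chat_template` produces rather than raw free-form text.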