Weyaxi/MetaMath-NeuralHermes-2.5-Mistral-7B-Ties

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Dec 5, 2023 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

Weyaxi/MetaMath-NeuralHermes-2.5-Mistral-7B-Ties is a 7 billion parameter language model created by Weyaxi, formed by merging MetaMath-Mistral-7B and NeuralHermes-2.5-Mistral-7B using the TIES-merging technique. This model combines the mathematical reasoning capabilities of MetaMath with the general conversational and instruction-following strengths of NeuralHermes. It is designed for tasks requiring both robust mathematical problem-solving and broad language understanding within a 4096-token context window.

Model Overview

Weyaxi/MetaMath-NeuralHermes-2.5-Mistral-7B-Ties is a 7 billion parameter language model developed by Weyaxi. It is the product of merging two Mistral-7B-based models: meta-math/MetaMath-Mistral-7B and mlabonne/NeuralHermes-2.5-Mistral-7B. The merge was performed with the TIES-merging technique, with the aim of combining the strengths of both.

Key Characteristics

  • Merged Architecture: Combines the specialized mathematical reasoning of MetaMath-Mistral-7B with the general-purpose instruction-following and conversational abilities of NeuralHermes-2.5-Mistral-7B.
  • Parameter Count: A 7 billion parameter model, offering a balance between performance and computational efficiency.
  • Context Length: Supports a context window of 4096 tokens.
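
To make the parameter count more concrete, the following minimal sketch estimates the memory needed for the weights alone at a few precisions. It assumes a nominal 7e9 parameters (the exact count is not stated here) and ignores activations and the key-value cache, so real usage will be higher. The names weight_memory_gb and NOMINAL_PARAMS are hypothetical, chosen for illustration.

```python
# Rough weight-only memory estimate; activations and KV cache are NOT included.
BYTES_PER_PARAM = {"fp8": 1, "fp16": 2, "fp32": 4}

# Assumption: nominal "7B" taken as exactly 7e9 parameters.
NOMINAL_PARAMS = 7e9


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return the size of the weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for precision in ("fp8", "fp16", "fp32"):
    print(f"{precision}: {weight_memory_gb(NOMINAL_PARAMS, precision):.1f} GB")
```

Under these assumptions the FP8 weights come to roughly 7 GB, and an FP16 copy would need about twice that.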

Merging Details

The TIES merge assigned each source model a weight and a density. The weight scales that model's contribution to the merged result, and the density is the fraction of its parameter changes that is kept after trimming:

  • MetaMath-Mistral-7B: weight 0.5, density 0.5.
  • NeuralHermes-2.5-Mistral-7B: weight 0.3, density 0.5.
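
The effect of these two settings is easier to see on a toy example. The sketch below is a simplified, pure-Python illustration of the TIES steps (trim, elect sign, merge agreeing entries) applied to four made-up parameters. It is not the code used to build this model. It assumes a shared base model from which each source model's changes are measured (the base is not named here), and it normalizes by the summed weights of the agreeing models, which is one possible choice. All function names and numbers other than the weights and densities are hypothetical.

```python
from typing import List, Sequence


def trim(delta: Sequence[float], density: float) -> List[float]:
    """Keep the `density` fraction of entries with the largest magnitude; zero the rest."""
    k = round(density * len(delta))
    by_magnitude = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(by_magnitude[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(delta)]


def sign(x: float) -> int:
    """Return -1, 0, or 1."""
    return (x > 0) - (x < 0)


def ties_merge(
    base: Sequence[float],
    finetuned: Sequence[Sequence[float]],
    weights: Sequence[float],
    densities: Sequence[float],
) -> List[float]:
    # Step 1: task vectors (fine-tuned minus base), trimmed by density, scaled by weight.
    deltas = []
    for params, w, d in zip(finetuned, weights, densities):
        task_vector = [p - b for p, b in zip(params, base)]
        deltas.append([w * x for x in trim(task_vector, d)])

    merged = []
    for j, b in enumerate(base):
        column = [delta[j] for delta in deltas]
        # Step 2: elect a sign per parameter from the summed weighted changes.
        elected = sign(sum(column))
        # Step 3: combine only the nonzero entries that agree with the elected sign.
        agreeing = [(x, w) for x, w in zip(column, weights) if x != 0 and sign(x) == elected]
        total_weight = sum(w for _, w in agreeing)
        # Assumption: normalize by the total weight of the agreeing models.
        update = sum(x for x, _ in agreeing) / total_weight if total_weight else 0.0
        merged.append(b + update)
    return merged


# Toy data (made up): a zero base and two "fine-tuned" parameter vectors.
base = [0.0, 0.0, 0.0, 0.0]
metamath_like = [0.4, -0.2, 0.1, 0.0]
hermes_like = [-0.6, -0.4, 0.05, 0.3]

merged = ties_merge(base, [metamath_like, hermes_like], weights=[0.5, 0.3], densities=[0.5, 0.5])
print(merged)  # approximately [0.4, -0.275, 0.0, 0.0]
```

In this toy run the two models disagree on the sign of the first parameter. The higher weight of the first model wins the sign election, so only its change is kept there. Both models agree on the second parameter, so their changes are averaged by weight. The last two parameters are trimmed away in both models at density 0.5 and stay at the base value.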

Ideal Use Cases

This model is particularly well-suited for applications that require a blend of:

  • Mathematical Problem Solving: Leveraging the MetaMath component for numerical and logical reasoning tasks.
  • General Instruction Following: Benefiting from the NeuralHermes component for diverse conversational and instruction-based prompts.
  • Hybrid Applications: Scenarios where both strong analytical capabilities and natural language interaction are crucial.
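
For readers who want to try the model on such tasks, here is a minimal loading sketch using the Hugging Face transformers library. It assumes that transformers, torch, and accelerate are installed and that the weights are available under the repository name shown in the title. No prompt template is specified here, so the sketch passes the prompt as plain text; a chat or instruction template may work better. The helper names prompt_token_budget and generate are hypothetical.

```python
MODEL_ID = "Weyaxi/MetaMath-NeuralHermes-2.5-Mistral-7B-Ties"
CONTEXT_LENGTH = 4096  # context window stated in this document


def prompt_token_budget(max_new_tokens: int) -> int:
    """Tokens left for the prompt once room for the reply is reserved."""
    if not 0 < max_new_tokens < CONTEXT_LENGTH:
        raise ValueError("max_new_tokens must be between 1 and CONTEXT_LENGTH - 1")
    return CONTEXT_LENGTH - max_new_tokens


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and return its continuation of `prompt` (downloads weights on first use)."""
    # Imported here so the budget helper above works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: float16 weights with automatic device placement (requires accelerate).
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )

    # Truncate the prompt so prompt plus reply fits in the context window.
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=prompt_token_budget(max_new_tokens),
    ).to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call such as generate("What is 12 * 17? Explain step by step.") would exercise both the mathematical and the instruction-following side of the merge. With the default of 256 new tokens, up to 3840 tokens remain for the prompt.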