Weyaxi/MetaMath-NeuralHermes-2.5-Mistral-7B-Linear

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Context Length: 4k | Published: Dec 5, 2023 | License: apache-2.0 | Architecture: Transformer | Open Weights

Weyaxi/MetaMath-NeuralHermes-2.5-Mistral-7B-Linear is a 7 billion parameter language model created by Weyaxi, built by linearly merging MetaMath-Mistral-7B and NeuralHermes-2.5-Mistral-7B. The merge targets enhanced mathematical reasoning alongside general instruction following, drawing on the strengths of both base models. It supports a 4096-token context window and is aimed at tasks that require both logical deduction and broad conversational ability.


Overview

Weyaxi/MetaMath-NeuralHermes-2.5-Mistral-7B-Linear is a 7 billion parameter language model developed by Weyaxi. It is constructed using a linear merge method, combining two distinct base models: meta-math/MetaMath-Mistral-7B with a weight of 0.5 and mlabonne/NeuralHermes-2.5-Mistral-7B with a weight of 0.3. This merging strategy aims to integrate the specialized mathematical reasoning capabilities of MetaMath with the strong general instruction-following and conversational abilities of NeuralHermes-2.5.
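Conceptually, a linear merge is a weighted average of the two checkpoints' parameters. The sketch below illustrates the idea in Python with PyTorch and transformers; it is not the exact pipeline used to produce this model (which was presumably built with a dedicated merge toolkit), and normalizing the 0.5/0.3 weights so they sum to 1 is an assumption made here for illustration.

```python
# Illustrative sketch of a linear merge of two Mistral-7B fine-tunes.
# Requires enough RAM to hold both checkpoints; weights mirror the values
# reported above, and their normalization is an assumption.
import torch
from transformers import AutoModelForCausalLM

BASE_A = "meta-math/MetaMath-Mistral-7B"         # weight 0.5
BASE_B = "mlabonne/NeuralHermes-2.5-Mistral-7B"  # weight 0.3

model_a = AutoModelForCausalLM.from_pretrained(BASE_A, torch_dtype=torch.bfloat16)
model_b = AutoModelForCausalLM.from_pretrained(BASE_B, torch_dtype=torch.bfloat16)

w_a, w_b = 0.5, 0.3
total = w_a + w_b  # normalize so the merged coefficients sum to 1 (assumption)

state_b = model_b.state_dict()
merged_state = {}
for name, tensor_a in model_a.state_dict().items():
    # Both models share the Mistral-7B architecture, so parameter names align.
    merged_state[name] = (w_a * tensor_a + w_b * state_b[name]) / total

model_a.load_state_dict(merged_state)
model_a.save_pretrained("MetaMath-NeuralHermes-2.5-Mistral-7B-Linear-local")
```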

Key Capabilities

  • Enhanced Mathematical Reasoning: Benefits from the MetaMath component, which is specifically trained for mathematical problem-solving.
  • Robust Instruction Following: Inherits strong instruction-following capabilities from the NeuralHermes-2.5 base model.
  • General Purpose Language Understanding: Capable of handling a wide range of natural language processing tasks.
  • 4096 Token Context Window: Supports moderately long prompts and multi-turn exchanges.

Good For

  • Mathematical and Logical Tasks: Ideal for applications requiring accurate numerical and logical reasoning.
  • Instruction-Based Applications: Well-suited for chatbots, virtual assistants, and tools that need to follow complex instructions (see the inference sketch after this list).
  • Hybrid Use Cases: When a balance between specialized reasoning and broad conversational ability is required.
  • Experimentation with Merged Models: Provides a practical example of how linear merging can combine model strengths.
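
As a starting point for instruction-based use, here is a minimal inference sketch with the Hugging Face transformers library. The ChatML-style prompt format is an assumption carried over from the NeuralHermes-2.5 base model; consult the model card for the recommended template and generation settings.

```python
# Minimal inference sketch. The ChatML-style prompt below is an assumption
# inherited from NeuralHermes-2.5 -- verify against the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Weyaxi/MetaMath-NeuralHermes-2.5-Mistral-7B-Linear"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = (
    "<|im_start|>user\n"
    "A train travels 120 km in 1.5 hours. What is its average speed in km/h?\n"
    "<|im_end|>\n<|im_start|>assistant\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```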