Azazelle/Mocha-Sample-7b-ex

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Mar 23, 2024 · License: cc-by-4.0 · Architecture: Transformer

Azazelle/Mocha-Sample-7b-ex is a 7 billion parameter language model created by Azazelle, built upon the Mistral-7B-v0.1 base architecture. This model is a merge of WizardMath-7B-V1.1, Mistral-7B-v0.1-Open-Platypus, and Mistral-7B-OpenOrca, utilizing the sample_ties merge method. It is designed to combine the strengths of its constituent models, particularly in areas like mathematical reasoning and general instruction following, with a context length of 4096 tokens.


Model Overview

Azazelle/Mocha-Sample-7b-ex is a 7 billion parameter language model developed by Azazelle. It is constructed using the sample_ties merge method, combining several pre-trained models based on the Mistral-7B-v0.1 architecture. This merging approach aims to integrate diverse capabilities from its source models into a single, cohesive unit.
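To make the merging approach concrete, the following is a minimal sketch of the standard TIES-merging idea (trim each model's task vector, elect a per-parameter sign, then average only the agreeing deltas) on plain Python lists. Note that this illustrates generic TIES; the `sample_ties` variant used for this model may differ in details not documented here.

```python
def ties_merge(base, deltas, weights, density):
    """Toy TIES merge on flat parameter lists.

    base    : base-model parameters
    deltas  : per-model task vectors (model parameters minus base)
    weights : per-model merge weights
    density : fraction of largest-magnitude delta entries to keep
    """
    n = len(base)
    trimmed = []
    for d in deltas:
        # Trim: zero out all but the top-`density` fraction by magnitude.
        k = max(1, int(round(density * n)))
        threshold = sorted((abs(x) for x in d), reverse=True)[k - 1]
        trimmed.append([x if abs(x) >= threshold else 0.0 for x in d])

    merged = list(base)
    for i in range(n):
        # Elect a sign per parameter from the weighted mass of deltas.
        total = sum(w * t[i] for w, t in zip(weights, trimmed))
        sign = 1.0 if total >= 0 else -1.0
        # Average only the deltas whose sign agrees with the elected one.
        agreeing = [(w, t[i]) for w, t in zip(weights, trimmed)
                    if t[i] * sign > 0]
        if agreeing:
            num = sum(w * v for w, v in agreeing)
            den = sum(w for w, _ in agreeing)
            merged[i] = base[i] + num / den
    return merged
```

The sign election is what distinguishes TIES from naive weighted averaging: a delta pulling a parameter in the minority direction is dropped rather than cancelling out the majority contribution.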

Key Capabilities

This model is a merge of three distinct models, suggesting a blend of their respective strengths:

  • WizardLM/WizardMath-7B-V1.1: Likely contributes to enhanced mathematical reasoning and problem-solving abilities.
  • akjindal53244/Mistral-7B-v0.1-Open-Platypus: Suggests strong performance in instruction following and general conversational tasks.
  • Open-Orca/Mistral-7B-OpenOrca: Indicates robust capabilities in complex instruction understanding and response generation.

Merge Details

The model was created using mergekit with a YAML configuration that defines the weight and density parameters for each merged model. The base model for this merge was mistralai/Mistral-7B-v0.1, and the final model is provided in float16, with the int8_mask option enabled to reduce memory use during merging.
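A mergekit configuration of this shape might look as follows. This is an illustrative sketch only: the actual weight and density values used for Mocha-Sample-7b-ex are not reproduced here, and the hypothetical numbers below are placeholders.

```yaml
# Illustrative sketch; weight and density values are hypothetical.
merge_method: sample_ties
base_model: mistralai/Mistral-7B-v0.1
models:
  - model: WizardLM/WizardMath-7B-V1.1
    parameters:
      weight: 0.33   # hypothetical
      density: 0.5   # hypothetical
  - model: akjindal53244/Mistral-7B-v0.1-Open-Platypus
    parameters:
      weight: 0.33   # hypothetical
      density: 0.5   # hypothetical
  - model: Open-Orca/Mistral-7B-OpenOrca
    parameters:
      weight: 0.33   # hypothetical
      density: 0.5   # hypothetical
dtype: float16
parameters:
  int8_mask: true
```

Here `density` controls what fraction of each task vector survives trimming, and `weight` scales each model's contribution during sign election and averaging.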

Good For

This merged model is suitable for applications requiring a combination of:

  • Mathematical problem-solving and logical reasoning.
  • General-purpose instruction following and conversational AI.
  • Handling complex prompts and generating coherent, relevant responses.