Azazelle/Mocha-SR-7b-ex

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Mar 23, 2024 · License: cc-by-4.0 · Architecture: Transformer · Open Weights

Mocha-SR-7b-ex is a 7 billion parameter language model developed by Azazelle, built upon the Mistral-7B-v0.1 architecture. This model is a merge of Open-Orca/Mistral-7B-OpenOrca, WizardLM/WizardMath-7B-V1.1, and akjindal53244/Mistral-7B-v0.1-Open-Platypus, utilizing the rescaled_sample merge method. It is designed to combine the strengths of its constituent models, offering enhanced capabilities for general language tasks, instruction following, and mathematical reasoning within a 4096-token context window.


Mocha-SR-7b-ex: A Merged 7B Language Model

Mocha-SR-7b-ex is a 7 billion parameter language model created by Azazelle, leveraging the robust Mistral-7B-v0.1 as its base architecture. This model was developed using the rescaled_sample merge method via mergekit, combining several specialized models to enhance its overall performance and versatility.
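The exact merge recipe was not published with this card, but a mergekit run of roughly the following shape could reproduce the setup. This is a sketch, not the author's actual configuration: only the base model, the three constituent models, and the method name `rescaled_sample` come from this card; the `dtype` and the bare `models` entries (no per-model weights) are assumptions, since the parameters `rescaled_sample` expects are not documented here.

```python
# Hypothetical sketch of a mergekit run for a Mocha-SR-7b-ex-style merge.
# Requires mergekit to be installed; the per-model parameters used by
# Azazelle are unknown, so none are specified below.
import subprocess
from pathlib import Path

config = """\
base_model: mistralai/Mistral-7B-v0.1
merge_method: rescaled_sample   # method named on this card
dtype: bfloat16                 # assumption; not stated on the card
models:
  - model: Open-Orca/Mistral-7B-OpenOrca
  - model: WizardLM/WizardMath-7B-V1.1
  - model: akjindal53244/Mistral-7B-v0.1-Open-Platypus
"""

Path("mocha-sr.yml").write_text(config)
# mergekit-yaml <config> <output_dir> is mergekit's standard CLI entry point.
subprocess.run(["mergekit-yaml", "mocha-sr.yml", "./mocha-sr-7b-ex"], check=True)
```

A run like this writes the merged weights to ./mocha-sr-7b-ex, which can then be loaded like any Mistral-architecture checkpoint.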

Key Capabilities

  • Instruction Following: Integrates capabilities from Open-Orca/Mistral-7B-OpenOrca, known for its strong instruction-following abilities.
  • Mathematical Reasoning: Incorporates WizardLM/WizardMath-7B-V1.1, providing improved performance on mathematical problems and logical reasoning tasks.
  • General Language Understanding: Benefits from akjindal53244/Mistral-7B-v0.1-Open-Platypus, contributing to broad language understanding and generation.
  • Efficient Architecture: Built on the Mistral-7B-v0.1 base, offering a balance of performance and computational efficiency with a 4096-token context length.

Good For

This model is well-suited for applications that combine general conversational AI, precise instruction adherence, and mathematical problem-solving. The merge aims to yield a model that is more balanced and capable across diverse NLP tasks than any of its constituent models alone.
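For quick experimentation, the model loads like any Mistral-7B-derived checkpoint via Hugging Face transformers. A minimal sketch follows, assuming a single-GPU setup; the plain-text prompt is an assumption, since the card does not document a chat template.

```python
# Minimal loading and generation sketch for Azazelle/Mocha-SR-7b-ex.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Azazelle/Mocha-SR-7b-ex"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision fits a 7B model on a 24 GB GPU
    device_map="auto",
)

# A math-flavored prompt to exercise the WizardMath component of the merge.
prompt = "What is the sum of the first 100 positive integers?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Note that the FP8 quant listed in the header appears to describe this host's serving configuration; loading from the original repository as above yields the full-precision weights.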