LeroyDyer/Mixtral_AI_Cyber_Matrix_2_0

Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 8k · Published: Apr 4, 2024 · License: MIT · Architecture: Transformer · Open weights

LeroyDyer/Mixtral_AI_Cyber_Matrix_2_0 is a 7 billion parameter model merge based on Mistral-7B-Instruct-v0.2, integrating diverse capabilities from multiple specialized models. It features an expanded context window of 8192 tokens and incorporates hidden tensors for potential vision and sound processing, awaiting further fine-tuning. This model is designed for versatile applications including roleplay, coding, general chat, medical inference, and long-context data handling.


Model Overview

LeroyDyer/Mixtral_AI_Cyber_Matrix_2_0 is a 7 billion parameter model built upon the Mistral-7B-Instruct-v0.2 base, designed to offer a wide array of functionalities through a sophisticated merge of various expert models. This model incorporates 'hidden tensors' for advanced capabilities like vision and sound processing, which are present but require specific fine-tuning to be fully activated. It takes a Mixture of Experts (MoE)-style approach, integrating specialized sub-models to enhance performance and scalability across diverse tasks.
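Merges of this kind are typically produced with a tool such as mergekit. The actual recipe for this model is not published on this page, so the following is purely a hypothetical sketch of what a TIES-style merge config can look like; the donor model and all parameter values are assumptions.

```yaml
# Hypothetical mergekit recipe -- illustrative only, NOT the author's actual config.
models:
  - model: mistralai/Mistral-7B-Instruct-v0.2
  - model: NousResearch/Hermes-2-Pro-Mistral-7B  # example donor model (assumed)
    parameters:
      density: 0.5   # fraction of each donor's weight deltas retained
      weight: 0.5    # contribution of this donor to the merged weights
merge_method: ties
base_model: mistralai/Mistral-7B-Instruct-v0.2
dtype: bfloat16
```

In practice a merge like the one described here would list several donor models, one per capability being folded in.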

Key Capabilities

This model merge brings together strengths from numerous specialized models, resulting in a versatile AI capable of:

  • Roleplay and Chat: Enhanced conversational abilities from models like Eris-LelantaclesV2-7b and Hyperion-2.1-Mistral-7B.
  • Coding: Improved code generation and understanding via integration with BASH-Coder-Mistral-7B.
  • Vision: Contains latent vision capabilities from Eris_PrimeV3-Vision-7B, awaiting activation through specific configuration.
  • Medical Inference: Specialized knowledge from ProdigyXBioMistral_7B for medical applications.
  • Long Context Handling: Inherits long-context training from 128k-token models such as Infinite-Mika-7b and Nous-Yarn-Mistral-7b-128k, though the published configuration for this model lists an 8k context window.
  • Generalization and Reasoning: Benefits from Hermes-2-Pro-Mistral-7B for general tasks and Nexus-IKM-Mistral-7B-Pytorch for enhanced 'thinking' capabilities.
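Since the merge keeps the Mistral-7B-Instruct-v0.2 base, prompts for the chat and coding uses above are expected to follow the Mistral instruct format. A minimal sketch of that formatting follows; exact spacing can differ slightly from the official chat template, so treat this as illustrative rather than authoritative.

```python
# Mistral-Instruct-style prompt formatting: user turns are wrapped in
# [INST] ... [/INST] tags, completed assistant turns end with </s>.
def build_prompt(turns):
    """turns: list of (user, assistant) pairs; assistant may be None for the final turn."""
    prompt = "<s>"
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant}</s>"
    return prompt

print(build_prompt([("Hello", "Hi there!"), ("Write a bash one-liner", None)]))
# -> <s>[INST] Hello [/INST] Hi there!</s>[INST] Write a bash one-liner [/INST]
```

For production use, `tokenizer.apply_chat_template` from the model's own tokenizer is the safer path, since it encodes whatever template the repo actually ships.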

Unique Aspects

What sets this model apart is its architecture, which includes pre-merged tensors for functionalities like vision and sound. While these require further fine-tuning to be fully exposed, their presence indicates a foundation for multimodal capabilities. The model's design allows for easy activation of these hidden features by setting it to training mode and applying a PEFT adapter. It also features an expanded context window, making it suitable for applications requiring extensive information processing.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model.

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
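These knobs map onto the request body of an OpenAI-compatible chat-completions endpoint, which is how Featherless-hosted models are typically queried. The sketch below just assembles such a payload; the numeric values are illustrative (not the actual "top 3" configs), and top_k, min_p, and repetition_penalty are server-side extensions rather than core OpenAI schema fields.

```python
# Sketch: build a chat-completions request payload carrying the sampler
# settings listed above. Values here are illustrative assumptions.
def sampler_payload(prompt, **sampling):
    allowed = {"temperature", "top_p", "top_k", "frequency_penalty",
               "presence_penalty", "repetition_penalty", "min_p"}
    unknown = set(sampling) - allowed
    if unknown:
        raise ValueError(f"unsupported sampler keys: {sorted(unknown)}")
    return {
        "model": "LeroyDyer/Mixtral_AI_Cyber_Matrix_2_0",
        "messages": [{"role": "user", "content": prompt}],
        **sampling,
    }

payload = sampler_payload("Hello", temperature=0.8, top_p=0.95,
                          repetition_penalty=1.1)
```

Such a payload would be POSTed as JSON to the provider's `/v1/chat/completions` route, or passed via `extra_body` when using the openai client, since the client's typed parameters do not include the extension fields.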