LeroyDyer/Mixtral_AI_Cyber_5.0

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 8k · License: apache-2.0 · Architecture: Transformer · Open Weights

LeroyDyer/Mixtral_AI_Cyber_5.0 is a 7 billion parameter language model developed by LeroyDyer, built on the Mistral transformer architecture. It integrates a diverse array of expert models, including OpenOrca, Hermes 2 Pro, Starling-LM-7B-beta, and Phi-1.5, to broaden its versatility and efficiency. It features an expanded context window of 8192 tokens and advanced routing mechanisms, making it suitable for a wide range of tasks that call for integrated specialized knowledge.


Model Overview

LeroyDyer/Mixtral_AI_Cyber_5.0 is a 7 billion parameter language model based on the Mistral transformer network, specifically Mistral-7B-Instruct-v0.2. It functions as a Mixture of Experts (MoE) model, integrating several specialized sub-models to enhance its capabilities and efficiency. The merge aims to consolidate the network's internal predictive behavior by combining models that have undergone different fine-tuning processes.
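For orientation, here is a minimal loading sketch using the standard Hugging Face transformers API; the dtype and device placement are illustrative choices, not requirements of the model:

```python
# Minimal sketch: loading the merged model with Hugging Face transformers.
# Assumes the repo follows the standard Mistral/AutoModel layout.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LeroyDyer/Mixtral_AI_Cyber_5.0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # illustrative; choose a dtype your hardware supports
    device_map="auto",          # spread layers across available devices
)
```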

Key Capabilities & Integrated Experts

  • Versatility and Efficiency: Designed for a broad range of tasks through its MoE architecture.
  • Expanded Context Window: Features an 8192-token context length for handling longer inputs (see the generation sketch after this list).
  • Advanced Routing Mechanisms: Facilitate seamless integration and utilization of the specialized sub-models.
  • OpenOrca - Mistral-7B-8k: Contributes the strong performance of its OpenOrca instruction fine-tuning.
  • Hermes 2 Pro: Introduces advanced features like Function Calling and JSON Mode.
  • Starling-LM-7B-beta: Demonstrates adaptability and optimization through Reinforcement Learning from AI Feedback.
  • Phi-1.5 Transformer: Excels in domains such as common sense reasoning and medical inference.
  • BioMistral: Tailored for specific medical applications.
  • Nous-Yarn-Mistral-7b-128k: Specialized in handling long-context data.
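As a concrete illustration of the instruct-style usage these capabilities imply, here is a minimal generation sketch continuing from the loading example above. It assumes the tokenizer ships a Mistral-style chat template inherited from the instruct base; the prompt and sampling choices are illustrative only, and whether Hermes-style function calling survives the merge is not covered here:

```python
# Minimal sketch: single-turn chat generation, continuing from the loading
# example above. Assumes a Mistral-style chat template inherited from the
# instruct base; prompt and sampling choices are illustrative.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Explain zero-trust networking in two sentences."}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(
    inputs,
    max_new_tokens=256,  # well inside the 8192-token context window
    do_sample=True,
)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```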

Considerations

Due to the merging of various fine-tuned models (e.g., Commercial Orca, Dolphin, Nous, Starling), there may be some data contamination or biases present. Future tuning efforts will focus on specific tasks, leveraging this merged model as a base.
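Because the merged model is intended as a base for later task-specific tuning, a parameter-efficient adapter is one plausible route. Below is a minimal sketch with the peft library, continuing from the loading example above; the rank, alpha, and target modules are illustrative assumptions based on the standard Mistral attention projection names:

```python
# Minimal sketch: attaching a LoRA adapter for task-specific tuning on top
# of the merged base. All hyperparameters here are illustrative only.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,                                  # adapter rank (assumed)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],   # standard Mistral attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()   # only the adapter weights train
```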

Popular Sampler Settings

Featherless tracks the top three sampler configurations its users apply to this model. The tracked parameters are: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
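As a rough illustration of where these knobs plug in, here is a minimal sketch assuming Featherless exposes an OpenAI-compatible chat endpoint. The base URL, and passing the non-standard samplers (top_k, min_p, repetition_penalty) via extra_body, are assumptions; the values below are placeholders, not the tracked user configs:

```python
# Minimal sketch: passing sampler settings through an OpenAI-compatible
# client. Base URL and extra_body pass-through are assumptions; values
# are placeholders, not the tracked user configs.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="LeroyDyer/Mixtral_AI_Cyber_5.0",
    messages=[{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    extra_body={              # non-standard samplers, if the server accepts them
        "top_k": 40,
        "min_p": 0.05,
        "repetition_penalty": 1.1,
    },
)
print(response.choices[0].message.content)
```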