Eric111/Mayoroya
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Feb 8, 2024 · License: apache-2.0 · Architecture: Transformer

Mayoroya is a 7-billion-parameter language model created by Eric111 by merging Eric111/Mayo and Eric111/Roya with the slerp method. The merge combines characteristics of both constituent models, offering a balanced performance profile for general language tasks, with efficient inference and a 4096-token context length.


Mayoroya Model Overview

Mayoroya is a 7 billion parameter language model developed by Eric111, created through a strategic merge of two distinct models: Eric111/Mayo and Eric111/Roya. This merge was performed using the mergekit tool and specifically employed the slerp (spherical linear interpolation) merge method, which is known for producing balanced and coherent merged models.
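To make the merge method concrete, the sketch below implements spherical linear interpolation on plain Python lists. It illustrates the math slerp applies when blending two weight tensors; it is not mergekit's actual implementation, which operates on model tensors and handles additional edge cases.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between vectors v0 and v1.

    Illustrative sketch of the interpolation behind mergekit's slerp
    merge method; not the library's real code.
    """
    # Measure the angle between the two directions via normalized copies.
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    u0 = [x / n0 for x in v0]
    u1 = [x / n1 for x in v1]
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u0, u1))))
    theta = math.acos(dot)
    if theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Weights follow the great-circle arc instead of the straight chord,
    # which keeps the blend "balanced" between the two endpoints.
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]
```

At `t = 0` the result is the first vector, at `t = 1` the second, and intermediate `t` values trace the arc between them, which is why slerp tends to preserve the character of both parents better than naive averaging.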

Key Capabilities

  • Hybrid Performance: By combining two base models, Mayoroya aims to leverage the strengths of both Eric111/Mayo and Eric111/Roya, potentially offering improved generalization across various tasks.
  • Efficient Architecture: As a 7B parameter model, it strikes a balance between performance and computational efficiency, making it suitable for applications where resource constraints are a consideration.
  • Configurable Merge: The merge configuration specifies layer ranges and separate interpolation weights for the self_attn and mlp components, letting the contribution of each parent model vary across the network rather than applying a single blend ratio everywhere.
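The exact merge configuration is not reproduced on this page. For orientation, a typical mergekit slerp config for two 7B parents looks like the fragment below; the layer range, base-model choice, and `t` schedules are illustrative assumptions, not Mayoroya's actual settings.

```yaml
# Representative mergekit slerp config (values are illustrative).
slices:
  - sources:
      - model: Eric111/Mayo
        layer_range: [0, 32]   # assumed depth for a 7B model
      - model: Eric111/Roya
        layer_range: [0, 32]
merge_method: slerp
base_model: Eric111/Mayo       # assumed; either parent could serve
parameters:
  t:
    - filter: self_attn        # per-component interpolation schedule
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5               # default for all remaining tensors
dtype: bfloat16
```

The `filter` entries are what the "parameter weighting for self_attn and mlp components" above refers to: each list is interpolated across the layer range, so different depths of the network can lean more toward one parent or the other.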

Good For

  • General Language Tasks: Suitable for a broad range of natural language processing applications due to its merged nature.
  • Exploration of Merged Models: Ideal for developers interested in experimenting with models created via advanced merging techniques to achieve specific performance profiles.
  • Resource-Conscious Deployment: Its 7B parameter size makes it a viable option for deployment in environments where larger models might be too demanding.
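As a rough illustration of the resource point above, the storage needed for the model weights can be estimated from bytes per parameter. This is back-of-the-envelope arithmetic only; real deployments also need memory for the KV cache, activations, and runtime overhead.

```python
# Rough weight-storage estimate for a 7B-parameter model at several
# precisions. Illustrative arithmetic only; excludes KV cache,
# activations, and framework overhead.
PARAMS = 7_000_000_000

def weight_gib(bytes_per_param: float) -> float:
    """Raw weight storage in GiB for a given precision."""
    return PARAMS * bytes_per_param / 2**30

for name, bpp in [("FP32", 4), ("FP16/BF16", 2), ("FP8 (as listed)", 1)]:
    print(f"{name:>16}: ~{weight_gib(bpp):.1f} GiB")
```

At the FP8 quantization listed in the metadata, the weights alone come to roughly 6.5 GiB, which is why a 7B model fits comfortably on a single consumer GPU where larger models would not.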