Metis-Chat-7B: A Merged Language Model
Metis-Chat-7B is a 7 billion parameter language model developed by Manolo26, created through a strategic merge of two prominent base models: mlabonne/NeuralBeagle14-7B and mlabonne/NeuralHermes-2.5-Mistral-7B. This model leverages the LazyMergekit tool, specifically employing the slerp (spherical linear interpolation) merge method to combine the weights of its constituents.
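A LazyMergekit-generated slerp merge is typically driven by a YAML configuration of the following shape. The layer ranges and `t` schedules below are illustrative placeholders, not the actual values used for this model:

```yaml
slices:
  - sources:
      - model: mlabonne/NeuralBeagle14-7B
        layer_range: [0, 32]
      - model: mlabonne/NeuralHermes-2.5-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/NeuralBeagle14-7B
parameters:
  t:
    - filter: self_attn      # attention layers lean toward one parent
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp            # MLP layers lean toward the other
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5             # default blend for all remaining tensors
dtype: bfloat16
```

The per-filter `t` schedules are what let a merge weight the attention and feed-forward sublayers differently between the two parents.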
Key Capabilities & Configuration
- Merged Architecture: Combines the strengths of `NeuralBeagle14-7B` and `NeuralHermes-2.5-Mistral-7B` to enhance overall performance in chat-based applications.
- Parameter Count: Operates with 7 billion parameters, balancing performance with computational efficiency.
- Context Length: Supports a context window of 4096 tokens, suitable for engaging in moderately long conversations.
- Merge Method: Utilizes `slerp` for merging, with specific `t` parameters applied to different layer groups (`self_attn`, `mlp`) to fine-tune the contribution of each base model.
- Data Type: Configured to use `bfloat16` for efficient inference.
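The `t` parameter above controls where each merged tensor lands on the arc between the two parents' weights. A minimal pure-Python sketch of spherical linear interpolation (real merges operate on full weight tensors, not toy vectors):

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between vectors v0 and v1.

    t=0 returns v0, t=1 returns v1; intermediate t blends both,
    moving along the arc between them rather than a straight line.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    cos_omega = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard acos domain
    omega = math.acos(cos_omega)
    if abs(math.sin(omega)) < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / math.sin(omega)
    s1 = math.sin(t * omega) / math.sin(omega)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# t = 0 recovers the first parent, t = 1 the second; the merge config
# assigns different t values to self_attn and mlp layer groups.
print(slerp(0.0, [1.0, 0.0], [0.0, 1.0]))  # → [1.0, 0.0]
print(slerp(1.0, [1.0, 0.0], [0.0, 1.0]))  # → [0.0, 1.0]
```

Unlike plain averaging, slerp preserves the magnitude of the interpolated weights along the arc, which is one reason it is a popular choice for model merging.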
Ideal Use Cases
- General Chatbots: Well-suited for developing conversational AI agents that require robust language understanding and generation.
- Text Generation: Can be used for various text generation tasks, benefiting from the combined knowledge and stylistic capabilities of its merged components.
- Experimentation: Provides a solid base for researchers and developers looking to experiment with merged models and their performance characteristics.