# Metis-Chat-7B: A Merged Language Model
Metis-Chat-7B is a 7-billion-parameter language model by Manolo26, created by merging two base models: mlabonne/NeuralBeagle14-7B and mlabonne/NeuralHermes-2.5-Mistral-7B. The merge was performed with the LazyMergekit tool using the slerp (spherical linear interpolation) method to combine the weights of the two constituents.
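A LazyMergekit slerp merge of this kind is driven by a YAML config. The sketch below shows the typical shape of such a config for these two models; the `layer_range`, `base_model` choice, and `t` values are illustrative placeholders, not the actual values used for Metis-Chat-7B:

```yaml
# Illustrative mergekit slerp config (values are placeholders)
slices:
  - sources:
      - model: mlabonne/NeuralBeagle14-7B
        layer_range: [0, 32]
      - model: mlabonne/NeuralHermes-2.5-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: mlabonne/NeuralBeagle14-7B
parameters:
  t:
    - filter: self_attn   # per-layer interpolation schedule for attention
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp         # separate schedule for the MLP blocks
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5          # default for all remaining tensors
dtype: bfloat16
```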
## Key Capabilities & Configuration
- Merged Architecture: Combines the strengths of NeuralBeagle14-7B and NeuralHermes-2.5-Mistral-7B to improve performance in chat-based applications.
- Parameter Count: 7 billion parameters, balancing capability with computational efficiency.
- Context Length: Supports a 4096-token context window, suitable for moderately long conversations.
- Merge Method: Uses `slerp`, with separate `t` schedules applied to different layers (`self_attn`, `mlp`) to tune the contribution of each base model.
- Data Type: Configured to use `bfloat16` for efficient inference.
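To make the merge method concrete, here is a minimal NumPy sketch of spherical linear interpolation between two weight vectors. This is the geometric operation behind the `slerp` merge (mergekit's implementation adds details such as per-tensor handling); `t=0` returns the first model's weights, `t=1` the second's:

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    Interpolates along the great-circle arc between the directions of
    v0 and v1; t in [0, 1] controls each vector's contribution.
    """
    v0n = v0 / np.linalg.norm(v0)
    v1n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation
        return (1 - t) * v0 + t * v1
    s = np.sin(theta)
    return (np.sin((1 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1

# Toy example with 2-D "weights": halfway between orthogonal unit vectors
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid = slerp(0.5, a, b)
```

Unlike plain linear averaging, slerp preserves the norm of the interpolated weights when the endpoints have equal norms, which is one reason it is a popular merge method.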
## Ideal Use Cases
- General Chatbots: Well-suited for developing conversational AI agents that require robust language understanding and generation.
- Text Generation: Can be used for various text generation tasks, benefiting from the combined knowledge and stylistic capabilities of its merged components.
- Experimentation: Provides a solid base for researchers and developers looking to experiment with merged models and their performance characteristics.
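For chatbot use, prompts need to follow the model's chat template. The sketch below formats a conversation as ChatML, the template used by NeuralHermes-2.5; whether the merged model inherits it is an assumption, so check the model's tokenizer config (the repo id in the comment is also assumed):

```python
def to_chatml(messages):
    """Format a list of {role, content} dicts into a ChatML prompt.

    ChatML wraps each turn in <|im_start|>role ... <|im_end|> markers and
    ends with an open assistant turn for the model to complete.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])

# The prompt can then be passed to a text-generation pipeline, e.g.:
#   from transformers import pipeline
#   pipe = pipeline("text-generation", model="Manolo26/Metis-Chat-7B")  # repo id assumed
#   pipe(prompt, max_new_tokens=256)
```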