Kukedlc/NeuralKrishna-7B-v3
NeuralKrishna-7B-v3 is a 7 billion parameter language model developed by Kukedlc, created by merging NeuralGlitch-Yam-Peleg-7B-DT, Fasciculus-Arcuatus-7B-slerp, and Neural4gsm8k with the DARE TIES merge method. The model is built on the mlabonne/Monarch-7B base and is intended for general text generation, with the merge aiming to combine the strengths of its source models. It supports a context length of 4096 tokens, suiting applications with moderate input and output lengths.
NeuralKrishna-7B-v3 Overview
NeuralKrishna-7B-v3 is a 7 billion parameter language model developed by Kukedlc, constructed by merging three distinct models: NeuralGlitch-Yam-Peleg-7B-DT, Fasciculus-Arcuatus-7B-slerp, and Neural4gsm8k. This merge was performed using the DARE TIES method, with mlabonne/Monarch-7B serving as the base model.
Key Characteristics
- Merged Architecture: Combines specific strengths of its constituent models through a weighted merge process, aiming for a balanced performance profile.
- Parameter Efficient: At 7 billion parameters, it offers a balance between performance and computational resource requirements.
- Configuration Details: The merge configuration specifies `density` and `weight` parameters for each contributing model, along with an `int8_mask` setting and the `bfloat16` data type for optimized inference.
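The DARE TIES method referenced above can be illustrated on toy tensors: DARE randomly drops a fraction of each model's weight deltas (controlled by `density`) and rescales the survivors, while TIES elects a per-parameter sign and keeps only delta components that agree with it before summing. The sketch below is a simplified NumPy illustration, not mergekit's exact implementation, and the `density` and `weight` values are placeholders since the model card does not state the actual configuration.

```python
import numpy as np

def dare_ties_merge(base, finetuned, densities, weights, rng):
    """Merge fine-tuned weight vectors onto a base via a simplified DARE TIES.

    base      : 1-D array of base-model parameters.
    finetuned : list of 1-D arrays, one per source model.
    densities : per-model keep probabilities for DARE dropout.
    weights   : per-model merge weights.
    """
    deltas = []
    for ft, density, weight in zip(finetuned, densities, weights):
        delta = ft - base
        # DARE: drop each delta entry with probability (1 - density),
        # then rescale survivors so the expected delta is unchanged.
        mask = rng.random(delta.shape) < density
        delta = np.where(mask, delta / density, 0.0)
        deltas.append(weight * delta)
    stacked = np.stack(deltas)
    # TIES: elect a per-parameter sign from the summed deltas, then
    # keep only the delta components whose sign agrees with it.
    elected = np.sign(stacked.sum(axis=0))
    agree = np.sign(stacked) == elected
    merged_delta = np.where(agree, stacked, 0.0).sum(axis=0)
    return base + merged_delta

rng = np.random.default_rng(0)
base = np.zeros(8)
models = [rng.normal(size=8) for _ in range(3)]  # stand-ins for the three source models
merged = dare_ties_merge(base, models,
                         densities=[0.5, 0.5, 0.5],   # hypothetical values
                         weights=[0.4, 0.3, 0.3],     # hypothetical values
                         rng=rng)
```

With `density=1.0` no deltas are dropped, and merging a single model with weight 1.0 reproduces that model exactly, which is a useful sanity check on the rescaling step.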
Good For
- General Text Generation: Suitable for a wide array of natural language processing tasks, including question answering, content creation, and conversational AI.
- Exploration of Merged Models: Provides a practical example of a model created via advanced merging techniques like DARE TIES, useful for researchers and developers interested in model fusion.