DeepKarkhanis/Mistral-Passthrough-8L-10B
DeepKarkhanis/Mistral-Passthrough-8L-10B is a language model of roughly 10 billion parameters created by DeepKarkhanis, built by merging two instances of mistralai/Mistral-7B-Instruct-v0.2 using a passthrough merge method. The layer-slicing configuration stacks overlapping slices of the base model to produce a deeper network, with the aim of modifying the original Mistral-7B-Instruct-v0.2's behavior. It is designed for general text generation tasks and inherits the instruction-following strengths of its base model.
Model Overview
DeepKarkhanis/Mistral-Passthrough-8L-10B is a language model of roughly 10 billion parameters developed by DeepKarkhanis. It is constructed through a "passthrough" merge of two instances of the mistralai/Mistral-7B-Instruct-v0.2 model. This merging technique, carried out with LazyMergekit, combines specific layer ranges from the source models into a single, deeper model.
Key Configuration
The model's architecture is defined by its passthrough merge method, which stacks layer range [0, 24] from one Mistral-7B-Instruct-v0.2 instance followed by layer range [8, 32] from another. Because both slices are drawn from the same 32-layer base model, the result is a 48-layer network, which accounts for the larger (roughly 10B) parameter count, while the overlapping layer ranges aim to retain the foundational strengths of the Mistral architecture. A sketch of the corresponding merge configuration is shown below.
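The exact LazyMergekit configuration is not reproduced in this card, but based on the layer ranges above, the passthrough merge would be described by a mergekit config along the following lines. The output path and final CLI invocation are illustrative assumptions; only the layer ranges, merge method, base model, and dtype come from this card.

```python
# Sketch of the passthrough merge configuration implied by the layer ranges above.
# Assumptions: the file name and output directory are illustrative;
# layer ranges, merge method, base model, and dtype are taken from this card.
yaml_config = """
slices:
  - sources:
      - model: mistralai/Mistral-7B-Instruct-v0.2
        layer_range: [0, 24]
  - sources:
      - model: mistralai/Mistral-7B-Instruct-v0.2
        layer_range: [8, 32]
merge_method: passthrough
dtype: float16
"""

with open("merge_config.yaml", "w", encoding="utf-8") as f:
    f.write(yaml_config)

# The merge itself would then be run with mergekit, for example:
#   mergekit-yaml merge_config.yaml ./Mistral-Passthrough-8L-10B
```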
Intended Use
This model is suitable for general-purpose text generation and instruction-following tasks, building on the capabilities of its base model, Mistral-7B-Instruct-v0.2. Developers can use it for conversational AI, content creation, or other applications where a robust instruction-tuned language model is beneficial. Its float16 dtype keeps memory requirements and inference cost moderate; a usage sketch follows below.
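The snippet below is a minimal usage sketch with the standard transformers text-generation pipeline; the prompt, sampling parameters, and device settings are illustrative assumptions rather than recommendations from the model author.

```python
# Minimal usage sketch for the merged model with transformers.
# Assumptions: prompt text, sampling parameters, and device settings are illustrative.
import torch
from transformers import AutoTokenizer, pipeline

model_id = "DeepKarkhanis/Mistral-Passthrough-8L-10B"
tokenizer = AutoTokenizer.from_pretrained(model_id)

generator = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,  # matches the model's float16 dtype
    device_map="auto",
)

# Mistral-Instruct models expect a chat-formatted prompt; the tokenizer's
# chat template handles the [INST] ... [/INST] wrapping.
messages = [{"role": "user", "content": "Summarize what a passthrough layer merge does."}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

output = generator(prompt, max_new_tokens=128, do_sample=True, temperature=0.7)
print(output[0]["generated_text"])
```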