automerger/ShadowYamshadow-7B
TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Mar 19, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold
ShadowYamshadow-7B is a 7 billion parameter language model created by Maxime Labonne, produced by an automated merge of CorticalStack/shadow-clown-7B-slerp and automerger/YamShadow-7B. The model uses the slerp merge method to combine the strengths of its constituent models, offering balanced performance across general language tasks. It is designed for applications that need a compact yet capable model with a 4096-token context length.
ShadowYamshadow-7B: An Automated Merge Model
ShadowYamshadow-7B is a 7 billion parameter language model developed by Maxime Labonne. It is an automated merge created using a specific configuration that combines two base models: CorticalStack/shadow-clown-7B-slerp and automerger/YamShadow-7B.
Key Characteristics
- Merge Method: Utilizes the `slerp` (spherical linear interpolation) merge method, which is effective for combining model weights while preserving desirable features from both sources.
- Configuration: The merge process applied specific parameter adjustments to the self-attention (`self_attn`) and multi-layer perceptron (`mlp`) layers, indicating a fine-tuned approach to weight blending.
- Base Models: Built upon two existing 7B parameter models, leveraging established architectures and capabilities.
- Context Length: Supports a context window of 4096 tokens, suitable for a variety of conversational and text generation tasks.
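To illustrate the idea behind slerp merging: instead of linearly averaging two weight tensors, slerp interpolates along the arc between them on a hypersphere, which tends to preserve the magnitude and directional structure of both. Below is a minimal NumPy sketch of the interpolation itself; the function name, the per-tensor flattening, and the toy matrices are illustrative assumptions, not the actual merge implementation used for this model.

```python
import numpy as np

def slerp(t, a, b, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    Each tensor is flattened and treated as a point on a hypersphere;
    t=0 returns `a`, t=1 returns `b`.
    """
    a_flat, b_flat = a.ravel(), b.ravel()
    a_unit = a_flat / (np.linalg.norm(a_flat) + eps)
    b_unit = b_flat / (np.linalg.norm(b_flat) + eps)
    dot = np.clip(np.dot(a_unit, b_unit), -1.0, 1.0)
    omega = np.arccos(dot)            # angle between the two tensors
    if omega < eps:                   # nearly parallel: fall back to lerp
        return (1 - t) * a + t * b
    sin_omega = np.sin(omega)
    coef_a = np.sin((1 - t) * omega) / sin_omega
    coef_b = np.sin(t * omega) / sin_omega
    return (coef_a * a_flat + coef_b * b_flat).reshape(a.shape)

# Toy example: blend two small "weight matrices" at the midpoint.
w_a = np.array([[1.0, 0.0], [0.0, 1.0]])
w_b = np.array([[0.0, 1.0], [1.0, 0.0]])
merged = slerp(0.5, w_a, w_b)
```

In a real merge, an interpolation factor like `t` would typically be set per layer group (e.g. different schedules for `self_attn` and `mlp` weights), which matches the per-layer configuration this model card describes.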
Good For
- General Text Generation: Capable of handling diverse language understanding and generation tasks.
- Experimentation with Merged Models: Provides a practical example of an automatically merged model, useful for researchers and developers exploring model combination techniques.
- Applications requiring a 7B parameter model: Offers a compact yet capable solution for deployment where larger models might be impractical.