Naphula/Wicked-Moondust-12B
Naphula/Wicked-Moondust-12B is a 12-billion-parameter causal language model built on the MistralForCausalLM architecture. It is a multi-stage merge of three existing models (Ethereal-Stardust-12B, Wicked-Oblivion-12B, and Nether-Moon-12B) produced with the Arcee Fusion method. The merge is intended to combine the strengths of its source models into a general-purpose base for natural language processing tasks. The model supports a context length of 32,768 tokens.
Wicked Moondust 12B Overview
Wicked-Moondust-12B is a 12-billion-parameter language model developed by Naphula. It was built with mergekit in a multi-stage merge that uses the Arcee Fusion method. The goal of this approach is to combine the capabilities of several source models into one model that generalizes better than any single source.
Key Characteristics
- Architecture: MistralForCausalLM, a decoder-only architecture for causal (next-token) language modeling.
- Parameter Count: 12 billion parameters.
- Context Length: 32,768 tokens, shared between the prompt and the generated output.
- Merge Method: Arcee Fusion, applied through mergekit in multiple stages to combine the source models.
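The characteristics above translate into some practical numbers. The following is a minimal sketch, not an official usage recipe: it estimates memory for the weights alone at a few precisions, computes how many tokens remain for generation in the 32,768-token window, and shows one common way to load a causal language model with the Hugging Face `transformers` library. The nominal count of 12,000,000,000 parameters, the helper names, and the choice of `bfloat16` with `device_map="auto"` (which needs the `accelerate` package) are assumptions for illustration.

```python
MODEL_ID = "Naphula/Wicked-Moondust-12B"
CONTEXT_LENGTH = 32768          # context length stated on the model card
PARAM_COUNT = 12_000_000_000    # assumption: nominal "12B"; the exact count may differ

# Bytes used to store one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(dtype: str, params: int = PARAM_COUNT) -> float:
    """Approximate memory for the weights only (no KV cache or activations)."""
    return params * BYTES_PER_PARAM[dtype] / 2**30


def max_new_tokens_for(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Tokens left for generation once the prompt is counted against the window."""
    return max(context_length - prompt_tokens, 0)


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model (hypothetical helper).

    Requires `torch`, `transformers`, and `accelerate`, plus a download of
    the weights, so the imports are kept local to this function.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # assumption: half precision to reduce memory
        device_map="auto",           # assumption: let accelerate place the layers
    )
    return tokenizer, model


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: ~{weight_memory_gib(dtype):.1f} GiB for weights")
    print("tokens left after a 30,000-token prompt:", max_new_tokens_for(30000))
```

These figures cover the weights only; the KV cache grows with the number of tokens in the window, so real memory use at long contexts will be higher.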
Models Merged
Wicked-Moondust-12B is a merge of the following 12B-parameter models:
- Vortex5/Ethereal-Stardust-12B
- Vortex5/Wicked-Oblivion-12B
- Vortex5/Nether-Moon-12B
The merge is intended to draw on the different training and fine-tuning of the three source models, giving developers a single general-purpose 12B model.
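One way to picture a multi-stage merge of three models is as a chain of pairwise merges, where each stage's output becomes the base of the next. The sketch below builds such a chain as plain Python dictionaries that follow mergekit's configuration layout as commonly documented (`merge_method`, `base_model`, `models`, `dtype`). The stage order, the choice of base model, the `./stage-N` output paths, and the `bfloat16` dtype are all assumptions; the model card does not state the actual recipe, so check the field names against the mergekit documentation before using them.

```python
import json

# Source models listed on the model card; the order here is an assumption.
SOURCES = [
    "Vortex5/Ethereal-Stardust-12B",
    "Vortex5/Wicked-Oblivion-12B",
    "Vortex5/Nether-Moon-12B",
]


def fusion_stage(base: str, other: str) -> dict:
    """One pairwise merge config (hypothetical helper, mergekit-style keys)."""
    return {
        "merge_method": "arcee_fusion",
        "base_model": base,
        "models": [{"model": other}],
        "dtype": "bfloat16",  # assumption: not stated on the model card
    }


def plan_stages(sources: list[str], stage_dir: str = "./stage-{n}") -> list[dict]:
    """Chain pairwise merges so each stage's output is the next stage's base."""
    if len(sources) < 2:
        raise ValueError("need at least two models to merge")
    stages = []
    base = sources[0]
    for n, other in enumerate(sources[1:], start=1):
        stages.append(fusion_stage(base, other))
        base = stage_dir.format(n=n)  # hypothetical output path of this stage
    return stages


if __name__ == "__main__":
    # mergekit reads YAML files; JSON is printed here only to stay
    # within the standard library.
    for n, stage in enumerate(plan_stages(SOURCES), start=1):
        print(f"# stage {n}")
        print(json.dumps(stage, indent=2))
```

With three sources this yields two stages: the first merges two of the source models, and the second merges the first stage's output with the remaining one.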