MN-12B-Starcannon-v3 Overview
MN-12B-Starcannon-v3 is a 12-billion-parameter language model developed by nothingiisreal. It was created with the TIES merge method from mergekit, combining two pre-trained models so that the result draws on the strengths of both.
Merge Details
This model is a merge of:
- nothingiisreal/MN-12B-Celeste-V1.9 (used as the base model)
- anthracite-org/magnum-12b-v2
The TIES merge method was applied with per-model density and weight parameters, aiming to integrate the strengths of both components. The merge was performed in bfloat16, with normalize: true and int8_mask: true set in the configuration; a sketch of what such a mergekit configuration looks like follows.
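For concreteness, a mergekit TIES configuration for this merge would take roughly the shape below. The density and weight values are illustrative placeholders rather than the values actually used; the merge_method, base_model, dtype, and the normalize/int8_mask flags mirror what this card describes.

```yaml
# Illustrative mergekit config; density/weight values are hypothetical.
models:
  - model: nothingiisreal/MN-12B-Celeste-V1.9
    parameters:
      density: 0.5   # fraction of delta parameters kept after trimming (placeholder)
      weight: 0.5    # relative contribution to the merge (placeholder)
  - model: anthracite-org/magnum-12b-v2
    parameters:
      density: 0.5   # placeholder
      weight: 0.5    # placeholder
merge_method: ties
base_model: nothingiisreal/MN-12B-Celeste-V1.9
parameters:
  normalize: true   # rescale the merged deltas by the weight sum
  int8_mask: true   # use int8 masks to cut memory use during the merge
dtype: bfloat16
```

A configuration like this is typically executed with mergekit's CLI, e.g. mergekit-yaml config.yml ./output-model-directory.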
Key Characteristics
- Parameter Count: 12 billion parameters.
- Context Length: Supports a context window of 32,768 tokens.
- Merge Method: Utilizes the TIES (TrIm, Elect Sign & Merge) technique, as implemented in mergekit; a minimal sketch of the procedure follows this list.
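To make the method concrete, below is a minimal, self-contained sketch of the TIES procedure applied to a single weight tensor: compute each fine-tune's delta from the base, trim low-magnitude entries, elect a per-parameter majority sign, and average the agreeing deltas. This is an illustration of the published algorithm under simplified assumptions, not mergekit's actual implementation; the function name and toy tensors are hypothetical.

```python
import torch

def ties_merge(base, finetuned, densities, weights):
    """Sketch of TIES merging for one tensor (not mergekit's real code)."""
    # 1. Task vectors: per-model difference from the base weights.
    deltas = [ft - base for ft in finetuned]

    # 2. Trim: zero all but the largest-magnitude `density` fraction of each delta.
    trimmed = []
    for delta, density in zip(deltas, densities):
        k = max(1, int(density * delta.numel()))
        # The k-th largest magnitude is the (numel - k + 1)-th smallest.
        threshold = delta.abs().flatten().kthvalue(delta.numel() - k + 1).values
        trimmed.append(torch.where(delta.abs() >= threshold, delta, torch.zeros_like(delta)))

    # 3. Elect sign: per-parameter majority sign of the weighted, trimmed deltas.
    weighted = torch.stack([w * t for w, t in zip(weights, trimmed)])
    elected = torch.sign(weighted.sum(dim=0))

    # 4. Merge: weighted mean of the deltas that agree with the elected sign
    #    (dividing by the agreeing weight sum loosely mirrors normalize: true).
    total = torch.zeros_like(base)
    weight_sum = torch.zeros_like(base)
    for w, t in zip(weights, trimmed):
        agrees = (torch.sign(t) == elected) & (t != 0)
        total += torch.where(agrees, w * t, torch.zeros_like(t))
        weight_sum += torch.where(agrees, torch.full_like(t, w), torch.zeros_like(t))
    return base + total / weight_sum.clamp(min=1e-8)

# Toy demo on random tensors standing in for a single weight matrix.
base = torch.randn(8, 8)
tuned_a = base + 0.1 * torch.randn(8, 8)
tuned_b = base + 0.1 * torch.randn(8, 8)
merged = ties_merge(base, [tuned_a, tuned_b], densities=[0.5, 0.5], weights=[0.5, 0.5])
```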
Availability
Various quantized versions of the model are available, broadening accessibility and easing deployment on more modest hardware.
Intended Use Cases
As a merged model, MN-12B-Starcannon-v3 is designed for general-purpose language tasks, drawing on the combined knowledge and capabilities of its constituent models. It is suited to applications that require robust text generation, comprehension, and conversational ability; a minimal inference sketch follows.
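As a quick-start illustration, the snippet below loads the model with Hugging Face transformers and runs a short chat-style generation. The repository id is inferred from the model name on this card, and the sampling settings are arbitrary placeholders rather than recommended values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id inferred from this card's model name (assumption).
model_id = "nothingiisreal/MN-12B-Starcannon-v3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge itself was performed in bfloat16
    device_map="auto",
)

# Build a chat-style prompt via the tokenizer's chat template.
messages = [{"role": "user", "content": "Write a short scene set on a starship bridge."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling settings are placeholders, not tuned recommendations.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```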