nothingiisreal/MN-12B-Starcannon-v3

MN-12B-Starcannon-v3 Overview

MN-12B-Starcannon-v3 is a 12-billion-parameter language model developed by nothingiisreal. It was created with the TIES merge method from mergekit, combining two pre-trained models so that the merge inherits strengths from both.

Merge Details

This model is a merge of:

  • nothingiisreal/MN-12B-Celeste-V1.9 (used as the base model)
  • anthracite-org/magnum-12b-v2

The TIES merge method was applied with per-model density and weight parameters: density controls what fraction of each model's parameter deltas survives the trimming step, and weight scales that model's contribution to the merged result. The merge was performed in bfloat16, with normalize: true and int8_mask: true set in the configuration; a configuration sketch follows.
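
The exact density and weight values are not restated here. As a hedged illustration only, the sketch below shows the general shape of a mergekit TIES configuration and how to run it: the two model names, the base model, the merge method, the dtype, and the normalize/int8_mask flags come from the description above, while the density and weight numbers are placeholders, not the values actually used for this merge.

```python
import subprocess
from pathlib import Path

# Representative mergekit TIES configuration. The density/weight values are
# PLACEHOLDERS, not the numbers used for Starcannon-v3; everything else
# mirrors the merge details described above.
CONFIG = """\
models:
  - model: nothingiisreal/MN-12B-Celeste-V1.9
    parameters:
      density: 0.5   # placeholder
      weight: 0.5    # placeholder
  - model: anthracite-org/magnum-12b-v2
    parameters:
      density: 0.5   # placeholder
      weight: 0.5    # placeholder
merge_method: ties
base_model: nothingiisreal/MN-12B-Celeste-V1.9
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
"""

Path("ties-config.yml").write_text(CONFIG)

# mergekit's CLI reads the YAML and writes the merged checkpoint to a directory.
subprocess.run(["mergekit-yaml", "ties-config.yml", "./merged-model"], check=True)
```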

Key Characteristics

  • Parameter Count: approximately 12 billion.
  • Context Length: Supports a context window of 32,768 tokens.
  • Merge Method: TIES (TrIm, Elect Sign & Merge), which trims low-magnitude parameter deltas, resolves sign conflicts between the models, and merges what remains; see the inspection sketch after this list.
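
The characteristics above can be checked against the published checkpoint without downloading the weights, since transformers can fetch just the model config. A minimal inspection sketch; note that the positional limit recorded in config.json may exceed the 32,768-token window a hosting provider actually serves:

```python
from transformers import AutoConfig

# Fetch only config.json, not the 12B-parameter weights.
config = AutoConfig.from_pretrained("nothingiisreal/MN-12B-Starcannon-v3")

print(config.model_type)               # architecture family
print(config.num_hidden_layers)        # transformer depth
print(config.hidden_size)              # embedding width
print(config.max_position_embeddings)  # positional limit in the checkpoint;
                                       # hosted deployments may cap the usable
                                       # window (e.g., at 32768 tokens)
```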

Availability

Quantized versions of the model, including an FP8 build served by hosted endpoints, have been published for broader accessibility and deployment.
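
Independently of any pre-quantized uploads, the bfloat16 checkpoint can also be quantized on the fly at load time. A minimal sketch using transformers' bitsandbytes integration (assumes bitsandbytes is installed and a CUDA GPU is available):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantize to 4-bit at load time to fit the 12B model on a smaller GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype matches the merge dtype
)

model = AutoModelForCausalLM.from_pretrained(
    "nothingiisreal/MN-12B-Starcannon-v3",
    quantization_config=bnb_config,
    device_map="auto",
)
```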

Intended Use Cases

As a merged model, MN-12B-Starcannon-v3 is designed for general-purpose language tasks, benefiting from the combined knowledge of its constituent models. Because both parents are creative-writing-oriented finetunes, it is particularly suited to text generation, roleplay, and conversational use; a usage sketch follows.
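
A minimal generation sketch, assuming the checkpoint loads through the standard transformers API and that its tokenizer ships a chat template; the prompt and sampling settings are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nothingiisreal/MN-12B-Starcannon-v3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merge dtype
    device_map="auto",
)

# Use whatever chat template ships with the tokenizer instead of
# hard-coding a prompt format.
messages = [{"role": "user", "content": "Write a short scene set on a starship."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```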