Sela223/Aether-Script_12B

Text Generation | Concurrency Cost: 1 | Model Size: 12B | Quant: FP8 | Ctx Length: 32k | Published: Mar 12, 2026 | Architecture: Transformer | Cold

Sela223/Aether-Script_12B is a 12 billion parameter language model created by Sela223 by merging Sela223/Repose-Marlin-12B and Sela223/Captain-Foxfire-12B with the SLERP method. It is intended for general language tasks and combines characteristics of both parent models.

Overview

Sela223/Aether-Script_12B is a 12 billion parameter language model developed by Sela223. It is a merge of two pre-trained language models: Sela223/Repose-Marlin-12B and Sela223/Captain-Foxfire-12B. The merge was performed with SLERP (Spherical Linear Interpolation), which interpolates between two sets of weights along an arc rather than a straight line, so the resulting model inherits characteristics from both parents.
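
To make the interpolation concrete, here is a minimal sketch of SLERP on two plain vectors. It is an illustration of the general formula only, not the code used to build this model; the function name `slerp`, the `eps` threshold, and the fallback to linear interpolation for near-parallel vectors are assumptions made for this example.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors.

    t=0 returns v0, t=1 returns v1; values in between follow the arc
    spanned by the two vectors. Illustrative sketch only.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))

    # Assumption: fall back to plain linear interpolation when a vector
    # is (near) zero, since the angle between them is undefined.
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Cosine of the angle between the vectors, clamped for safety.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)

    # Assumption: near-parallel (or anti-parallel) vectors also use
    # linear interpolation to avoid dividing by a tiny sine.
    if abs(sin_theta) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Halfway between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
print(midpoint)
```

In a weight merge, the same formula would be applied tensor by tensor, with each tensor flattened into a vector; the interpolation factor `t` controls how far the result leans toward the second model.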

Merge Details

This model was built with mergekit, a toolkit for merging language models. The configuration merges all 40 layers of both base models and sets merge parameters per component: the attention projections (q_proj, k_proj, v_proj, o_proj), the self-attention block, the MLP layers (gate_proj, up_proj, down_proj), and the layernorms. This fine-grained control is intended to tune how features from the two source models are blended.
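
The idea of per-component merge parameters can be sketched as a lookup that picks an interpolation factor from a tensor's name and layer index. Everything below is hypothetical: the rule table, its values, the helper names `gradient_value` and `t_for_tensor`, and the tensor naming pattern (`model.layers.N.self_attn.q_proj.weight`) are assumptions for illustration, not the actual configuration of this model or mergekit's implementation.

```python
def gradient_value(anchors, layer_idx, num_layers):
    """Linearly interpolate a short list of anchor values across all layers."""
    if len(anchors) == 1 or num_layers == 1:
        return float(anchors[0])
    # Map the layer index onto the anchor list's index range.
    pos = layer_idx / (num_layers - 1) * (len(anchors) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(anchors) - 1)
    frac = pos - lo
    return (1 - frac) * anchors[lo] + frac * anchors[hi]


# HYPOTHETICAL rules: (substring to match in the tensor name, anchor values).
# A substring of None acts as the default for everything else (e.g. layernorms).
HYPOTHETICAL_RULES = [
    ("self_attn", [0.0, 0.5, 1.0]),
    ("mlp", [1.0, 0.5, 0.0]),
    (None, [0.5]),
]


def t_for_tensor(name, layer_idx, num_layers=40, rules=HYPOTHETICAL_RULES):
    """Return the interpolation factor for one tensor; the first matching rule wins."""
    for substring, anchors in rules:
        if substring is None or substring in name:
            return gradient_value(anchors, layer_idx, num_layers)
    raise ValueError(f"no rule matched tensor {name!r}")


# Attention weights lean toward one parent early and the other late;
# MLP weights do the opposite; everything else is blended evenly.
print(t_for_tensor("model.layers.0.self_attn.q_proj.weight", 0))
print(t_for_tensor("model.layers.39.mlp.down_proj.weight", 39))
print(t_for_tensor("model.layers.5.input_layernorm.weight", 5))
```

With rules like these, each of the 40 layers gets its own blend per component, which is what makes a per-component configuration more flexible than a single global factor.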

Key Characteristics

  • Parameter Count: 12 billion.
  • Merge Method: SLERP (Spherical Linear Interpolation) over the model weights.
  • Constituent Models: Built upon Sela223/Repose-Marlin-12B and Sela223/Captain-Foxfire-12B.

Potential Use Cases

As a merge of two general-purpose models, Aether-Script_12B is likely suitable for a range of language generation and understanding tasks, drawing on the knowledge and capabilities of both base models. Developers can experiment with it in applications that call for a 12B parameter model.