Naphula-Archives/Checkpoint-T7-24B

Text Generation · Concurrency Cost: 2 · Model Size: 24B · Quant: FP8 · Ctx Length: 32k · Published: Mar 22, 2026 · Architecture: Transformer · Cold

Checkpoint-T7-24B is a 24 billion parameter causal language model developed by Naphula-Archives, created using the della merge method. It integrates several base models, including Slimaki, Maginum Cydoms, Asmodeus v1, Asmodeus v2a, Asmodeus v2e, Magistry, and Checkpoint T6. The model is noted for producing complex, tangential outputs, often using elaborate language rather than following instructions directly, and supports a context length of 32768 tokens.


Model Overview

Checkpoint-T7-24B is a 24 billion parameter language model developed by Naphula-Archives, built using the della merge method. It is a complex merge of several base models, including Slimaki, Maginum Cydoms, Asmodeus v1, Asmodeus v2a, Asmodeus v2e, Magistry, and Checkpoint T6. Development involved iterative merging, during which the author observed that della merges tend toward unnormalized outputs and complex, tangential responses.

Key Characteristics

  • Merge Architecture: Utilizes the della merge method to combine multiple 24B parameter models, including various Asmodeus versions, Slimaki, Maginum Cydoms, and Magistry.
  • Output Style: Known for generating verbose and elaborate responses that often deviate from direct instruction, characterized by "fancy words" and tangents.
  • Development Insights: The README details experiments with the della, ties, and dare_ties merge methods, highlighting challenges such as unnormalized outputs with della leading to irrelevant tangents, and grammar collapse with ties when merging specific model versions.
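To make the merge methods above concrete, here is a rough, simplified NumPy sketch of a della-style merge. This is not mergekit's actual implementation: the function name, the magnitude-proportional keep rule, and the parameter choices are illustrative. It captures the general idea of della as DARE-style stochastic delta pruning (dropping low-magnitude delta elements with higher probability) followed by TIES-style sign election, applied per weight tensor.

```python
import numpy as np

def della_merge(base, deltas, drop_rate=0.5, seed=0):
    """Simplified sketch of a della-style merge on one weight tensor.

    base   -- base model weights (ndarray)
    deltas -- list of task vectors (fine-tuned weights minus base)

    Steps (illustrative, not mergekit's exact algorithm):
      1. Stochastically drop delta elements, keeping larger-magnitude
         elements with higher probability.
      2. Rescale survivors so the expected delta is preserved.
      3. Elect a per-parameter sign from the summed deltas and average
         only the agreeing deltas back onto the base (TIES-style).
    """
    rng = np.random.default_rng(seed)
    pruned = []
    for d in deltas:
        mag = np.abs(d)
        if mag.sum() == 0:
            pruned.append(d)  # all-zero delta: nothing to prune
            continue
        # Keep-probability proportional to relative magnitude,
        # with mean keep rate roughly (1 - drop_rate).
        keep_p = np.clip((1 - drop_rate) * mag / mag.mean(), 0.0, 1.0)
        mask = rng.random(d.shape) < keep_p
        # Rescale survivors to keep the expected delta unchanged.
        pruned.append(np.where(mask, d / np.maximum(keep_p, 1e-8), 0.0))

    stacked = np.stack(pruned)
    # Sign election: majority sign weighted by total delta mass.
    elected_sign = np.sign(stacked.sum(axis=0))
    agree = np.sign(stacked) == elected_sign
    counts = np.maximum(agree.sum(axis=0), 1)
    merged_delta = np.where(agree, stacked, 0.0).sum(axis=0) / counts
    return base + merged_delta
```

With zero deltas the merge returns the base weights unchanged; with conflicting deltas, only those agreeing with the elected sign contribute, which is the mechanism ties-style methods use to reduce interference between merged models.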

Considerations for Use

This model is experimental and exhibits distinctive conversational patterns. Users should be aware of its tendency to produce lengthy, complex, and sometimes evasive outputs rather than direct answers. It is not recommended for tasks requiring precise instruction following or concise responses. Its behavior suggests it may be suited to exploring unconventional language generation, or as a base for further fine-tuning to mitigate these tendencies.