Naphula-Archives/Acid2501-24B
Acid2501-24B by Naphula-Archives is a 24 billion parameter language model built on the MistralForCausalLM architecture, specifically designed as a '2501 only test' for Goetia. This model is a merge of several Mistral-based models, including Mistral-Small-24B-Instruct-2501, Arcee-Blitz, ArliAI-RPMax-v1.4, Dolphin-Mistral-24B-Venice-Edition, Dans-DangerousWinds-V1.1.1-24b, ReadyArt's Broken-Tutu variants, Cydonia-24B-v2, MS-24B-Instruct-Mullein-v0, BlackSheep-24B, and MistralThinker-v1.1. It features a 32768 token context length and is primarily intended for specific experimental evaluations within the Goetia framework, rather than general-purpose applications.
Acid2501 24B Overview
Acid2501-24B is a 24 billion parameter language model developed by Naphula-Archives, built upon the MistralForCausalLM architecture. This model is explicitly described as a "2501 only test for Goetia," indicating its specialized purpose in evaluating specific aspects or functionalities within the Goetia framework.
Key Characteristics
- Architecture: Based on the MistralForCausalLM family, leveraging its efficient causal language modeling capabilities.
- Parameter Count: A substantial 24 billion parameters, providing a balance between performance and computational requirements.
- Context Length: Supports a context window of 32768 tokens, allowing for processing and generating longer sequences of text.
- Model Merging: Acid2501-24B is a merge of multiple Mistral-based models using the `della` merge method. Notable components include `mistralai/Mistral-Small-24B-Instruct-2501` as the base model, alongside contributions from `arcee-ai/Arcee-Blitz`, `ArliAI/Mistral-Small-24B-ArliAI-RPMax-v1.4`, `dphn/Dolphin-Mistral-24B-Venice-Edition`, `PocketDoc/Dans-DangerousWinds-V1.1.1-24b`, `ReadyArt/4.2.0-Broken-Tutu-24b`, `ReadyArt/Broken-Tutu-24B-Transgression-v2.0`, `TheDrummer/Cydonia-24B-v2`, `trashpanda-org/MS-24B-Instruct-Mullein-v0`, `TroyDoesAI/BlackSheep-24B`, and `Undi95/MistralThinker-v1.1`.
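At a high level, `della`-style merging works on delta parameters (fine-tuned weights minus base weights): deltas are dropped stochastically with magnitude-informed probabilities, the survivors are rescaled to preserve the expected contribution, and the results are combined onto the base. A minimal, illustrative sketch on flattened parameter vectors (the function name and the simple rank-to-probability mapping are our own simplification, not mergekit's actual implementation):

```python
import numpy as np

def della_style_merge(base, finetuned_models, density=0.5, seed=0):
    """Toy sketch of della-style merging on flattened 1-D parameter vectors.

    For each fine-tuned model: compute the delta from the base, keep each
    delta entry with a probability that grows with its magnitude rank
    (mean keep probability ~= density), rescale survivors to preserve the
    expected delta, then average the pruned deltas onto the base.
    """
    rng = np.random.default_rng(seed)
    merged_delta = np.zeros_like(base)
    for ft in finetuned_models:
        delta = ft - base
        # Rank entries by magnitude: larger deltas get higher keep probability.
        ranks = np.argsort(np.argsort(np.abs(delta)))  # ranks 0..n-1
        keep_p = np.clip(2.0 * density * (ranks + 1) / len(delta), 0.0, 1.0)
        mask = rng.random(delta.shape) < keep_p
        # Rescale survivors so the expected value of each entry is unchanged.
        merged_delta += np.where(mask, delta / np.maximum(keep_p, 1e-8), 0.0)
    return base + merged_delta / len(finetuned_models)
```

Because survivors are rescaled by the inverse keep probability, the merged vector is an unbiased estimate of a plain delta average; dropping low-magnitude deltas is what reduces interference between the component models.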
Good for
- Experimental Evaluation: Primarily suited for specific testing and evaluation within the 'Goetia' framework, as indicated by its "2501 only test" designation.
- Research into Model Merging: Useful for researchers interested in the effects and performance of the `della` merge method applied to various Mistral-based models.
- Specialized Benchmarking: Can be employed for benchmarking tasks that align with its specific '2501' focus, though general performance is noted as "decent but not super impressive."
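Merges of this kind are typically expressed as a mergekit recipe. The actual configuration for Acid2501-24B is not published on this card, so the following is a hypothetical sketch only: the base model and component names come from the list above, while the `density` and `weight` values are illustrative placeholders.

```
# Hypothetical mergekit recipe for a della merge of Mistral-Small-24B variants.
# Parameter values are placeholders, NOT the recipe used for Acid2501-24B.
merge_method: della
base_model: mistralai/Mistral-Small-24B-Instruct-2501
models:
  - model: arcee-ai/Arcee-Blitz
    parameters:
      density: 0.5
      weight: 0.1
  - model: TheDrummer/Cydonia-24B-v2
    parameters:
      density: 0.5
      weight: 0.1
  # ...remaining component models from the list above follow the same pattern
dtype: bfloat16
```

Here `density` controls what fraction of delta parameters survive pruning and `weight` scales each model's contribution; tuning these per component is where most of the experimentation in a merge like this happens.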