Acid2501 24B Overview
Acid2501-24B is a 24-billion-parameter language model developed by Naphula-Archives, built on the MistralForCausalLM architecture. The model card explicitly describes it as a "2501 only test for Goetia," indicating a specialized purpose: evaluating specific aspects or functionality within the Goetia framework.
Key Characteristics
- Architecture: Based on the MistralForCausalLM family, leveraging its efficient causal language modeling capabilities.
- Parameter Count: 24 billion parameters, balancing capability against computational requirements.
- Context Length: Supports a context window of 32768 tokens, allowing for processing and generating longer sequences of text.
- Model Merging: Acid2501-24B is a merge of multiple Mistral-based models using the della merge method. mistralai/Mistral-Small-24B-Instruct-2501 serves as the base model, with contributions from arcee-ai/Arcee-Blitz, ArliAI/Mistral-Small-24B-ArliAI-RPMax-v1.4, dphn/Dolphin-Mistral-24B-Venice-Edition, PocketDoc/Dans-DangerousWinds-V1.1.1-24b, ReadyArt/4.2.0-Broken-Tutu-24b, ReadyArt/Broken-Tutu-24B-Transgression-v2.0, TheDrummer/Cydonia-24B-v2, trashpanda-org/MS-24B-Instruct-Mullein-v0, TroyDoesAI/BlackSheep-24B, and Undi95/MistralThinker-v1.1.
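Merges like this are typically expressed as a mergekit configuration. The sketch below is a hypothetical illustration of a della merge over a small subset of the listed components; the weights, densities, and dtype are placeholder values, not the actual Acid2501 recipe:

```yaml
# Hypothetical mergekit config sketch -- NOT the actual Acid2501 recipe.
base_model: mistralai/Mistral-Small-24B-Instruct-2501
merge_method: della
models:
  - model: arcee-ai/Arcee-Blitz
    parameters:
      weight: 0.2    # placeholder contribution weight
      density: 0.5   # placeholder fraction of delta parameters kept
  - model: TheDrummer/Cydonia-24B-v2
    parameters:
      weight: 0.2
      density: 0.5
dtype: bfloat16
```

A full recipe would list all ten contributing checkpoints with their own weight and density parameters.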
Good for
- Experimental Evaluation: Primarily suited for specific testing and evaluation within the 'Goetia' framework, as indicated by its "2501 only test" designation.
- Research into Model Merging: Useful for researchers interested in the effects and performance of the della merge method across various Mistral-based models.
- Specialized Benchmarking: Can be employed for benchmarking tasks aligned with its '2501' focus, though general performance is noted as "decent but not super impressive."
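At its core, a della-style merge probabilistically drops delta (task-vector) parameters with magnitude-dependent probabilities and rescales the survivors before combining them onto the base weights. The NumPy sketch below is a simplified toy illustration of that idea on small arrays, not mergekit's actual implementation; the function name and the magnitude-to-probability scaling are assumptions for demonstration:

```python
import numpy as np

def della_merge_toy(base, finetuned_list, keep=0.5, eps=1e-8, seed=0):
    """Toy sketch of a della-style merge: magnitude-aware random dropping
    of delta parameters, rescaling of survivors, then averaging the pruned
    deltas back onto the base weights. Illustration only -- the real method
    (and mergekit's implementation) differs in detail."""
    rng = np.random.default_rng(seed)
    merged_delta = np.zeros_like(base)
    for ft in finetuned_list:
        delta = ft - base  # task vector for this fine-tuned checkpoint
        mag = np.abs(delta)
        # Keep-probability proportional to magnitude, scaled so the mean
        # keep-probability is roughly `keep` (the target density).
        p_keep = np.clip(keep * mag / (mag.mean() + eps), 0.0, 1.0)
        mask = rng.random(delta.shape) < p_keep
        # Rescale survivors by 1/p_keep so the merge is unbiased in expectation.
        pruned = np.where(mask, delta / (p_keep + eps), 0.0)
        merged_delta += pruned
    return base + merged_delta / len(finetuned_list)
```

In an actual merge these operations run tensor-by-tensor across all contributing checkpoints, with per-model weights and densities taken from the merge configuration.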