NexesMess/Llama_3.3_70b_FallenMare

Text generation · Concurrency cost: 4 · Model size: 70B · Quant: FP8 · Context length: 32k · Published: Apr 4, 2025 · Architecture: Transformer

NexesMess/Llama_3.3_70b_FallenMare is a 70 billion parameter language model created by NexesMess through a Model Stock merge of several Llama 3.3-based models, using SicariusSicariiStuff/Negative_LLAMA_70B as its base. This model combines the characteristics of TheDrummer/Fallen-Llama-3.3-R1-70B-v1, SentientAGI/Dobby-Unhinged-Llama-3.3-70B, LatitudeGames/Wayfarer-Large-70B-Llama-3.3, and EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1. It is designed to leverage the combined strengths of its constituent models, offering a broad range of general-purpose language capabilities with a 32768 token context length.


Overview

NexesMess/Llama_3.3_70b_FallenMare is a 70 billion parameter language model produced with the Model Stock merge method. It integrates four distinct Llama 3.3-based models on top of SicariusSicariiStuff/Negative_LLAMA_70B as the foundational base, with the merge executed using mergekit, a toolkit for combining pre-trained language models.
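The actual merge configuration is not reproduced on this page. A minimal mergekit sketch of what it likely looked like, with the model list and base taken from the description above; the `dtype` and any omitted options are assumptions, not the author's actual settings:

```yaml
# Hypothetical mergekit config for a Model Stock merge
# (model list from the description above; options are assumptions)
merge_method: model_stock
base_model: SicariusSicariiStuff/Negative_LLAMA_70B
models:
  - model: TheDrummer/Fallen-Llama-3.3-R1-70B-v1
  - model: SentientAGI/Dobby-Unhinged-Llama-3.3-70B
  - model: LatitudeGames/Wayfarer-Large-70B-Llama-3.3
  - model: EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
dtype: bfloat16
```

A config like this would typically be run with `mergekit-yaml config.yaml ./output-model-dir`, which writes the merged weights to the given directory.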

Key Capabilities

  • Composite Intelligence: By merging TheDrummer/Fallen-Llama-3.3-R1-70B-v1, SentientAGI/Dobby-Unhinged-Llama-3.3-70B, LatitudeGames/Wayfarer-Large-70B-Llama-3.3, and EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1, this model aims to inherit and synthesize their respective strengths.
  • Model Stock Methodology: Uses the Model Stock merge method, described in its accompanying arXiv paper, which interpolates between the base model and the average of the fine-tuned models to produce a robust combined model.
  • 70 Billion Parameters: Offers a substantial parameter count, indicative of strong general language understanding and generation capabilities.
  • 32768 Token Context: Supports a large context window, enabling the processing and generation of longer texts while maintaining coherence.
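To make the Model Stock idea above concrete, here is a small pure-Python sketch of the core interpolation on a single flattened weight tensor, as I understand the method: measure how aligned the fine-tuned models' task vectors are, then pull the average of the fine-tuned weights back toward the base by a ratio derived from that alignment. This is an illustrative toy, not mergekit's implementation, and the exact formula is an assumption based on the Model Stock paper.

```python
import math

def cos_angle(u, v):
    # Cosine of the angle between two vectors (lists of floats).
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def model_stock_merge(base, finetuned):
    """Merge one flattened weight tensor from k fine-tuned models toward
    the base, using the interpolation ratio
        t = k*cos(theta) / (1 + (k - 1)*cos(theta)),
    where cos(theta) is estimated as the mean pairwise cosine between the
    models' task vectors (fine-tuned weights minus the base)."""
    k = len(finetuned)
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    cos_t = sum(cos_angle(deltas[i], deltas[j]) for i, j in pairs) / len(pairs)
    t = k * cos_t / (1 + (k - 1) * cos_t)
    # Interpolate between the base and the average of the fine-tuned weights.
    avg = [sum(col) / k for col in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(avg, base)]
```

When the task vectors agree perfectly (cosine 1), t becomes 1 and the merge is just the average of the fine-tuned models; when they are orthogonal (cosine 0), t becomes 0 and the merge falls back to the base weights.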

Good for

  • General-purpose language tasks: Suitable for a wide array of applications requiring robust language understanding and generation.
  • Exploration of merged model performance: Ideal for researchers and developers interested in the outcomes of advanced model merging techniques.
  • Applications benefiting from diverse model characteristics: Leverages the combined strengths of its constituent models for potentially enhanced performance across various domains.
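As a usage sketch for general-purpose text generation with this model, the snippet below builds a chat-completion payload in the OpenAI-compatible format many serving stacks accept; the model identifier comes from this listing, but the endpoint shape, field values, and the helper itself are assumptions, not a documented API for this model.

```python
import json

# Model identifier from this listing; context length from the metadata above.
MODEL_ID = "NexesMess/Llama_3.3_70b_FallenMare"
CONTEXT_LENGTH = 32768

def build_request(messages, max_tokens=512):
    """Build an OpenAI-style chat-completion payload, leaving headroom
    inside the 32k context window for the completion itself.
    Hypothetical helper for illustration only."""
    if max_tokens >= CONTEXT_LENGTH:
        raise ValueError("max_tokens must leave room for the prompt")
    return json.dumps({
        "model": MODEL_ID,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": 0.7,  # arbitrary example value
    })

payload = build_request(
    [{"role": "user", "content": "Summarize Model Stock merging."}]
)
```

The resulting JSON string could be POSTed to whatever OpenAI-compatible endpoint is serving the model.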