TareksLab/DarkThoughts-V3-LLaMa-70B

Text Generation · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Context Length: 32k · Architecture: Transformer · Gated · Cold

TareksLab/DarkThoughts-V3-LLaMa-70B is a 70 billion parameter language model created by TareksLab, built upon the Llama 3.3 architecture. This model is a merge of several pre-trained Llama 3.3-based models, including Vulpecula-r1, Wanton-Wolf, and Wayfarer-Large, using the Model Stock merge method. With a 32768-token context length, it is designed to combine the strengths of its constituent models for enhanced general-purpose language generation and understanding.


DarkThoughts-V3-LLaMa-70B: A Merged Llama 3.3 Model

DarkThoughts-V3-LLaMa-70B is a 70 billion parameter language model developed by TareksLab. It is constructed using the Model Stock merge method, combining multiple Llama 3.3-based models to leverage their collective capabilities. The base model for this merge is huihui-ai/Llama-3.3-70B-Instruct-abliterated.

Key Merge Components

This model integrates the following Llama 3.3-70B variants; a sketch of how such a merge can be expressed is shown after the list:

  • Sao10K/Llama-3.3-70B-Vulpecula-r1
  • Mawdistical/Wanton-Wolf-70B
  • LatitudeGames/Wayfarer-Large-70B-Llama-3.3
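For illustration, a Model Stock merge of these components could be specified with the mergekit toolkit. The sketch below is an assumption about how the merge might be configured, not TareksLab's published recipe; in particular the dtype and output path are hypothetical.

```python
# Hypothetical mergekit configuration for a Model Stock merge of the listed
# components -- an illustrative sketch, not TareksLab's published recipe.
from pathlib import Path

config = """\
merge_method: model_stock
base_model: huihui-ai/Llama-3.3-70B-Instruct-abliterated
models:
  - model: Sao10K/Llama-3.3-70B-Vulpecula-r1
  - model: Mawdistical/Wanton-Wolf-70B
  - model: LatitudeGames/Wayfarer-Large-70B-Llama-3.3
dtype: bfloat16  # assumed; the card does not state the merge dtype
"""

Path("model_stock.yml").write_text(config)
# The merge itself would then be run with mergekit's CLI, e.g.:
#   mergekit-yaml model_stock.yml ./DarkThoughts-V3-LLaMa-70B
```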

Technical Specifications

  • Parameter Count: 70 Billion
  • Context Length: 32768 tokens
  • Merge Method: Model Stock, a weight-space merging technique that combines the constituent models with the base model.
  • Tokenizer: Uses the tokenizer from Sao10K/Llama-3.3-70B-Vulpecula-r1, configured for the Llama 3 chat template (see the usage sketch after this list).
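Because the tokenizer ships with the standard Llama 3 chat template, prompts can be formatted with transformers' `apply_chat_template`. A minimal sketch, assuming the checkpoint is available under the Hub name above and that enough GPU memory (or a suitably quantized copy) is available; the dtype and prompt are illustrative:

```python
# Minimal sketch: format a prompt with the Llama 3 chat template and generate.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TareksLab/DarkThoughts-V3-LLaMa-70B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # dtype assumed; adjust to your hardware
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the Model Stock merge method."},
]

# apply_chat_template renders the messages into the Llama 3 prompt format.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```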

Potential Use Cases

Given its foundation in multiple Llama 3.3-70B models and its 32768-token context window, DarkThoughts-V3-LLaMa-70B is suitable for:

  • Advanced text generation tasks requiring nuanced understanding.
  • Applications benefiting from a broad knowledge base and extended context.
  • Exploration of merged model performance in complex language understanding and generation scenarios.
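The listing's FP8 quantization and 4-way concurrency suggest a serving setup along the lines of vLLM. The following is a hedged sketch under that assumption; the quantization flag, context limit, parallelism setting, and sampling parameters are illustrative and depend on the actual checkpoint and hardware:

```python
# Sketch: serve the model with vLLM at the full 32k context using FP8 weights.
# Assumes hardware with FP8 support and a checkpoint compatible with vLLM's
# fp8 quantization path.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TareksLab/DarkThoughts-V3-LLaMa-70B",
    quantization="fp8",      # matches the FP8 quant noted in the listing
    max_model_len=32768,     # the model's full 32k context window
    tensor_parallel_size=4,  # illustrative; size to your GPU count
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Write a short scene set on a night train."], params)
print(outputs[0].outputs[0].text)
```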