DavidAU/L3-Dark-Planet-8B

Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 8K · Published: Sep 5, 2024 · Architecture: Transformer

DavidAU/L3-Dark-Planet-8B is an 8 billion parameter language model created by DavidAU, built with mergekit as a blend of Sao10K/L3-8B-Stheno-v3.2, NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS, and Hastagaras/Jamet-8B-L3-MK.V-Blackroot on a meta-llama/Meta-Llama-3-8B-Instruct base. The model is distributed as full-precision safetensors, from which various quantized formats (GGUF, GPTQ, EXL2, AWQ, HQQ) can be generated for flexible deployment. It has an 8192 token context length and is aimed at users who want customizability: its mergekit composition allows the base to be swapped for a Llama 3.1 model and the per-layer merge weights to be tuned.


Overview

DavidAU/L3-Dark-Planet-8B is an 8 billion parameter language model developed by DavidAU, constructed using a dare_ties merge method via mergekit. It combines three distinct Llama 3-based models: Sao10K/L3-8B-Stheno-v3.2, NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS, and Hastagaras/Jamet-8B-L3-MK.V-Blackroot, with meta-llama/Meta-Llama-3-8B-Instruct as its base.
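The actual merge recipe is not reproduced on this page, but a `dare_ties` mergekit recipe combining these models has the following general shape. The `weight` and `density` values below are illustrative placeholders, not DavidAU's actual parameters:

```yaml
# Illustrative mergekit recipe (dare_ties merge of three Llama 3 models).
# weight/density values are placeholders, not the published recipe.
models:
  - model: Sao10K/L3-8B-Stheno-v3.2
    parameters:
      weight: 0.4
      density: 0.5
  - model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
    parameters:
      weight: 0.3
      density: 0.5
  - model: Hastagaras/Jamet-8B-L3-MK.V-Blackroot
    parameters:
      weight: 0.3
      density: 0.5
merge_method: dare_ties
base_model: meta-llama/Meta-Llama-3-8B-Instruct
dtype: bfloat16
```

In `dare_ties`, `density` controls what fraction of each model's delta parameters survive random dropping, and `weight` scales each model's contribution to the final merge.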

Key Capabilities & Customization

  • Flexible Format Generation: Distributed in full-precision safetensors format, enabling the generation of various quantized versions, including GGUF, GPTQ, EXL2, AWQ, and HQQ.
  • Mergekit Composition: The model's creation via mergekit allows for significant customization. Users can replace the base model with Llama 3.1 or Nvidia Llama 3.1 models to create 128k context versions or different "flavors."
  • Layer-Specific Weighting: The merge process allows for fine-tuning of weights across 4-layer blocks, with the option to expand to 32 layers for granular control over each model's contribution.
  • High-Quality Settings: The model card emphasizes that specific parameter, sampler, and advanced-sampler settings are important for optimal operation; an external guide details how to maximize performance across use cases, including chat and roleplay.
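The 4-layer block weighting described above can be sketched as follows. `expand_block_weights` is a hypothetical helper for illustration, not part of mergekit's API; mergekit itself accepts per-layer weight lists directly in the recipe:

```python
def expand_block_weights(block_weights, layers_per_block=4):
    """Expand one weight per 4-layer block into a per-layer weight list.

    A 32-layer Llama 3 8B model controlled in 4-layer blocks needs
    8 block weights; the returned list has one entry per transformer
    layer, giving the expanded 32-layer form mentioned in the text.
    """
    return [w for w in block_weights for _ in range(layers_per_block)]

# 8 block weights -> 32 per-layer weights (values are illustrative)
per_layer = expand_block_weights([0.2, 0.3, 0.4, 0.5, 0.5, 0.4, 0.3, 0.2])
print(len(per_layer))  # 32
```

Expanding to the full 32-layer form lets each layer's contribution be tuned individually instead of in coarse 4-layer groups.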

Good for

  • Developers and researchers looking for a highly customizable Llama 3-based model.
  • Users who need to generate various quantized formats for different deployment environments.
  • Experimentation with model merging and fine-tuning layer contributions.
  • Those seeking a model with an 8192 token context length that can be adapted for diverse applications through careful parameter tuning.

Popular Sampler Settings

The three parameter combinations most used by Featherless users for this model adjust the following samplers:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
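For reference, here is a minimal NumPy sketch of how several of the listed samplers filter a next-token distribution. The exact order and implementation details vary by inference engine, and the repetition/frequency/presence penalties are omitted because they depend on prior context:

```python
import numpy as np

def sample_filters(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    """Apply common sampler filters to a 1-D logits array; return probs.

    Sketch only: engines differ in filter order and edge-case handling.
    """
    logits = logits / max(temperature, 1e-8)   # temperature scaling
    probs = np.exp(logits - logits.max())      # stable softmax
    probs /= probs.sum()

    keep = np.ones_like(probs, dtype=bool)
    if top_k > 0:                              # keep the k most probable tokens
        kth = np.sort(probs)[-top_k]
        keep &= probs >= kth
    if top_p < 1.0:                            # nucleus: smallest set with
        order = np.argsort(probs)[::-1]        # cumulative prob >= top_p
        cum = np.cumsum(probs[order])
        cutoff = order[: np.searchsorted(cum, top_p) + 1]
        mask = np.zeros_like(keep)
        mask[cutoff] = True
        keep &= mask
    if min_p > 0.0:                            # drop tokens far below the mode
        keep &= probs >= min_p * probs.max()

    probs = np.where(keep, probs, 0.0)
    return probs / probs.sum()

p = sample_filters(np.array([2.0, 1.0, 0.5, -1.0]), temperature=0.8, top_k=3)
```

In the example call, `top_k=3` zeroes out the least probable of the four tokens and renormalizes the rest, which is why low `top_k` or high `min_p` values make generation more conservative.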