DavidAU/L3-Dark-Planet-8B
DavidAU/L3-Dark-Planet-8B is an 8 billion parameter language model created by DavidAU, built with mergekit as a blend of Sao10K/L3-8B-Stheno-v3.2, NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS, and Hastagaras/Jamet-8B-L3-MK.V-Blackroot on a Meta-Llama-3-8B-Instruct base. The model is provided in full-precision safetensors format as a source for generating various quantized formats (GGUF, GPTQ, EXL2, AWQ, HQQ) and is designed for flexible deployment. It features an 8192-token context length, and its mergekit composition makes it highly customizable, allowing adaptation to different Llama 3.1 base models and fine-tuning of per-layer merge weights.
Overview
DavidAU/L3-Dark-Planet-8B is an 8 billion parameter language model developed by DavidAU, constructed using a dare_ties merge method via mergekit. It combines three distinct Llama 3-based models: Sao10K/L3-8B-Stheno-v3.2, NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS, and Hastagaras/Jamet-8B-L3-MK.V-Blackroot, with meta-llama/Meta-Llama-3-8B-Instruct as its base.
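The dare_ties merge described above can be sketched as a mergekit YAML config. The structure below follows mergekit's standard config schema, but the specific `weight` and `density` values are hypothetical placeholders, not the actual values DavidAU used:

```yaml
# Hypothetical sketch of a dare_ties merge for this model.
# Weights and densities are illustrative only.
models:
  - model: Sao10K/L3-8B-Stheno-v3.2
    parameters:
      weight: 0.35   # assumed value
      density: 0.5   # assumed value
  - model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
    parameters:
      weight: 0.35   # assumed value
      density: 0.5   # assumed value
  - model: Hastagaras/Jamet-8B-L3-MK.V-Blackroot
    parameters:
      weight: 0.30   # assumed value
      density: 0.5   # assumed value
merge_method: dare_ties
base_model: meta-llama/Meta-Llama-3-8B-Instruct
dtype: bfloat16
```

Swapping `base_model` for a Llama 3.1 checkpoint is how the 128k-context variants mentioned below would be produced.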
Key Capabilities & Customization
- Flexible Format Generation: Provided in full precision "safe tensors" format, enabling the generation of various quantized versions including GGUFs, GPTQ, EXL2, AWQ, and HQQ.
- Mergekit Composition: The model's creation via mergekit allows for significant customization. Users can replace the base model with Llama 3.1 or Nvidia Llama 3.1 models to create 128k-context versions or different "flavors."
- Layer-Specific Weighting: The merge process allows fine-tuning of weights across 4-layer blocks, with the option to expand to 32 layers for granular control over each model's contribution.
- High-Quality Settings: Optimal operation depends on specific parameter, sampler, and advanced-sampler settings; these are detailed in an external guide covering various use cases, including chat and roleplay.
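To illustrate why the quantized formats above matter for deployment, here is a back-of-envelope estimate of the weight memory footprint at common GGUF bit-widths. The effective bits-per-weight figures are rough approximations I am assuming for illustration, not measurements of this model:

```python
# Rough weight-only memory estimates for an 8B-parameter model.
# Excludes KV cache and runtime overhead; bpw values are approximate.

N_PARAMS = 8e9  # approximate parameter count for an 8B model

def approx_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in GB (decimal)."""
    return n_params * bits_per_weight / 8 / 1e9

# (label, assumed effective bits per weight)
formats = [
    ("FP16 safetensors", 16.0),
    ("Q8_0 GGUF", 8.5),
    ("Q5_K_M GGUF", 5.5),
    ("Q4_K_M GGUF", 4.85),
]

for label, bits in formats:
    print(f"{label:18s} ~{approx_weight_gb(N_PARAMS, bits):5.1f} GB")
```

At 16-bit precision the weights alone are around 16 GB, while a 4-bit-class quantization brings them under 5 GB, which is the practical reason the full-precision source is re-released in so many quantized formats.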
Good for
- Developers and researchers looking for a highly customizable Llama 3-based model.
- Users who need to generate various quantized formats for different deployment environments.
- Experimentation with model merging and fine-tuning layer contributions.
- Those seeking a model with an 8192 token context length that can be adapted for diverse applications through careful parameter tuning.