darkc0de/BuddyGlassNeverSleeps

TEXT GENERATION · Model Size: 8B · Quant: FP8 · Context Length: 32k · Concurrency Cost: 1 · Published: Sep 16, 2024 · Architecture: Transformer

darkc0de/BuddyGlassNeverSleeps is an 8-billion-parameter language model merged with the della method, using Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2 as the base and incorporating mlabonne/NeuralDaredevil-8B-abliterated. It supports a 32,768-token context window, targets general language tasks, and has been evaluated on benchmarks including IFEval and BBH. Its merge configuration uses normalized weights and int8 masking.


Model Overview

darkc0de/BuddyGlassNeverSleeps is an 8-billion-parameter language model created by merging pre-trained models with the mergekit tool. It uses the della merge method, with Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2 serving as the base model and mlabonne/NeuralDaredevil-8B-abliterated merged into it. The merge configuration specifies normalized weights, int8 masking, and a density of 0.7.
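The parameters above could be expressed as a mergekit configuration along these lines. This is a sketch, not the published config: only the merge method, base model, second model, density, normalize, and int8_mask values come from the card; the weight value and dtype are assumptions.

```yaml
# Hypothetical mergekit config sketch for this merge.
merge_method: della
base_model: Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2
models:
  - model: mlabonne/NeuralDaredevil-8B-abliterated
    parameters:
      density: 0.7    # from the model card
      weight: 1.0     # assumed; not stated on the card
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16       # assumed merge dtype
```

A file like this would typically be passed to `mergekit-yaml` to produce the merged checkpoint.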

Key Characteristics

  • Merge Method: Della merge, combining two distinct 8B parameter models.
  • Base Model: Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2.
  • Context Length: Supports a substantial 32768 token context window.
  • Configuration: The merge sets int8_mask and normalize; int8 masking reduces memory overhead during merging, while normalization keeps the combined weights on a consistent scale.

Performance Benchmarks

Evaluated on the Open LLM Leaderboard, BuddyGlassNeverSleeps achieved an average score of 19.73. Specific benchmark results include:

  • IFEval (0-Shot): 42.39
  • BBH (3-Shot): 28.48
  • MMLU-PRO (5-Shot): 27.25

These scores provide insight into its capabilities across various reasoning and knowledge-based tasks.

Use Cases

This model suits general language generation and understanding tasks that benefit from a merged architecture and a large context window. Its IFEval and BBH results indicate potential for instruction following and multi-step reasoning.
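As a rough capacity check, the figures on this card (8B parameters at FP8, 32k context) can be turned into a back-of-the-envelope memory estimate. The attention geometry used below (32 layers, 8 KV heads, head dimension 128) is the published Llama 3.1 8B configuration, assumed to carry over to this merge, and the FP16 KV cache is also an assumption:

```python
# Back-of-the-envelope VRAM estimate for an 8B FP8 model at 32k context.
PARAMS = 8e9          # 8 billion parameters (from the card)
WEIGHT_BYTES = 1      # FP8 = 1 byte per parameter (from the card)

N_LAYERS = 32         # assumed Llama 3.1 8B geometry
N_KV_HEADS = 8        # grouped-query attention KV heads (assumed)
HEAD_DIM = 128        # assumed
KV_BYTES = 2          # assume an FP16 KV cache
CTX = 32_768          # context length (from the card)

weights_gb = PARAMS * WEIGHT_BYTES / 1e9

# KV cache: keys + values, per layer, per KV head, per token.
kv_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * KV_BYTES
kv_gb = kv_per_token * CTX / 1e9

print(f"weights: {weights_gb:.1f} GB")    # ~8.0 GB
print(f"kv cache @ 32k: {kv_gb:.1f} GB")  # ~4.3 GB
```

Under these assumptions, serving the full 32k context needs roughly 12 GB of accelerator memory before activation overhead, which is why the FP8 quantization matters for single-GPU deployment.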