SteelStorage/L3-Aethora-15B-V2

Text Generation | Concurrency Cost: 1 | Model Size: 15B | Quant: FP8 | Ctx Length: 8k | Tool Calling: Supported | Published: Jun 27, 2024 | License: cc-by-sa-4.0 | Architecture: Transformer | Open Weights

SteelStorage/L3-Aethora-15B-V2 is a 15 billion parameter language model developed by ZeusLabs and built on the Llama 3 architecture. It was fine-tuned for 17.5 hours on a curated dataset and supports an 8192-token context length. The model is tuned for creative writing, storytelling, and general intelligence tasks, including narrative generation and detailed scientific discussion.


Overview

L3-Aethora-15B v2 is a 15 billion parameter language model from ZeusLabs, based on the Llama 3 architecture. It was fine-tuned with LoRA (Low-Rank Adaptation) for 4 epochs at a sequence length of 8192 tokens, which took 17.5 hours on 4 x A100 GPUs. Training used the Aether-Lite-V1.8.1 dataset, which was collected from 12 diverse sources, preprocessed with language detection and text sanitization, and deduplicated with fuzzy matching to remove near-identical samples.
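To give a rough sense of why LoRA keeps fine-tuning cheap, the sketch below counts the trainable parameters a single adapter adds to one weight matrix. The rank and hidden size are illustrative assumptions; the card does not state the LoRA rank, the target modules, or the layer dimensions used for this model.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter.

    LoRA freezes the original (d_out x d_in) weight and learns two small
    matrices, A (rank x d_in) and B (d_out x rank), whose product is added
    to the frozen weight.
    """
    return rank * (d_in + d_out)


def full_weight_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix itself."""
    return d_in * d_out


# Both values are assumptions for illustration, not taken from the model card.
HIDDEN_SIZE = 4096  # hypothetical projection width
LORA_RANK = 16      # hypothetical LoRA rank

adapter = lora_trainable_params(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)
full = full_weight_params(HIDDEN_SIZE, HIDDEN_SIZE)
print(f"adapter params: {adapter}")
print(f"full matrix params: {full}")
print(f"trainable fraction: {adapter / full:.4%}")
```

Under these assumed numbers the adapter trains under 1% of the parameters of the matrix it modifies, which is the general reason a LoRA run can fit in far less memory than full fine-tuning.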

Key Capabilities

  • Creative Writing and Storytelling: Excels at generating engaging narratives, poetry, and creative content, adapting to various genres and tones.
  • General Intelligence: Capable of detailed discussions on medical and scientific topics, explaining complex phenomena, and assisting in literature review.
  • Instructional and Educational Content: Creates comprehensive tutorials, how-to guides, and educational materials with clarity.
  • Reasoning and Problem-Solving: Analyzes complex scenarios, provides logical solutions, and engages in step-by-step problem-solving.
  • Contextual Understanding: Maintains coherent, context-aware conversations across extended interactions and adapts communication style.

Training Details

The model was fine-tuned from elinas/Llama-3-15B-Instruct-zeroed. The Aether-Lite-V1.8.1 dataset contains 125,119 samples. It was built by collecting data from multiple sources, preprocessing it (including language detection and text sanitization), and applying fuzzy deduplication with a 95% similarity threshold. The result is a dataset balanced across creative writing, practical knowledge, and intellectual depth.
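As an illustration of fuzzy deduplication at a 95% similarity threshold, here is a minimal sketch using Python's standard-library `difflib`. It is not the pipeline used to build Aether-Lite; the function name, the normalization step, and the choice of `SequenceMatcher` as the similarity measure are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def fuzzy_dedupe(samples, threshold=0.95):
    """Keep the first occurrence of each sample and drop later near-duplicates.

    Two samples count as duplicates when the similarity ratio of their
    normalized text is at or above `threshold`.
    """
    kept = []  # list of (original_text, normalized_text) pairs
    for text in samples:
        # Hypothetical normalization: lowercase and collapse whitespace.
        normalized = " ".join(text.lower().split())
        is_duplicate = any(
            SequenceMatcher(None, normalized, prior).ratio() >= threshold
            for _, prior in kept
        )
        if not is_duplicate:
            kept.append((text, normalized))
    return [original for original, _ in kept]


samples = [
    "The quick brown fox jumps over the lazy dog.",
    "The quick brown fox jumps over the lazy dog!",
    "A completely different sentence about llamas.",
]
unique = fuzzy_dedupe(samples)
print(len(unique), "of", len(samples), "samples kept")
```

This pairwise comparison is quadratic in the number of samples, so it only suits small collections. For a dataset of this size, an approximate method such as MinHash with locality-sensitive hashing would be the more practical choice.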

Benchmarks

According to the Open LLM Leaderboard, L3-Aethora-15B v2 achieved an average score of 24.57 across the leaderboard's benchmarks. Selected results:

  • IFEval (0-Shot): 72.08
  • BBH (3-Shot): 28.97
  • MMLU-PRO (5-shot): 27.78

Good for

This model is particularly well-suited for applications requiring strong creative text generation, detailed scientific explanations, educational content creation, and robust reasoning capabilities within an 8192-token context.
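Because the prompt and the generated text share the same 8192-token window, it can help to check the remaining budget before sending a request. The helper below is a minimal sketch; the function name is hypothetical, and the token counts are assumed to come from whichever tokenizer is used with the model.

```python
CONTEXT_LENGTH = 8192  # context length stated on this card


def max_new_tokens_allowed(prompt_tokens: int,
                           context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt.

    Returns 0 when the prompt alone fills or exceeds the context window,
    in which case the prompt must be shortened first.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_length - prompt_tokens, 0)


# Example: a 6000-token story outline leaves room for about 2000 more tokens.
print(max_new_tokens_allowed(6000))
```

For long creative-writing sessions, the same check can decide when older turns need to be summarized or dropped so that the reply still fits.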