Overview

L3-Aethora-15B v2 is a 15-billion-parameter language model from ZeusLabs, based on the Llama 3 architecture. It was fine-tuned with LoRA (Low-Rank Adaptation) for 4 epochs at a sequence length of 8192 tokens, which took 17.5 hours on 4 x A100 GPUs. The training data is the Aether-Lite-V1.8.1 dataset, collected from 12 sources, preprocessed with language detection and text sanitization, and fuzzy-deduplicated to remove near-identical samples.

Key Capabilities

Creative Writing and Storytelling: Excels at generating engaging narratives, poetry, and creative content, adapting to various genres and tones.
General Intelligence: Capable of detailed discussions on medical and scientific topics, explaining complex phenomena, and assisting in literature review.
Instructional and Educational Content: Creates comprehensive tutorials, how-to guides, and educational materials with clarity.
Reasoning and Problem-Solving: Analyzes complex scenarios, provides logical solutions, and engages in step-by-step problem-solving.
Contextual Understanding: Maintains coherent, context-aware conversations across extended interactions and adapts communication style.

Training Details

The model was fine-tuned from elinas/Llama-3-15B-Instruct-zeroed. The Aether-Lite-V1.8.1 dataset, comprising 125,119 samples, was preprocessed with language detection and text sanitization, then fuzzy-deduplicated at a 95% similarity threshold. The resulting dataset balances creative writing, practical knowledge, and intellectual depth.
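
Fuzzy deduplication at a 95% similarity threshold can be illustrated with a minimal sketch. The dataset's actual tooling and similarity metric are not described here, so this example assumes Python's standard-library difflib.SequenceMatcher as a stand-in metric; the function names normalize and fuzzy_dedup and the sample strings are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def fuzzy_dedup(samples: list[str], threshold: float = 0.95) -> list[str]:
    """Keep the first occurrence of each sample and drop later near-duplicates.

    A sample is dropped when its similarity ratio to any already-kept sample
    is at or above the threshold. The pairwise comparison is O(n^2), so this
    is only practical for small inputs.
    """
    kept: list[str] = []
    kept_norm: list[str] = []
    for sample in samples:
        norm = normalize(sample)
        is_duplicate = any(
            SequenceMatcher(None, norm, prev).ratio() >= threshold
            for prev in kept_norm
        )
        if not is_duplicate:
            kept.append(sample)
            kept_norm.append(norm)
    return kept


# Hypothetical samples: the first two differ only in the final character.
samples = [
    "The quick brown fox jumps over the lazy dog.",
    "The quick brown fox jumps over the lazy dog!",
    "A completely different sentence about language models.",
]
unique = fuzzy_dedup(samples)
```

A dataset of 125,119 samples would likely need an approximate method such as MinHash with locality-sensitive hashing rather than pairwise comparison; the sketch only shows what the threshold means.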

Benchmarks

According to the Open LLM Leaderboard, L3-Aethora-15B v2 achieved an average score of 24.57 across the leaderboard's benchmarks. Selected metrics include:

IFEval (0-shot): 72.08
BBH (3-shot): 28.97
MMLU-PRO (5-shot): 27.78

Good for

This model is particularly well-suited for applications requiring strong creative text generation, detailed scientific explanations, educational content creation, and robust reasoning capabilities within an 8192-token context.
