L3-Aethora-15B: A Compliant and Creative Llama3-based Model
L3-Aethora-15B, developed by Steelskull, is a 15 billion parameter model built upon the Llama3 architecture. Its core innovation is an "abliteration" method designed to adjust model responses, specifically inhibiting refusal behavior and promoting more compliant, facilitative dialogue. This makes it particularly adept at generating user-friendly and cooperative interactions.
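The model card does not spell out the mechanics, but abliteration as commonly practiced estimates a "refusal direction" in activation space and then orthogonalizes weight matrices against it, so layers can no longer write along that direction. A minimal NumPy sketch of that projection step, assuming the refusal direction `d` has already been estimated (the function name and toy data are illustrative, not from the model card):

```python
import numpy as np

def ablate_direction(W: np.ndarray, d: np.ndarray) -> np.ndarray:
    """Remove the component of W's outputs along direction d,
    so the layer can no longer write along the refusal direction."""
    d = d / np.linalg.norm(d)          # unit refusal direction
    # Orthogonalize: W' = W - d d^T W  (rank-1 projection removed)
    return W - np.outer(d, d) @ W

# Toy example: a 4x4 weight matrix and a candidate refusal direction.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
d = rng.normal(size=4)
W_ablated = ablate_direction(W, d)

# The ablated matrix has no component along d:
d_hat = d / np.linalg.norm(d)
print(np.allclose(d_hat @ W_ablated, 0))  # → True
```

In practice this projection would be applied to the relevant attention/MLP output matrices across layers; the toy 4x4 matrix here only demonstrates the linear-algebra step.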
Key Capabilities & Training:
- Refusal Inhibition: Engineered to provide compliant and facilitative responses, reducing model refusals.
- Modified DUS Merge: Utilizes a modified Depth Up-Scaling (DUS) merge with specific adjustments to 'o_proj' and 'down_proj' for enhanced efficiency and reduced perplexity.
- Balanced Training Data: Trained for 4 epochs using the rsLoRA and DoRA methods on the Aether-Lite-V1.2 dataset (~82,000 high-quality samples).
- Creative & Intelligent Output: The dataset is designed to strike a 60/40 balance between creativity/"slop" and intelligence, aiming for versatile output.
- Llama3 Prompt Format: Optimized for the Llama3 prompt format.
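Since the model expects the Llama3 prompt format, prompts should use the Llama3 special tokens and header structure. A small helper sketching a single-turn prompt (the function name is illustrative; the token layout follows the standard Llama3 chat template):

```python
def format_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Llama3 chat template."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_prompt(
    "You are a helpful assistant.",
    "Write a short greeting.",
)
print(prompt)
```

Generation should then continue from the trailing assistant header, with `<|eot_id|>` used as the stop token. Most inference stacks can apply this template automatically from the tokenizer's chat template instead of formatting it by hand.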
Dataset Highlights:
The Aether-Lite-V1.2 dataset was carefully filtered, removing phrases associated with "GPTslop" and "Claudisms," and includes diverse sources such as mrfakename/Pure-Dove-ShareGPT, mrfakename/Capybara-ShareGPT, jondurbin/airoboros-3.2, and various grimulkan datasets focusing on theory of mind, augmented data, and physical reasoning. The dataset underwent deduplication, reducing the initial 85,628 rows to 81,960 high-quality samples.
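The card does not specify the exact filtering or deduplication method, but the two-stage pipeline it describes (drop rows containing banned phrases, then remove duplicates) can be sketched as follows; the function name, banned-phrase list, and exact-match dedup strategy are assumptions for illustration:

```python
def clean_dataset(rows: list[str], banned_phrases: list[str]) -> list[str]:
    """Drop rows containing banned phrases, then deduplicate exact matches."""
    seen: set[str] = set()
    kept: list[str] = []
    for text in rows:
        # Stage 1: phrase filtering (case-insensitive substring match)
        if any(p.lower() in text.lower() for p in banned_phrases):
            continue
        # Stage 2: exact-match deduplication on normalized text
        key = text.strip().lower()
        if key in seen:
            continue
        seen.add(key)
        kept.append(text)
    return kept

rows = [
    "A helpful reply.",
    "a helpful reply.",                      # duplicate after normalization
    "As an AI language model, I cannot...",  # example "slop" phrase
]
print(clean_dataset(rows, ["as an ai language model"]))
# → ['A helpful reply.']
```

A production pipeline would more likely use fuzzy or n-gram deduplication than exact string matching, but the filter-then-dedup ordering matches the row reduction described above.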
Good For:
- Applications requiring highly compliant and non-refusal-prone AI interactions.
- Generating creative and engaging dialogue with a balanced approach to intelligence.
- Use cases where a Llama3-based model with specific response conditioning is beneficial.