Model Overview
Sao10K/70B-L3.3-mhnnn-x1 is a 70-billion-parameter language model with a 32,768-token context length, developed by Sao10K. It was trained for roughly 14 hours on an 8xH100 node. The model uses the Llama-3-Instruct prompt format and is noted for its creative output, though this can occasionally produce 'brainfarts' that are typically resolved by regenerating the response.
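For reference, the standard Llama-3-Instruct template can be assembled with a small helper like the sketch below (the helper name and example strings are illustrative, not from the model card; only the special tokens follow the Llama-3-Instruct format):

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Llama-3-Instruct format."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # Generation continues from the open assistant header.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("You are a helpful assistant.", "Hello!")
```

Most inference frontends (and `tokenizer.apply_chat_template` in Transformers) produce this layout automatically from a message list.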
Key Capabilities
- Creative Completion: Optimized for generating novels and eBooks.
- Text Adventure Narrator: Capable of acting as a text adventure narrator when prompted with specific system instructions and a one-shot example.
- Amoral Assistant: Functions as an amoral or neutral assistant when specific terms are included in the system prompt.
- General Instruction Following: Handles standard assistant tasks and roleplay scenarios.
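The narrator and amoral-assistant modes above both hinge on the system prompt plus a one-shot example. A hypothetical sketch of such a message list (the system wording and example turns are placeholders; the card does not specify the actual prompts):

```python
# Hypothetical chat-style message list for the text adventure narrator mode.
# The system instructions and one-shot turns below are illustrative only.
messages = [
    {"role": "system", "content": (
        "You are the narrator of a text adventure. Describe scenes in "
        "second person and end each turn by prompting the player to act."
    )},
    # One-shot example pair showing the expected narration style.
    {"role": "user", "content": "> look around"},
    {"role": "assistant", "content": (
        "You stand in a torchlit hall. Doors lead north and east. "
        "What do you do?"
    )},
    # Real player input begins here; the model continues as narrator.
    {"role": "user", "content": "> open the east door"},
]
```

The one-shot pair anchors the output format, which tends to matter more for narrator-style use than the exact system wording.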
Training Details
The model's training data composition is similar to 'Freya' but applied differently. It mixes completion data (eBooks, novels) with chat-template data (amoral assistant, Hespera, RPG adventure instructions). Training used axolotl with LoRA adaptation (lora_r: 64, lora_alpha: 64, lora_dropout: 0.2) and the Liger kernel plugin for RoPE, RMS norm, layer norm, and GLU activation.
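A partial axolotl config fragment matching the hyperparameters above (a sketch, not the actual training config: key names follow axolotl's conventions, the Liger option names are assumptions based on axolotl's Liger integration, and dataset/model fields are omitted):

```yaml
adapter: lora
lora_r: 64
lora_alpha: 64
lora_dropout: 0.2

plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rope: true
liger_rms_norm: true
liger_layer_norm: true
liger_glu_activation: true
```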