Sao10K/70B-L3.3-mhnnn-x1

Status: Warm
Visibility: Public
Parameters: 70B
Quantization: FP8
Context Length: 32768
License: llama3.3
Source: Hugging Face

Model Overview

Sao10K/70B-L3.3-mhnnn-x1 is a 70-billion-parameter language model with a 32,768-token context length, developed by Sao10K. It was trained for approximately 14 hours on a single 8xH100 node. The model uses the Llama-3-Instruct prompt format and is noted for its creative output, though this can occasionally produce 'brainfarts' (incoherent generations) that are typically resolved by regenerating the response.
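
For reference, the standard Llama-3-Instruct format wraps each conversation turn in header tokens. A minimal sketch of the expected prompt layout (the text in braces is a placeholder):

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{user message}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

```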

Key Capabilities

  • Creative Completion: Optimized for generating novels and eBooks.
  • Text Adventure Narrator: Capable of acting as a text adventure narrator when prompted with specific system instructions and a one-shot example (see the sketch after this list).
  • Amoral Assistant: Can function as an amoral or neutral assistant by including specific terms in the system prompt.
  • General Instruction Following: Handles standard assistant tasks and roleplay scenarios.
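
The card does not reproduce the exact system instructions, so the system text and one-shot turns below are hypothetical placeholders. This is a minimal sketch of assembling a narrator-style prompt with the Hugging Face transformers chat template:

```python
# Minimal sketch: building a Llama-3-Instruct prompt for the narrator use case.
# The system text and one-shot example are hypothetical placeholders, not the
# card's actual instructions.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Sao10K/70B-L3.3-mhnnn-x1")

messages = [
    {"role": "system", "content": "You are the narrator of a text adventure."},  # placeholder
    {"role": "user", "content": "look around"},                                  # one-shot example turn
    {"role": "assistant", "content": "You stand in a torchlit stone corridor."}, # one-shot example reply
    {"role": "user", "content": "open the oak door"},
]

# Render the conversation into the Llama-3-Instruct token layout shown above,
# leaving the prompt open for the assistant's next turn.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```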

Training Details

The model's training data composition is similar to 'Freya' but applied differently. It includes a mix of completion data (eBooks, novels) and chat-template data (amoral assistant, Hespera, RPG adventure instructions). Training used axolotl with LoRA adaptation (lora_r: 64, lora_alpha: 64, lora_dropout: 0.2) and enabled the Liger kernel plugin's fused implementations of RoPE, RMS norm, layer norm, and GLU activation.
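
The full axolotl config is not reproduced in this card. Purely as an illustration of the stated adapter hyperparameters, here is a minimal equivalent using PEFT; the target modules are an assumption, not taken from the source:

```python
# Minimal sketch of the stated LoRA hyperparameters expressed as a PEFT config.
# target_modules is an assumption; the original axolotl config isn't shown here.
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,              # lora_r: 64
    lora_alpha=64,     # lora_alpha: 64
    lora_dropout=0.2,  # lora_dropout: 0.2
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
```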