Sao10K/70B-L3.3-Cirrus-x1 Overview
Sao10K/70B-L3.3-Cirrus-x1 is a 70-billion-parameter language model developed by Sao10K, distinguished by its training methodology. It uses a data composition similar to the 'Freya' model, but is trained for longer and merges its epoch checkpoints, which yields a more stable model than previous iterations.
Key Characteristics
- Training Data Composition: Utilizes a data composition similar to 'Freya', but applied differently and trained for an extended period.
- Stability: Enhanced stability achieved through longer training and the merging of multiple epoch checkpoints using techniques like dare_ties.
- Stylistic Output: Known for a distinct output style, with minor issues that are generally easy to correct.
- Prompt Format: Optimized for the Llama-3-Instruct prompt format (see the sketch after this list).
- Context Length: Supports a context length of 32,768 tokens.
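For reference, the sketch below spells out the Llama-3-Instruct prompt layout as a Python string literal; the system and user text are hypothetical placeholders, not content from the model card.

```python
# A minimal illustration of the Llama-3-Instruct prompt layout this model expects.
# The system/user text is placeholder content; substitute your own.
prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "You are a helpful assistant.<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "Write a short scene set on a rainy pier.<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)
```

In practice, the model's tokenizer should produce this layout automatically via `tokenizer.apply_chat_template`, so hand-building the string is rarely necessary.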
Recommended Usage
This model is suitable for users seeking a stable 70B model with a distinctive output style. Specific use cases are not detailed, but its general text generation capabilities make it versatile. The developer recommends the following inference settings:
- Temperature: 1.1
- min_p: 0.05
Users are encouraged to experiment with various sampling methods, though the developer's experience is limited to the specified settings.
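The following is a minimal generation sketch using Hugging Face transformers with the recommended samplers, assuming a transformers release recent enough to support min_p sampling and hardware able to host a 70B model (or a quantized variant). The chat messages are hypothetical placeholders.

```python
# Sketch: text generation with the developer-recommended samplers
# (temperature 1.1, min_p 0.05). Adjust loading options to your hardware.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Sao10K/70B-L3.3-Cirrus-x1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a Llama-3-Instruct prompt from chat messages (placeholder content).
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a short scene set on a rainy pier."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=1.1,  # recommended by the developer
    min_p=0.05,       # recommended by the developer
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The same temperature and min_p values can be set in most other inference backends that expose min_p sampling; other sampler combinations are untested by the developer.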