Sao10K/70B-L3.3-Cirrus-x1

  • Status: Warm
  • Visibility: Public
  • Parameters: 70B
  • Quantization: FP8
  • Context length: 32,768 tokens
  • License: llama3.3
  • Model page: Hugging Face

Overview

Sao10K/70B-L3.3-Cirrus-x1 is a 70-billion-parameter language model developed by Sao10K. It uses a data composition similar to the developer's 'Freya' model, trained for a longer duration, and merges the resulting epoch checkpoints, yielding a more stable model than previous iterations.
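
As a rough illustration of how merging epoch checkpoints can stabilize a model, below is a minimal, conceptual sketch of a dare_ties-style merge (the method noted under Key Characteristics). The helper function, density value, and toy tensors are illustrative assumptions, not the developer's actual merge pipeline.

    import torch

    def dare_ties_merge(base, checkpoints, density=0.5):
        # Compute task vectors (deltas) relative to the shared base weights
        deltas = []
        for ckpt in checkpoints:
            delta = ckpt - base
            # DARE step: randomly drop (1 - density) of each delta, rescale the rest
            mask = (torch.rand_like(delta) < density).float()
            deltas.append(delta * mask / density)
        stacked = torch.stack(deltas)
        # TIES step: keep only entries agreeing with the majority sign, then average
        majority_sign = torch.sign(stacked.sum(dim=0))
        agree = (torch.sign(stacked) == majority_sign).float()
        merged_delta = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1.0)
        return base + merged_delta

    # Toy usage: random tensors stand in for one weight matrix per epoch checkpoint
    base = torch.randn(4, 4)
    epochs = [base + 0.01 * torch.randn(4, 4) for _ in range(3)]
    merged = dare_ties_merge(base, epochs)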

Key Characteristics

  • Training Data Composition: Utilizes a data composition similar to 'Freya', but applied differently and trained for an extended period.
  • Stability: Enhanced stability achieved through longer training and the merging of multiple epoch checkpoints using techniques like dare_ties.
  • Stylistic Output: Known for a distinct output style, with minor issues that are generally easy to correct.
  • Prompt Format: Optimized for the Llama-3-Instruct prompt format (see the sketch after this list).
  • Context Length: Supports a context length of 32,768 tokens.
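
For reference, here is a minimal sketch of the Llama-3-Instruct layout the model expects. The special tokens follow the standard Llama 3 chat template; the helper name and placeholder strings are illustrative.

    def build_llama3_prompt(system: str, user: str) -> str:
        # Standard Llama-3-Instruct chat layout: header tokens wrap each role,
        # and the trailing assistant header cues the model to respond.
        return (
            "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
            f"{system}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
            f"{user}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
        )

    prompt = build_llama3_prompt("You are a helpful assistant.", "Describe a cirrus cloud.")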

Recommended Usage

This model suits users who want a stable 70B model with a distinctive output style. Specific use cases are not detailed, but its general text-generation capabilities make it versatile. The developer recommends the following inference settings:

  • Temperature: 1.1
  • min_p: 0.05

Users are encouraged to experiment with various sampling methods, though the developer's experience is limited to the specified settings.
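
As a starting point, here is a minimal sketch of passing these settings through an OpenAI-compatible chat endpoint. The base_url, API key, and min_p support via extra_body are assumptions that depend on the hosting provider.

    from openai import OpenAI

    # Hypothetical OpenAI-compatible endpoint; substitute your provider's URL and key.
    client = OpenAI(base_url="https://example.com/v1", api_key="YOUR_API_KEY")

    response = client.chat.completions.create(
        model="Sao10K/70B-L3.3-Cirrus-x1",
        messages=[{"role": "user", "content": "Write a short scene set at dawn."}],
        temperature=1.1,               # developer-recommended temperature
        extra_body={"min_p": 0.05},    # min_p is not a standard OpenAI field; many
                                       # OpenAI-compatible servers accept it this way
        max_tokens=512,
    )
    print(response.choices[0].message.content)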