Marcjoni/HyperNovaSynth-12B

Text generation · Model size: 12B · Quant: FP8 · Context length: 32k · Architecture: Transformer

Marcjoni/HyperNovaSynth-12B is a 12-billion-parameter language model created by Marcjoni through a slerp merge of Marcjoni/SuperNovaSynth-12B and yamatazen/LorablatedStock-12B. The model is stable at a 12k-token context length and supports ChatML-style prompting. It is intended for general text generation, and its recommended sampling settings can be adjusted for varied output styles.


HyperNovaSynth-12B: A Merged Language Model

HyperNovaSynth-12B is a 12-billion-parameter language model developed by Marcjoni. It is the result of a slerp (spherical linear interpolation) merge combining the base model Marcjoni/SuperNovaSynth-12B with yamatazen/LorablatedStock-12B.

Key Capabilities & Features

  • Architecture: A merged model that combines its constituent models via spherical linear interpolation (slerp).
  • Context Length: Stable performance up to 12,000 tokens, with potential for extended contexts.
  • Prompt Format: Supports the widely used ChatML style for conversational interactions (see the format sketch after this list).
  • Sampling Flexibility: Recommended sampling settings include a temperature range of 0.75 to 1.25 and a Min P of 0.035, allowing for diverse generation outputs.
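For reference, ChatML wraps each conversation turn in <|im_start|> and <|im_end|> markers. The snippet below is an illustrative sketch of that format, not an excerpt from the model card; the system message and user turn are placeholders.

```python
# Illustrative ChatML-style prompt; the system and user text are placeholders.
chatml_prompt = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Give me a one-line summary of spherical linear interpolation.<|im_end|>\n"
    "<|im_start|>assistant\n"
)
print(chatml_prompt)
```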

Technical Configuration

The merge configuration applied different interpolation factors (t) to the MLP layers (0.75) and the attention layers (0.35), with a general value of 0.55 for the remaining parameters, indicating a fine-tuned merging process (sketched below). The model uses bfloat16 as its dtype.
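The following is a minimal sketch of how a slerp merge with parameter-group-specific t values works. It is not the actual mergekit implementation used to build this model; the `slerp` and `t_for` helpers, the parameter-name matching (`"mlp"`, `"self_attn"`), and the toy state dicts are assumptions for illustration only.

```python
import torch


def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors (flattened)."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_dir = a_flat / (a_flat.norm() + eps)
    b_dir = b_flat / (b_flat.norm() + eps)
    omega = torch.arccos((a_dir * b_dir).sum().clamp(-1 + eps, 1 - eps))
    so = torch.sin(omega)
    if so.abs() < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        merged = (1 - t) * a_flat + t * b_flat
    else:
        merged = (torch.sin((1 - t) * omega) / so) * a_flat + (torch.sin(t * omega) / so) * b_flat
    return merged.reshape(a.shape).to(a.dtype)


def t_for(param_name: str) -> float:
    """Pick the interpolation factor by parameter group, per the values above."""
    if "mlp" in param_name:
        return 0.75
    if "self_attn" in param_name:
        return 0.35
    return 0.55


if __name__ == "__main__":
    # Toy state dicts standing in for the two source models' weights.
    base_sd = {
        "model.layers.0.mlp.up_proj.weight": torch.randn(8, 8),
        "model.layers.0.self_attn.q_proj.weight": torch.randn(8, 8),
    }
    other_sd = {k: torch.randn_like(v) for k, v in base_sd.items()}

    merged = {name: slerp(t_for(name), base_sd[name], other_sd[name]) for name in base_sd}
    for name, w in merged.items():
        print(name, "t =", t_for(name), "shape =", tuple(w.shape))
```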

Usage Considerations

Developers can integrate HyperNovaSynth-12B with the transformers library; the model card provides Python code examples for quick setup and text generation. The model is suitable for general text generation tasks where a 12B-parameter model with a substantial context window is beneficial.
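Below is a minimal usage sketch with transformers, not the code examples referenced above. It assumes the tokenizer ships a ChatML chat template and that your transformers release supports the `min_p` generation argument; the prompt text and sampling values (chosen from the recommended ranges) are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Marcjoni/HyperNovaSynth-12B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the model card lists bfloat16 as the dtype
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the idea behind model merging in two sentences."},
]
# Assumes the tokenizer's chat template renders ChatML-style turns.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.9,  # within the recommended 0.75-1.25 range
    min_p=0.035,      # recommended Min P; requires a recent transformers release
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```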