Marcjoni/HyperNovaSynth-12B
Marcjoni/HyperNovaSynth-12B is a 12-billion-parameter language model created by Marcjoni through a slerp merge of Marcjoni/SuperNovaSynth-12B and yamatazen/LorablatedStock-12B. The model is stable at a 12k-token context length, supports ChatML-style prompting, and is designed for general text generation, with flexible sampling settings for varied output styles.
HyperNovaSynth-12B: A Merged Language Model
HyperNovaSynth-12B is a 12-billion-parameter language model developed by Marcjoni. It is the result of a slerp (spherical linear interpolation) merge combining the base model Marcjoni/SuperNovaSynth-12B with yamatazen/LorablatedStock-12B.
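The core idea of a slerp merge can be sketched in a few lines: the two models' weight tensors are treated as points on a hypersphere and interpolated along the great-circle arc rather than the straight chord, which tends to preserve weight norms better than plain averaging. A minimal, framework-free sketch (operating on flat lists rather than real model tensors):

```python
import math

def slerp(a, b, t):
    # Spherical linear interpolation between two flattened weight vectors.
    # t=0 returns a, t=1 returns b; intermediate t follows the arc.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    cos_omega = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    omega = math.acos(cos_omega)
    if omega < 1e-8:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    wa = math.sin((1 - t) * omega) / math.sin(omega)
    wb = math.sin(t * omega) / math.sin(omega)
    return [wa * x + wb * y for x, y in zip(a, b)]
```

Real merge tooling applies this per-tensor across the two checkpoints; the function above only illustrates the interpolation itself.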
Key Capabilities & Features
- Architecture: A merged model that blends the weights of its two constituent models via slerp, rather than training from scratch.
- Context Length: Stable up to 12,000 tokens; longer contexts may work but are not guaranteed.
- Prompt Format: Supports the widely used ChatML style for conversational interactions.
- Sampling Flexibility: Recommended settings are a temperature of 0.75 to 1.25 and a min_p of 0.035, allowing for diverse generation outputs.
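The ChatML format mentioned above can be assembled by hand when not using a tokenizer's built-in chat template. A minimal sketch, assuming the standard ChatML special tokens (`<|im_start|>`, `<|im_end|>`), with the sampling values mirroring the recommendations above:

```python
def to_chatml(messages):
    # Render a list of {"role", "content"} dicts in ChatML format:
    # each turn is wrapped in <|im_start|>role ... <|im_end|> markers,
    # and the prompt ends with an open assistant turn for generation.
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    return prompt + "<|im_start|>assistant\n"

# Sampling settings recommended on the model card
# (temperature anywhere in 0.75-1.25).
SAMPLING = {"temperature": 1.0, "min_p": 0.035}

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize slerp merging in one sentence."},
])
```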
Technical Configuration
The merge configuration applies a different interpolation factor (t) to each layer type: 0.75 for the MLP layers, 0.35 for the attention layers, and 0.55 for all other parameters, indicating a fine-tuned merging process. The model uses bfloat16 as its dtype.
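An illustrative mergekit-style configuration for these parameters might look like the following. This is a sketch reconstructed from the values stated above, not the author's published config: the filter names (`self_attn`, `mlp`) and the choice of base model are assumptions, and layer ranges are omitted.

```yaml
# Illustrative slerp merge config (reconstructed, not the original)
slices:
  - sources:
      - model: Marcjoni/SuperNovaSynth-12B
      - model: yamatazen/LorablatedStock-12B
merge_method: slerp
base_model: Marcjoni/SuperNovaSynth-12B
parameters:
  t:
    - filter: self_attn   # attention layers
      value: 0.35
    - filter: mlp         # MLP layers
      value: 0.75
    - value: 0.55         # everything else
dtype: bfloat16
```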
Usage Considerations
Developers can integrate HyperNovaSynth-12B using the transformers library; Python examples are provided for quick setup and text generation. The model is suitable for general text generation tasks that benefit from a 12B-parameter model with a substantial context window.
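A minimal loading-and-generation sketch with transformers might look like this. It assumes the standard `AutoModelForCausalLM` API, that the repository ships a ChatML chat template, and a transformers version recent enough to support the `min_p` generation parameter:

```python
# Illustrative sketch: loading and sampling from HyperNovaSynth-12B.
MODEL_ID = "Marcjoni/HyperNovaSynth-12B"

if __name__ == "__main__":
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the merge's dtype
        device_map="auto",
    )

    messages = [{"role": "user", "content": "Explain model merging briefly."}]
    # apply_chat_template renders the ChatML prompt the model expects.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(
        inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=1.0,  # recommended range: 0.75-1.25
        min_p=0.035,      # recommended min_p from the model card
    )
    print(tokenizer.decode(output[0][inputs.shape[-1]:],
                           skip_special_tokens=True))
```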