grimjim/Magrathic-12B
Magrathic-12B is a 12 billion parameter language model developed by grimjim, created through a Task Arithmetic merge of several base models, including grimjim/mistralai-Mistral-Nemo-Base-2407 and grimjim/Magnolia-Mell-v1-12B. It incorporates cgato/Nemo-12b-Humanize-SFT-v0.2.5-KTO at a low weight to enhance human-like text generation. This model is designed for general text generation tasks, with a particular focus on producing more humanized and natural language outputs, and supports a 32768 token context length.
Loading preview...
Magrathic-12B Overview
Magrathic-12B is a 12 billion parameter language model developed by grimjim, built upon the Mistral-Nemo-Base-2407 architecture. It was created using the Task Arithmetic merge method, combining several pre-trained models to achieve its specific characteristics. The primary goal of this merge was to enhance the model's ability to generate text that is more human-like and natural in its phrasing.
Key Capabilities
- Humanized Text Generation: A small weight from
cgato/Nemo-12b-Humanize-SFT-v0.2.5-KTOwas specifically included to perturb outputs towards more human-like text. - Merged Architecture: Integrates capabilities from
grimjim/Magnolia-Mell-v1-12Bandgrimjim/mistralai-Mistral-Nemo-Base-2407. - Context Length: Supports a substantial context window of 32768 tokens, allowing for processing longer inputs and generating coherent extended responses.
- Prompt Format Compatibility: Utilizes the prompt format established by the original Nemo Instruct 2407 12B model.
When to Use Magrathic-12B
This model is particularly well-suited for applications requiring:
- General Text Generation: Creating diverse and coherent text across various topics.
- Enhanced Naturalness: Scenarios where the output needs to sound more conversational and less robotic.
- Creative Writing: Generating stories, dialogues, or other creative content where human-like expression is valued.
- Long-form Content: Leveraging its large context window for tasks involving extensive input or output.