Nohobby/MS3-Tantum-24B-v0.1 Overview
Nohobby/MS3-Tantum-24B-v0.1 is a 24-billion-parameter merged language model built on the Mistral-Small-24B architecture. Developed by Nohobby, it is designed with a focus on prose generation, character adherence, and roleplay.
Key Capabilities and Features
- Character Adherence and Roleplay: The model is noted for maintaining consistent character traits and excelling in roleplay contexts; a related model, MS3-RP-Broth-24B, is also available.
- Internal Monologue with `<think>` Tags: A distinctive feature is its consistent use of `<think>` tags for internal monologues when prompted, offering a way to simulate a character's thoughts.
- Merge Architecture: Tantum is the result of a complex multi-step merge process using a custom tool called `shardmerge`, combining various Mistral-Small-24B-based models and other specialized models such as Undi95/MistralThinker-e2 and arcee-ai/Arcee-Blitz to achieve its specific characteristics.
- Prompt Format Flexibility: Although the merge process involved Mistral-V7, the creator suggests that the ChatML and Llama3 prompt formats may yield better results.
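Since the creator suggests ChatML as a prompt format, a minimal sketch of rendering a conversation into ChatML may be useful. This is a generic illustration of the ChatML turn markers (`<|im_start|>role ... <|im_end|>`), not the model's documented chat template; the character and messages are invented for the example.

```python
def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts as a ChatML string.

    ChatML wraps each turn in <|im_start|>role ... <|im_end|> markers.
    """
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

# Hypothetical roleplay conversation for illustration.
prompt = build_chatml_prompt([
    {"role": "system", "content": "You are Elara, a sardonic ship's navigator."},
    {"role": "user", "content": "Do you trust the new captain?"},
])
print(prompt)
```

In practice, a library such as `transformers` can apply a model's bundled chat template instead of hand-rolling the format; the sketch above only shows what the resulting ChatML text looks like.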
Recommended Use Cases
- Creative Writing: Particularly for generating prose and narratives where character consistency is crucial.
- Roleplay Scenarios: Ideal for applications requiring detailed and immersive roleplay interactions.
- Simulating Internal Thought Processes: Leveraging the `<think>` tag for applications that benefit from explicit internal monologues.
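For applications that surface the visible reply separately from the internal monologue, the `<think>` spans can be split out of the raw model output with a small parser. This is a sketch under the assumption that the model emits well-formed `<think>...</think>` pairs; the sample output string is invented for illustration.

```python
import re

def split_think(output: str):
    """Separate <think>...</think> monologue spans from the visible reply."""
    thoughts = re.findall(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    visible = re.sub(r"<think>.*?</think>", "", output, flags=re.DOTALL).strip()
    return [t.strip() for t in thoughts], visible

# Hypothetical model output for illustration.
sample = "<think>She is hiding something.</think>\nTrust is earned, Captain."
thoughts, reply = split_think(sample)
print(thoughts)  # ['She is hiding something.']
print(reply)     # Trust is earned, Captain.
```

A UI could render `thoughts` in a collapsible panel while showing only `reply` in the chat stream.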