flammenai/Mahou-1.3-gemma2-9B

TEXT GENERATIONConcurrency Cost:1Model Size:9BQuant:FP8Ctx Length:16kPublished:Jul 2, 2024License:gemmaArchitecture:Transformer0.0K Cold

Mahou-1.3-gemma2-9B is a 9 billion parameter language model developed by flammenai, based on the Gemma 2 architecture. It is specifically fine-tuned for conversational contexts, excelling at generating short messages, casual conversation, and character roleplay. This model is optimized for interactive chat applications requiring expressive and character-driven responses.

Loading preview...

Mahou-1.3-gemma2-9B: Conversational and Roleplay Optimized

Mahou-1.3-gemma2-9B is a 9 billion parameter model developed by flammenai, specifically engineered for dynamic and engaging conversational interactions. It is built upon the Gemma 2 architecture and has been fine-tuned to deliver concise, character-rich responses, making it particularly adept at casual conversation and character roleplay scenarios.

Key Capabilities

  • Conversational Fluency: Designed to produce short, natural-sounding messages in chat environments.
  • Character Roleplay: Excels at adopting and maintaining character personas, utilizing a specific format for speech (without quotes) and actions (in asterisks).
  • ChatML Format: Trained to use the ChatML format, ensuring compatibility with modern chat interfaces and providing clear structure for system, character, and user messages.

Training Details

The model underwent 3 epochs of fine-tuning on an A100 GPU using Google Colab, leveraging techniques similar to those described in the "Fine-tune Llama 3 with ORPO" methodology by Maxime Labonne.

Recommended Use Cases

  • Interactive Chatbots: Ideal for applications requiring engaging and personality-driven conversational agents.
  • Roleplay Simulations: Suited for generating character dialogue and actions in interactive storytelling or virtual assistant roles.
  • SillyTavern Integration: Optimized for use with SillyTavern, with specific settings and a dedicated preset provided for seamless integration and enhanced performance in roleplay contexts.