BobaZooba/Shurale7B-v1
Text Generation · Model Size: 7B · Quantization: FP8 · Context Length: 4k · License: apache-2.0 · Architecture: Transformer · Open Weights
Shurale7B-v1 is a 7 billion parameter open-domain dialogue model developed by BobaZooba, based on Mistral-7B-v0.1. It is specifically designed for narrative-based chit-chat conversations, capable of establishing character and situation within a dialogue. The model was trained on 1.1 million dialogues from the SODA dataset, totaling 334 million tokens, with a maximum context length of 2048 tokens.
Shurale7B-v1: Narrative-Based Chit-Chat Model
Shurale7B-v1, developed by BobaZooba, is a 7 billion parameter open-domain dialogue model built upon the Mistral-7B-v0.1 architecture. Its core differentiator is its specialization in narrative-based chit-chat, enabling it to establish a character and situation within conversations.
Key Capabilities & Training:
- Narrative-driven dialogue: Excels at maintaining context and character based on an initial narrative prompt.
- Dialogue generation: Can respond even without an explicit narrative (about 5% of training dialogues lacked one), though this usage is not recommended.
- Training data: Fine-tuned on 1.1 million dialogues from the SODA dataset.
- Context length: Trained with a maximum sequence length of 2048 tokens.
- Cost-efficient training: Trained for approximately $58 over 45 hours on 8 RTX 3090 GPUs, using QLoRA (int4), DeepSpeed Stage 2, and gradient checkpointing.
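The training setup above can be sketched as a plain-Python configuration. Only the card's stated facts (int4 QLoRA, DeepSpeed Stage 2, gradient checkpointing, 2048-token sequences, 8× RTX 3090, ~$58 over 45 hours) come from the source; the LoRA hyperparameters are illustrative assumptions, not the author's actual values.

```python
# Hedged sketch of a QLoRA fine-tuning configuration matching the card's
# stated setup. LoRA rank/alpha/dropout below are assumptions for
# illustration; everything else is stated on the card.

quantization = {
    "load_in_4bit": True,          # QLoRA: frozen base weights in int4
    "bnb_4bit_quant_type": "nf4",  # assumption: NF4, the common QLoRA choice
}

lora = {
    "r": 64,              # assumption: adapter rank
    "lora_alpha": 16,     # assumption: scaling factor
    "lora_dropout": 0.05, # assumption
}

trainer = {
    "max_seq_length": 2048,          # stated training context length
    "gradient_checkpointing": True,  # stated: trades compute for memory
    "deepspeed_stage": 2,            # stated: ZeRO Stage 2 optimizer sharding
    "num_gpus": 8,                   # stated: 8x RTX 3090
}

# Rough cost check from the card: ~$58 for 45 hours on 8 GPUs implies
# roughly $0.16 per rented GPU-hour.
cost_per_gpu_hour = 58 / (45 * trainer["num_gpus"])
print(round(cost_per_gpu_hour, 2))  # → 0.16
```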
Use Cases:
- Interactive text-based games: Demonstrated in the Tale Quest Telegram bot for dynamic AI characters.
- Contextual chatbots: Ideal for applications requiring a bot to maintain a consistent persona or follow a specific scenario.
- Dialogue simulation: Can be used to generate realistic conversations with defined characters and situations.
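For the dialogue-simulation use case, the input the model expects is a narrative followed by alternating turns. A minimal sketch of assembling such a prompt is below; the speaker labels and separators are assumptions for illustration, since the exact Shurale7B-v1 prompt format is defined by its training pipeline, not stated here.

```python
# Sketch of building a narrative-conditioned dialogue prompt.
# Speaker labels and newline separators are assumptions, not the
# model's documented format.

def build_prompt(narrative: str, turns: list[str], bot_prefix: str = "Bot:") -> str:
    """Concatenate a narrative and prior dialogue turns into one prompt.

    narrative: scene/character description that conditions the dialogue.
    turns: prior utterances, alternating between user and bot.
    """
    lines = [narrative.strip()]               # narrative establishes the scene
    for i, turn in enumerate(turns):
        speaker = "User:" if i % 2 == 0 else "Bot:"
        lines.append(f"{speaker} {turn.strip()}")
    lines.append(bot_prefix)                  # cue the model to reply next
    return "\n".join(lines)


prompt = build_prompt(
    "A weary knight rests at a tavern after a long journey.",
    ["Rough roads out there, friend?"],
)
print(prompt)
```

The resulting string would then be passed to the model for generation, with the narrative anchoring the character and situation for the rest of the conversation.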
Limitations:
- Bland and unnatural responses: Due to training on a synthetic dataset, responses can sometimes lack vibrancy.
- Short conversations: Tends to conclude dialogues quickly.
- Instruction following: Not explicitly trained for instruction following.
- Truthfulness: Not evaluated for factual accuracy; likely lags behind instruction-tuned models such as OpenAI's.