BobaZooba/Shurale7B-v1

Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4k · License: apache-2.0 · Architecture: Transformer

Shurale7B-v1 is a 7 billion parameter open-domain dialogue model developed by BobaZooba, based on Mistral-7B-v0.1. It is specifically designed for narrative-based chit-chat conversations, capable of establishing character and situation within a dialogue. The model was trained on 1.1 million dialogues from the SODA dataset, totaling 334 million tokens, with a maximum context length of 2048 tokens.


Shurale7B-v1: Narrative-Based Chit-Chat Model

Built on Mistral-7B-v0.1, Shurale7B-v1's core differentiator is its specialization in narrative-based chit-chat: given an opening narrative, it establishes a character and situation and carries the conversation forward from there.

Key Capabilities & Training:

  • Narrative-driven dialogue: Excels at maintaining context and character based on an initial narrative prompt.
  • Dialogue without narrative: In 5% of training dialogues the narrative was omitted, so the model can also respond without one, though this usage is not recommended.
  • Training data: Fine-tuned on 1.1 million dialogues from the SODA dataset.
  • Context length: Trained with a maximum sequence length of 2048 tokens.
  • Cost-efficient training: Trained in roughly 45 hours on 8 RTX 3090 GPUs for approximately $58, using QLoRA (int4), DeepSpeed Stage 2, and gradient checkpointing.
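The narrative-plus-turns setup described above can be sketched as a simple prompt builder. The separator and `Speaker:` labeling below are illustrative assumptions, not the model's documented template; check the Shurale7B-v1 repository for the exact format before wiring this into inference:

```python
def build_prompt(narrative: str, turns: list[tuple[str, str]]) -> str:
    """Assemble a narrative-based chit-chat prompt.

    narrative: short scene/character setup (may be empty, though the
               model card recommends always providing one).
    turns:     (speaker, utterance) pairs in conversation order.

    NOTE: the separator and speaker formatting here are assumptions for
    illustration; consult the model repository for the real template.
    """
    lines = []
    if narrative:
        lines.append(narrative.strip())
    for speaker, utterance in turns:
        lines.append(f"{speaker}: {utterance.strip()}")
    return "\n".join(lines)


prompt = build_prompt(
    "A tired knight rests at a village inn after a long journey.",
    [("Innkeeper", "Rough night on the road?"),
     ("Knight", "Rougher than most. Is there a room left?")],
)
print(prompt)
```

The resulting string would then be tokenized and passed to the model as a single context, with the narrative acting as the persona anchor for every subsequent turn.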

Use Cases:

  • Interactive text-based games: Demonstrated in the Tale Quest Telegram bot for dynamic AI characters.
  • Contextual chatbots: Ideal for applications requiring a bot to maintain a consistent persona or follow a specific scenario.
  • Dialogue simulation: Can be used to generate realistic conversations with defined characters and situations.
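For the contextual-chatbot use case, one practical concern is the 2048-token training context: long sessions must trim old turns while always keeping the narrative, since that is what anchors the persona. A minimal sketch of such a buffer is below; the word-count budget is a stand-in for a true token count, which in practice you would compute with the model's tokenizer:

```python
class DialogueState:
    """Keep a persona narrative plus a sliding window of recent turns.

    The word-count budget approximates a token budget purely for
    illustration; a real integration should count tokens with the
    Shurale7B-v1 tokenizer against its 2048-token context.
    """

    def __init__(self, narrative: str, budget_words: int = 1500):
        self.narrative = narrative
        self.budget_words = budget_words
        self.turns: list[str] = []

    def add_turn(self, speaker: str, text: str) -> None:
        self.turns.append(f"{speaker}: {text}")
        # Drop the oldest turns until the rendered prompt fits the budget,
        # but never drop the narrative or the most recent turn.
        while (len(self.render().split()) > self.budget_words
               and len(self.turns) > 1):
            self.turns.pop(0)

    def render(self) -> str:
        return "\n".join([self.narrative, *self.turns])
```

Trimming from the front keeps the character definition and the freshest exchange intact, which matches the model's strength: staying in character relative to the initial narrative rather than recalling distant turns.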

Limitations:

  • Bland and unnatural responses: Due to training on a synthetic dataset, responses can sometimes lack vibrancy.
  • Short conversations: Tends to conclude dialogues quickly.
  • Instruction following: Not explicitly trained for instruction following.
  • Truthfulness: Not evaluated for factual accuracy; likely lags behind models such as those from OpenAI.