Dans-TotSirocco-7b: A Prototype for Interactive AI
Dans-TotSirocco-7b is a 7-billion-parameter language model from Dans-Archive, built on the Mistral-7b base. It serves as a prototype for "Dan's PersonalityEngine Mk. 2" and focuses on versatile conversational and narrative generation.
Key Capabilities
- Multipurpose Chat/Chat Instruct Hybrid: Designed to handle both general chat and specific instruction-based interactions.
- Rich Narrative Generation: Trained extensively on role-playing scenarios and text adventure games, enabling it to produce descriptive, engaging stories with vivid sensory detail.
- Instruction Following: Proficient at single-turn and multi-turn instructions, adapting its output to system messages and user prompts.
- Flexible Prompt Format: Uses the Pygmalion / Metharme prompt format, which supports system messages, user inputs, and model responses (see the example after this list).
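A prompt in this format concatenates tagged turns using the `<|system|>`, `<|user|>`, and `<|model|>` markers; the system and user content below are illustrative placeholders:

```
<|system|>Enter adventure mode. Narrate in second person with vivid detail.<|user|>You step into the abandoned lighthouse. Look around.<|model|>
```

The model's response is generated after the trailing `<|model|>` tag, and additional `<|user|>` / `<|model|>` pairs can be appended for multi-turn conversations.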
Training Details
- Base Model: Mistral-7b
- Sequence Length: 4096 tokens
- Training Method: QLoRA on 2x RTX 4090 GPUs for approximately 4 hours (a configuration sketch follows this list).
- Key Dataset: Incorporates the Skein dataset from the Kobold AI community, which is crucial for its text adventure capabilities.
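As a rough illustration of what a QLoRA setup of this kind looks like, the sketch below loads a 4-bit quantized Mistral-7b base and attaches LoRA adapters with Hugging Face transformers, peft, and bitsandbytes. The checkpoint id, adapter rank, target modules, and other hyperparameters are assumptions, not the authors' actual training configuration.

```python
# Minimal QLoRA setup sketch; hyperparameters here are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base = "mistralai/Mistral-7B-v0.1"  # assumed base checkpoint

# Load the base model in 4-bit NF4 quantization (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base)

# Attach low-rank adapters; rank, alpha, and target modules are placeholder values.
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Training then proceeds with a standard causal-language-modeling loop (e.g., the transformers Trainer) over the instruction and adventure data, within the 4096-token sequence length noted above.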
Performance Highlights
Evaluations on the Open LLM Leaderboard show an average score of 56.92, with notable results in:
- HellaSwag (10-shot): 84.23
- MMLU (5-shot): 64.19
- ARC (25-shot): 62.03
Good For
- Creative Writing & Story Generation: Ideal for applications requiring detailed and imaginative narratives, especially text-based adventures.
- Role-Playing & Conversational AI: Suitable for developing AI personalities that can engage in multi-turn dialogues and adopt specific personas.
- Instruction-Based Tasks: Effective when the model needs to follow specific commands or act as an assistant (see the usage sketch below).
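For reference, a minimal generation example with transformers might look like the following; the repository id is a placeholder and the sampling settings are assumptions rather than recommended values:

```python
# Illustrative inference sketch; substitute the actual Hugging Face repository id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Dans-Archive/Dans-TotSirocco-7b"  # placeholder repository id

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a prompt in the Pygmalion / Metharme format described above.
prompt = (
    "<|system|>You are a text adventure narrator. Keep responses vivid but concise."
    "<|user|>I push open the rusted door of the lighthouse."
    "<|model|>"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8)

# Print only the newly generated continuation, not the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```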