Dans-Archive/Dans-TotSirocco-7b

  • Task: Text generation
  • Model size: 7B
  • Quantization: FP8
  • Context length: 4k
  • License: apache-2.0
  • Architecture: Transformer
  • Open weights

Dans-Archive/Dans-TotSirocco-7b is a 7 billion parameter chat/chat instruct hybrid model based on Mistral-7b, developed by Dans-Archive. It is designed as a prototype for Dan's PersonalityEngine Mk. 2, trained on diverse one-shot and multi-round instructions, role-playing scenarios, and text adventure games. This model excels in generating engaging, descriptive narratives and handling various conversational tasks, making it suitable for interactive AI applications.


Dans-TotSirocco-7b: A Prototype for Interactive AI

Dans-TotSirocco-7b is a 7 billion parameter language model built upon the Mistral-7b architecture, developed by Dans-Archive. This model serves as a prototype for "Dan's PersonalityEngine Mk. 2," focusing on versatile conversational and narrative generation capabilities.

Key Capabilities

  • Multipurpose Chat/Chat Instruct Hybrid: Designed to handle both general chat and specific instruction-based interactions.
  • Rich Narrative Generation: Trained on role-playing scenarios and text adventure games, so it produces descriptive, engaging stories with strong sensory detail.
  • Instruction Following: Proficient in responding to one-shot and multi-round instructions, adapting its output based on system messages and user prompts.
  • Flexible Prompt Format: Utilizes the Pygmalion / Metharme prompt format, supporting various conversational flows including system messages, user inputs, and model responses.
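A single-turn prompt in this style can be assembled with a small helper. The control tokens below (`<|system|>`, `<|user|>`, `<|model|>`) follow the common Pygmalion/Metharme convention; verify them against the model card's own prompt examples before relying on them.

```python
def build_metharme_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Metharme style.

    The <|system|>, <|user|> and <|model|> control tokens follow the
    Pygmalion/Metharme convention (assumed, not quoted from this card).
    """
    return f"<|system|>{system}<|user|>{user}<|model|>"

prompt = build_metharme_prompt(
    "Enter roleplay mode. You are a seasoned dungeon master.",
    "Describe the entrance to the ruined temple.",
)
```

The model's completion is then generated from everything after the final `<|model|>` token.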

Training Details

  • Base Model: Mistral-7b
  • Sequence Length: 4096 tokens
  • Training Method: QLoRA, utilizing 2x RTX 4090 GPUs for approximately 4 hours.
  • Key Dataset: Incorporates the Skein dataset from the Kobold AI community, which is crucial for its text adventure capabilities.
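The card does not publish the exact hyperparameters, but a QLoRA run of this shape is conventionally set up with `bitsandbytes` 4-bit quantization plus a `peft` LoRA adapter. Everything below (rank, alpha, target modules) is illustrative, not the authors' recipe; only the base model and the single-GPU memory budget come from the stated details.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization lets the frozen 7B base fit on a 24 GB RTX 4090.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",  # the stated base model
    quantization_config=bnb_config,
    device_map="auto",
)

# Illustrative LoRA adapter; these values are assumptions, not the
# published training configuration.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Only the small adapter matrices are trained while the quantized base stays frozen, which is what makes a roughly four-hour run on two consumer GPUs plausible.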

Performance Highlights

Evaluations on the Open LLM Leaderboard show an average score of 56.92, with notable results in:

  • HellaSwag (10-shot): 84.23
  • MMLU (5-shot): 64.19
  • ARC (25-shot): 62.03

Good For

  • Creative Writing & Story Generation: Ideal for applications requiring detailed and imaginative narratives, especially text-based adventures.
  • Role-Playing & Conversational AI: Suitable for developing AI personalities that can engage in multi-turn dialogues and adopt specific personas.
  • Instruction-Based Tasks: Effective for scenarios where the model needs to follow specific commands or act as an assistant.
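For multi-turn role-play, earlier turns are typically replayed in the same Metharme-style framing on every call. A minimal pure-Python sketch of that bookkeeping (control token names are assumed from the Pygmalion/Metharme convention, not quoted from this card):

```python
class MetharmeChat:
    """Accumulates a Metharme-style transcript for multi-turn dialogue.

    The <|system|>, <|user|> and <|model|> tokens follow the common
    Pygmalion/Metharme convention; verify against the model card.
    """

    def __init__(self, system: str):
        self.transcript = f"<|system|>{system}"

    def prompt_for(self, user_message: str) -> str:
        """Return the full prompt to send for the next model turn."""
        return f"{self.transcript}<|user|>{user_message}<|model|>"

    def record_turn(self, user_message: str, model_reply: str) -> None:
        """Append a completed user/model exchange to the transcript."""
        self.transcript += f"<|user|>{user_message}<|model|>{model_reply}"

chat = MetharmeChat("Enter roleplay mode. You are a laconic innkeeper.")
first = chat.prompt_for("Any rooms free tonight?")
chat.record_turn("Any rooms free tonight?", "One. Top of the stairs.")
second = chat.prompt_for("How much?")
```

With a 4096-token context, long sessions will eventually need the oldest turns truncated or summarized before the transcript is replayed.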