ajibawa-2023/Young-Children-Storyteller-Mistral-7B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Apr 4, 2024 · License: apache-2.0 · Architecture: Transformer

ajibawa-2023/Young-Children-Storyteller-Mistral-7B is a 7-billion-parameter language model fine-tuned by ajibawa-2023 from Mistral-7B-v0.1. It is designed specifically for generating educational stories for young children aged 6 to 12, leveraging a dataset of over 0.9 million curated children's stories. The model aims to foster creativity, empathy, and critical thinking through age-appropriate narrative generation, making it suitable for applications that require child-friendly content.


Model Overview

ajibawa-2023/Young-Children-Storyteller-Mistral-7B is a 7-billion-parameter language model fine-tuned from Mistral-7B-v0.1. Its primary purpose is to generate engaging and educational stories for young children aged 6 to 12. The model was trained on a dedicated dataset, "Children-Stories-Collection," comprising over 0.9 million stories tailored to this age group.

Key Capabilities

  • Specialized Story Generation: Designed to create narratives that foster creativity, empathy, and critical thinking in young children.
  • Age-Appropriate Content: Utilizes a meticulously curated dataset to ensure stories are suitable for children aged 6 to 12.
  • Educational Focus: Aims to serve as a companion for discovery and learning through storytelling.
  • Mistral-7B Base: Built upon the robust Mistral-7B-v0.1 architecture, providing a strong foundation for language understanding and generation.
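
For orientation, here is a minimal sketch of story generation with the Hugging Face transformers library. The model ID comes from this page; the prompt wording and sampling values are illustrative, and the model's exact prompt template may differ from what is shown here.

```python
# Minimal sketch: story generation with Hugging Face transformers.
# The prompt text and sampling values are illustrative; check the
# model card for the exact prompt template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ajibawa-2023/Young-Children-Storyteller-Mistral-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision: ~14 GB of weights for a 7B model
    device_map="auto",
)

prompt = "Write a short bedtime story about a curious fox who learns to share."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,  # placeholder sampling values, not tuned settings
    top_p=0.9,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```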

Training Details

The model was trained for 3 epochs over 30 hours on 4 x A100 80GB GPUs using the Axolotl training framework. A QLoRA version of the model is also available, and GGUF quantizations are provided by MarsupialAI.
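
Because GGUF quantizations exist, the model can also run locally with llama.cpp-compatible tooling. Below is a minimal sketch using llama-cpp-python; the file name is a hypothetical placeholder for whichever of MarsupialAI's GGUF files you download.

```python
# Minimal sketch: local inference on a GGUF quantization via llama-cpp-python.
# The model_path is a hypothetical placeholder; substitute the GGUF file
# downloaded from MarsupialAI's quantization repo.
from llama_cpp import Llama

llm = Llama(
    model_path="Young-Children-Storyteller-Mistral-7B.Q4_K_M.gguf",  # placeholder name
    n_ctx=8192,       # matches the model's 8k context length
    n_gpu_layers=-1,  # offload all layers to GPU if available; use 0 for CPU-only
)

result = llm(
    "Tell a story for a 7-year-old about a turtle who wants to fly.",
    max_tokens=400,
    temperature=0.7,  # illustrative value
)
print(result["choices"][0]["text"])
```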

Performance

Evaluations on the Open LLM Leaderboard show an average score of 71.08, with specific metrics including:

  • AI2 Reasoning Challenge (25-shot): 68.69
  • HellaSwag (10-shot): 84.67
  • MMLU (5-shot): 64.11
  • TruthfulQA (0-shot): 62.62
  • Winogrande (5-shot): 81.22
  • GSM8k (5-shot): 65.20

Recommended Use Cases

This model is ideal for applications requiring the generation of imaginative and educational stories for young audiences, such as interactive storytelling apps, educational platforms, or content creation tools for children's literature.

Popular Sampler Settings

Featherless users most commonly tune the following sampler parameters for this model: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
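
As an illustration of how such settings are applied in practice, here is a sketch using Featherless's OpenAI-compatible API. All parameter values are placeholders (the actual user configurations are not reproduced above), and the extra_body fields assume the endpoint accepts those non-OpenAI extensions.

```python
# Sketch: applying sampler settings through an OpenAI-compatible endpoint.
# All values are illustrative placeholders, not actual user configs.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # Featherless's OpenAI-compatible API
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="ajibawa-2023/Young-Children-Storyteller-Mistral-7B",
    messages=[{"role": "user", "content": "Tell a story about a brave little boat."}],
    temperature=0.8,  # placeholder values throughout
    top_p=0.95,
    frequency_penalty=0.2,
    presence_penalty=0.2,
    # Non-OpenAI parameters, passed through only if the endpoint supports them:
    extra_body={"top_k": 40, "min_p": 0.05, "repetition_penalty": 1.1},
)
print(response.choices[0].message.content)
```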