ajibawa-2023/Scarlett-Llama-3-8B-v1.0

Parameters: 8B
Quantization: FP8
Context length: 8192
License: llama3

Scarlett-Llama-3-8B-v1.0 by ajibawa-2023 is an 8-billion-parameter Llama-3-based language model, fine-tuned to generate longer, deeper, human-like conversations across diverse topics, including philosophy, advice, jokes, and coding. This updated version addresses the repetition issues found in its predecessor and delivers engaging, coherent dialogue for conversational AI applications.

Overview

ajibawa-2023/Scarlett-Llama-3-8B-v1.0 is an 8 billion parameter language model built upon Meta's Llama-3-8B architecture. This model is a refined iteration of the earlier Scarlett-Llama-3-8B, specifically addressing and resolving repetition issues to enhance conversational flow. It was trained on a comprehensive dataset comprising over 10,000 conversation sets, each containing 10-15 turns, covering a wide array of subjects such as philosophy, advice, humor, and programming.

Key Capabilities

  • Human-like Conversation Generation: Designed to produce natural and engaging dialogues.
  • Extended Conversational Depth: Capable of maintaining longer and more profound conversations.
  • Diverse Topic Coverage: Proficient in discussing various subjects including philosophy, advice, jokes, and coding.
  • Repetition Mitigation: Improved over its previous version to reduce repetitive outputs.
  • Instruction Following: Utilizes the ChatML prompt format for clear instruction-tuned responses.
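
Since the card specifies the ChatML prompt format, a minimal sketch of assembling such a prompt is shown below. The helper name, system message, and example turns are illustrative assumptions, not part of the model card.

```python
# Minimal sketch: assembling a ChatML-formatted prompt, the format this
# model is instruction-tuned on. ChatML wraps each turn in
# <|im_start|>role ... <|im_end|> markers. Helper name and messages
# here are illustrative assumptions.
def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts as a ChatML string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    # Leave the assistant turn open so the model continues from here.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are Scarlett, a helpful assistant."},
    {"role": "user", "content": "Tell me a joke about philosophy."},
])
```

The trailing open assistant turn is what cues the model to produce its reply; the serving stack (or your client code) should stop generation at the next `<|im_end|>` token.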

Performance & Training

The model was fully fine-tuned using the Axolotl codebase on 4 x A100 80GB GPUs, completing 3 epochs in over 2 hours. On the Open LLM Leaderboard, Scarlett-Llama-3-8B-v1.0 achieved an average score of 64.92, with notable results including 83.98 on HellaSwag (10-Shot) and 66.36 on MMLU (5-Shot).
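
The card states the model was fully fine-tuned with the Axolotl codebase; a configuration along the following lines could describe such a run. Only the base model, epoch count, and ChatML format come from the card — the dataset path and all other hyperparameters are illustrative assumptions.

```yaml
# Illustrative Axolotl config sketch. Values marked "assumption" are
# not taken from the model card.
base_model: meta-llama/Meta-Llama-3-8B
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer

datasets:
  - path: ./conversations.jsonl   # hypothetical path to the ~10k conversation sets
    type: chat_template

chat_template: chatml             # matches the ChatML prompt format noted above
num_epochs: 3                     # from the card
sequence_len: 4096                # assumption
micro_batch_size: 2               # assumption
gradient_accumulation_steps: 4    # assumption
learning_rate: 2e-5               # assumption
optimizer: adamw_torch
lr_scheduler: cosine
bf16: true
```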

Good For

  • Applications requiring engaging and extended conversational AI.
  • Chatbots or virtual assistants needing to discuss diverse topics.
  • Interactive systems where natural, human-like dialogue is crucial.
  • Use cases benefiting from a model with reduced conversational repetition.