titan087/OpenLlama13B-Guanaco

Text generation · Model size: 13B · Quantization: FP8 · Context length: 4k · Concurrency cost: 1 · Architecture: Transformer · Published: Jun 26, 2023

titan087/OpenLlama13B-Guanaco is a 13-billion-parameter Open Llama model fine-tuned with QLoRA on the Guanaco dataset, with a 4096-token context length. The model is optimized for general conversational tasks and shows balanced performance across benchmarks, making it suitable for applications that need a capable, efficient language model for instruction following and text generation.


Model Overview

titan087/OpenLlama13B-Guanaco is a 13 billion parameter language model based on the Open Llama architecture. It has been fine-tuned using the QLoRA method on the timdettmers/openassistant-guanaco dataset, which is known for its high-quality conversational data. This fine-tuning process aims to enhance the model's instruction-following capabilities and general conversational fluency.
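Because the model was fine-tuned on openassistant-guanaco conversations, prompts generally work best when they follow that dataset's turn markers. The helper below is a minimal sketch: the `### Human:` / `### Assistant:` headers match the Guanaco data, but the exact function name and structure are illustrative, not part of the model's release.

```python
def format_guanaco_prompt(turns):
    """Format (role, text) turns in the Guanaco chat style.

    The openassistant-guanaco dataset marks turns with '### Human:' and
    '### Assistant:' headers; this template is an assumption based on
    that convention, not an official prompt spec for this model.
    """
    role_tags = {"human": "### Human:", "assistant": "### Assistant:"}
    parts = [f"{role_tags[role]} {text}" for role, text in turns]
    # End with an open assistant tag so the model completes the reply.
    return "\n".join(parts) + "\n### Assistant:"

prompt = format_guanaco_prompt([("human", "Summarize QLoRA in one sentence.")])
```

The resulting string can be passed directly to any text-generation endpoint serving this model; the trailing `### Assistant:` cues the model to produce the assistant's reply.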

Key Capabilities & Performance

The model demonstrates a balanced performance across several benchmarks, as evaluated on the Open LLM Leaderboard. Key scores include:

  • Average Score: 41.32
  • ARC (25-shot): 51.19
  • HellaSwag (10-shot): 75.24
  • MMLU (5-shot): 43.76
  • TruthfulQA (0-shot): 38.4
  • Winogrande (5-shot): 71.74

While its scores on reasoning tasks like GSM8K (2.96) and DROP (5.96) are lower, its performance on general knowledge and common sense benchmarks like HellaSwag and Winogrande is notable for its size class.
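The headline average is simply the arithmetic mean of all seven leaderboard scores listed above, including the two low reasoning scores, which a quick check confirms:

```python
# Open LLM Leaderboard scores as reported on this model card
scores = {
    "ARC": 51.19, "HellaSwag": 75.24, "MMLU": 43.76,
    "TruthfulQA": 38.40, "Winogrande": 71.74,
    "GSM8K": 2.96, "DROP": 5.96,
}

# The leaderboard average is the unweighted mean of all seven tasks.
average = round(sum(scores.values()) / len(scores), 2)  # → 41.32
```

This also shows why the average understates the model's conversational strength: the near-zero GSM8K and DROP scores pull the mean well below the knowledge and common-sense results.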

Good For

  • General-purpose conversational AI: Its fine-tuning on the Guanaco dataset makes it suitable for chatbots and interactive applications.
  • Instruction-following tasks: The model is designed to respond effectively to user instructions.
  • Resource-efficient deployment: Served with FP8 quantization, this QLoRA fine-tune can be cheaper to deploy and run than larger, unquantized models.
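One practical constraint for all of these use cases is the 4096-token context window: the prompt and the generated reply share it. The helper below is a hypothetical budgeting sketch (the names `max_new_tokens` and `reserve` are this example's, not an API of the model or any serving library):

```python
CTX_LEN = 4096  # context window from the model card

def max_new_tokens(prompt_tokens, ctx_len=CTX_LEN, reserve=0):
    """Return how many tokens remain for generation after the prompt.

    `reserve` optionally holds back tokens (e.g. for stop sequences).
    Raises if the prompt alone overflows the context window.
    """
    remaining = ctx_len - prompt_tokens - reserve
    if remaining <= 0:
        raise ValueError("prompt does not fit in the context window")
    return remaining

# A 1,000-token prompt leaves 3,096 tokens for the model's reply.
budget = max_new_tokens(1000)
```

In a chat application, this kind of check decides when older turns must be truncated or summarized before the next request.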