YeungNLP/firefly-llama-13b

Text Generation · Model size: 13B · Quantization: FP8 · Context length: 4K · Published: Jul 13, 2023 · Architecture: Transformer · Concurrency cost: 1

YeungNLP/firefly-llama-13b is a 13 billion parameter Llama-based language model developed by YeungNLP, fine-tuned with the QLoRA method on the UltraChat dataset, which comprises approximately 1.4 million multi-turn dialogues. The model is notable for its efficient training, requiring only a single GPU, and achieves competitive results on the Open LLM Leaderboard, closely matching models such as Vicuna-13B and Llama-2-13B-chat.


Overview

YeungNLP/firefly-llama-13b is a 13 billion parameter language model built on the Llama architecture. It was instruction-tuned with the QLoRA method on the extensive UltraChat dataset, which contains around 1.4 million multi-turn dialogues. A key advantage of this model is its resource-efficient training: it can be fine-tuned on a single GPU with as little as 16GB of VRAM, a significant reduction compared to full-parameter fine-tuning approaches such as the one used for Vicuna-13B.
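The single-GPU QLoRA setup described above can be sketched with the Hugging Face transformers and peft libraries. This is a minimal sketch only: the specific hyperparameters (LoRA rank, alpha, dropout, target modules) are illustrative assumptions, not the values YeungNLP actually used.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization of the frozen base model -- this is what lets a
# 13B model fit in roughly 16GB of VRAM during fine-tuning.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Low-rank adapters are the only trainable parameters; the rank and
# target modules below are illustrative, not the repo's actual settings.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)

# bnb_config would be passed to AutoModelForCausalLM.from_pretrained(...)
# as quantization_config, and lora_config to peft.get_peft_model(...).
```

The key design point is that the quantized base weights stay frozen; only the small LoRA adapter matrices receive gradients, which is why the memory footprint stays far below full-parameter fine-tuning.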

Performance

The model has been evaluated on the 🤗 Hugging Face Open LLM Leaderboard, demonstrating strong performance relative to its peers. It achieved an average score of 59.4, slightly outperforming vicuna-13b-1.1 and closely trailing Llama-2-13b-chat and vicuna-13b-v1.3. Its scores across the individual benchmarks, ARC, HellaSwag, MMLU, and TruthfulQA (MC), indicate solid general conversational and reasoning capabilities.

Model                 Average  ARC   HellaSwag  MMLU  TruthfulQA (MC)
Llama-2-13b-chat-hf   59.9     59.0  81.9       54.6  44.1
firefly-llama-13b     59.4     59.0  79.7       49.1  49.6
vicuna-13b-1.1        59.2     52.7  80.1       51.9  52.1
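As a sanity check, the Average column is simply the mean of the four benchmark scores, rounded to one decimal place. A few lines of Python confirm this for the table above:

```python
# Benchmark scores from the table: ARC, HellaSwag, MMLU, TruthfulQA (MC),
# paired with the reported leaderboard average.
scores = {
    "Llama-2-13b-chat-hf": ([59.0, 81.9, 54.6, 44.1], 59.9),
    "firefly-llama-13b":   ([59.0, 79.7, 49.1, 49.6], 59.4),
    "vicuna-13b-1.1":      ([52.7, 80.1, 51.9, 52.1], 59.2),
}

def mean_matches(benchmarks, reported, tol=0.051):
    """True if the reported average equals the mean to within rounding."""
    return abs(sum(benchmarks) / len(benchmarks) - reported) <= tol

# Every reported average in the table is the mean of its four scores.
assert all(mean_matches(b, avg) for b, avg in scores.values())
```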

Key Differentiators

  • Efficient Fine-tuning: Utilizes QLoRA, enabling fine-tuning of a 13B model with minimal hardware (e.g., 16GB VRAM).
  • Competitive Performance: Achieves benchmark scores comparable to larger or more resource-intensive models like Vicuna-13B and Llama-2-13B-chat.
  • Instruction-tuned: Optimized for multi-turn conversational tasks through training on the UltraChat dataset.

Use Cases

This model is suitable for applications requiring a capable 13B language model that can be deployed or further fine-tuned with limited computational resources. Its instruction-tuned nature makes it well-suited for conversational AI, chatbots, and general-purpose language generation tasks where performance close to leading 13B models is desired without the high training overhead.
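For conversational deployment, multi-turn history must be flattened into a single prompt string before generation. The helper below is a generic sketch of that step: the Human/Assistant layout is an assumed convention for illustration, not the documented chat template of firefly-llama-13b, so check the repository's actual format before deploying.

```python
def build_chat_prompt(turns, reply_prefix="Assistant:"):
    """Flatten (role, text) turns into a single prompt string.

    NOTE: this Human/Assistant layout is a generic convention, not the
    documented template for firefly-llama-13b.
    """
    lines = [f"{role}: {text}" for role, text in turns]
    lines.append(reply_prefix)  # cue the model to generate the next reply
    return "\n".join(lines)

prompt = build_chat_prompt([
    ("Human", "What is QLoRA?"),
    ("Assistant", "A 4-bit quantized LoRA fine-tuning method."),
    ("Human", "Why does that help?"),
])
```

The resulting string would be passed to the tokenizer and model's generate call, with generation stopped when the model emits the next "Human:" marker.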