Model Overview
umd-zhou-lab/claude2-alpaca-13B is an instruction-tuned language model developed by the UMD Tianyi Zhou Lab. It is based on the Llama-2-13b architecture and was fine-tuned on Alpaca-style instruction data whose responses were generated by Claude 2. The approach aims to leverage Claude 2's instruction-following strengths to improve the Llama-2 base model.
Key Capabilities & Performance
This model outperforms its Llama-2-13b-chat counterpart across several benchmarks:
- Average score: 61.29 vs. 59.935
- ARC: 61.18 vs. 59.04
- HellaSwag: 84.08 vs. 81.94
- MMLU: 55.74 vs. 54.64
Training Details
The model was fine-tuned from meta-llama/Llama-2-13b using the prompt format from Stanford Alpaca. Key training hyperparameters:
- Global batch size: 128
- Learning rate: 1e-5
- Epochs: 5
- Maximum sequence length: 2048
- Weight decay: 0
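Since the model was trained with the Stanford Alpaca prompt format, inference prompts should follow the same template. A minimal helper for building it might look like the sketch below (the two template strings mirror the ones published in the Stanford Alpaca repository; the function name is ours):

```python
def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Format a prompt using the Stanford Alpaca template,
    with or without the optional input field."""
    if input_text:
        # Variant used when the task comes with additional context.
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. "
            "Write a response that appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    # Variant used when the instruction stands on its own.
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

prompt = build_alpaca_prompt(
    "Summarize the following text.",
    "Large language models are trained on broad text corpora.",
)
```

The string returned by the helper can be tokenized and passed directly to the model; generation should then be stopped (or the output truncated) at the text following "### Response:".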
Intended Use
The primary use case for this model is research on large language models and chatbots. It is designed for researchers and hobbyists in natural language processing, machine learning, and artificial intelligence who are exploring instruction-tuning methodologies and their impact on model performance.