Overview
umd-zhou-lab/claude2-alpaca-7B is a 7-billion-parameter auto-regressive language model developed by the UMD Tianyi Zhou Lab. It is built on the Llama-2-7b architecture and instruction-tuned on claude2_alpaca, a dataset distilled from Claude 2's responses. The model is intended primarily for research on large language models and chatbot development.
Key Capabilities & Training
- Base Model: Fine-tuned from meta-llama/Llama-2-7b.
- Instruction Tuning: Uses the claude2_alpaca instruction-tuning dataset, which is distilled from Claude 2's responses.
- Training Parameters: Trained for 3 epochs with a global batch size of 128, a learning rate of 2e-5, and a maximum sequence length of 4096 tokens.
Performance
Compared to the Llama-2-7b-chat model, claude2-alpaca-7B shows improved average performance across several benchmarks. Notably, it scores higher on ARC (56.66 vs 52.9) and HellaSwag (81.17 vs 78.55), indicating stronger reasoning and common-sense capabilities. While its MMLU score is slightly lower, it remains competitive on TruthfulQA and Alpaca_Eval.
Intended Use Cases
This model is suitable for:
- Research: Exploring the effects of instruction-tuning with Claude 2-derived data on Llama-2 models.
- Chatbot Development: As a foundation for building and experimenting with conversational AI systems.
- Hobbyist Projects: For individuals interested in natural language processing and artificial intelligence experimentation.
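For experimentation along these lines, a minimal sketch of how one might prompt the model follows. The card above does not specify the model's prompt template, so this assumes the standard Alpaca instruction format commonly used with Alpaca-style fine-tunes; the template string and helper function are illustrative, not part of the official release.

```python
# Hypothetical usage sketch: assumes the standard Alpaca prompt template,
# which the model card does not confirm.

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the assumed Alpaca prompt template."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

prompt = build_prompt("Explain instruction tuning in one sentence.")

# To query the model itself (requires `transformers` and downloading the
# 7B weights), something like the following should work:
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("umd-zhou-lab/claude2-alpaca-7B")
#   model = AutoModelForCausalLM.from_pretrained("umd-zhou-lab/claude2-alpaca-7B")
#   out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=256)
```

Keeping inference prompts in the same format as the fine-tuning data generally matters for instruction-tuned models, so verify the template against the claude2_alpaca dataset before relying on it.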