wang7776/vicuna-7b-v1.3-sparsity-20

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4K · Published: Jan 15, 2024 · Architecture: Transformer

The wang7776/vicuna-7b-v1.3-sparsity-20 model is a 7-billion-parameter auto-regressive language model based on the LLaMA architecture and fine-tuned by LMSYS. This version has been pruned to 20% sparsity with the Wanda method, which zeroes out low-importance weights without any retraining while maintaining competitive performance. It is intended primarily for research and hobbyist use in natural language processing and chatbot development, and excels as a chat assistant.


Overview

This model, wang7776/vicuna-7b-v1.3-sparsity-20, is a 7-billion-parameter variant of the Vicuna v1.3 chat assistant, developed by LMSYS. It was fine-tuned from LLaMA via supervised instruction fine-tuning on approximately 125K conversations collected from ShareGPT.com.

Key Differentiator: Sparsity

What sets this model apart is its 20% sparsity, achieved through the Wanda pruning method. This technique allows for significant model compression without requiring retraining or weight updates, aiming to preserve performance while reducing computational overhead.
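The Wanda criterion can be sketched in a few lines of NumPy. The idea, as described in the Wanda paper, is to score each weight by its magnitude times the L2 norm of the corresponding input activation, then zero the lowest-scoring weights within each output row. This is an illustrative sketch (the `wanda_prune` helper and calibration setup are ours, not from this model's release), assuming a dense weight matrix and a small batch of calibration activations:

```python
import numpy as np

def wanda_prune(W, X, sparsity=0.2):
    """Zero the lowest-scoring weights per output row (Wanda-style sketch).

    Score S_ij = |W_ij| * ||X_j||_2: weight magnitude scaled by the
    activation norm of the input feature it multiplies.
    W: (out_features, in_features) weight matrix.
    X: (n_samples, in_features) calibration activations.
    """
    norms = np.linalg.norm(X, axis=0)      # ||X_j||_2 per input feature
    scores = np.abs(W) * norms             # broadcasts over output rows
    k = int(W.shape[1] * sparsity)         # weights to drop per row
    if k == 0:
        return W.copy()
    # indices of the k smallest scores in each row (unordered partition)
    idx = np.argpartition(scores, k - 1, axis=1)[:, :k]
    W_pruned = W.copy()
    np.put_along_axis(W_pruned, idx, 0.0, axis=1)
    return W_pruned

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 10))               # toy layer: 4 outputs, 10 inputs
X = rng.normal(size=(32, 10))              # toy calibration batch
W20 = wanda_prune(W, X, sparsity=0.2)      # 20% of each row zeroed
```

Because no retraining is involved, the pruned matrix can be swapped in for the dense one directly; the per-row comparison group mirrors how Wanda preserves each output unit's most influential connections.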

Capabilities & Use Cases

  • Chat Assistant: Designed to function as a conversational AI, fine-tuned on real user-shared dialogues.
  • Research & Development: Primarily intended for researchers and hobbyists exploring large language models and chatbot technologies.
  • Efficient Deployment: The pruned nature of this model makes it potentially more efficient for deployment in resource-constrained environments compared to its dense counterpart, while still offering competitive performance.
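As a back-of-the-envelope check on the efficiency claim, 20% unstructured sparsity leaves 80% of the 7B weights non-zero. Note that zeroed weights only translate into memory or latency savings when a sparse storage format or sparsity-aware kernels are used; a dense checkpoint stays the same size on disk. The figures below are illustrative arithmetic, not published measurements:

```python
# Rough weight-budget arithmetic for a 7B model at 20% sparsity.
total_params = 7_000_000_000
sparsity = 0.20

nonzero_params = int(total_params * (1 - sparsity))
bytes_per_param_fp16 = 2  # FP16 for illustration; this deployment lists FP8
dense_gb = total_params * bytes_per_param_fp16 / 1e9
nonzero_gb = nonzero_params * bytes_per_param_fp16 / 1e9

print(f"non-zero parameters: {nonzero_params:,}")   # 5,600,000,000
print(f"dense FP16 weights:  {dense_gb:.1f} GB")    # 14.0 GB
print(f"non-zero FP16 data:  {nonzero_gb:.1f} GB")  # 11.2 GB
```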

Getting Started

Users can interact with the model via command-line interfaces or through OpenAI-compatible and Hugging Face APIs. Further details on its evaluation, including standard benchmarks, human preference studies, and LLM-as-a-judge methods, are available in the associated paper and leaderboard.
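A minimal Hugging Face `transformers` loading sketch is shown below. The `build_vicuna_prompt` helper is our own illustration of the USER/ASSISTANT conversation template used by Vicuna v1.x models; check the Vicuna documentation for the exact template this checkpoint expects. Model loading is kept behind the `__main__` guard so the prompt helper can be used without downloading the 7B checkpoint:

```python
def build_vicuna_prompt(user_message: str) -> str:
    # Vicuna v1.x-style template: system preamble, then USER/ASSISTANT turns.
    system = ("A chat between a curious user and an artificial intelligence "
              "assistant. The assistant gives helpful, detailed, and polite "
              "answers to the user's questions.")
    return f"{system} USER: {user_message} ASSISTANT:"

if __name__ == "__main__":
    # Heavy part: downloads and runs the actual checkpoint.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "wang7776/vicuna-7b-v1.3-sparsity-20"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(build_vicuna_prompt("What is weight pruning?"),
                       return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The same model id can also be served behind an OpenAI-compatible endpoint (e.g. via FastChat or vLLM), in which case the serving layer applies the conversation template for you.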