Model Overview
Abinesh/Llama-2_Vicuna_LoRA-13b is a 13-billion-parameter language model built on the Llama-2 architecture. It was fine-tuned on the Vicuna dataset, a collection of user-shared conversations, which improves its conversational ability. Fine-tuning uses Low-Rank Adaptation (LoRA), a parameter-efficient method that makes the model straightforward to adapt to further downstream tasks.
Technical Specifications
This model was trained with specific quantization configurations to optimize its size and inference speed:
- Quantization Method: bitsandbytes 4-bit quantization (nf4 type).
- Double Quantization: Enabled for further memory efficiency.
- Compute Data Type: bfloat16 for numerical stability and performance.
These configurations allow the model to run effectively with reduced memory footprint while maintaining a good level of performance, making it suitable for environments with limited computational resources.
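Loading the model with this configuration via Hugging Face Transformers and bitsandbytes would look roughly like the sketch below. The quantization values come from the list above; `device_map="auto"` is an illustrative choice, not something the card specifies:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Quantization settings matching the card: 4-bit NF4, double quantization,
# bfloat16 compute dtype.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "Abinesh/Llama-2_Vicuna_LoRA-13b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available GPU memory
)
```

Downloading the 13B checkpoint still requires several gigabytes of disk and GPU memory even in 4-bit.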
Training Frameworks
The fine-tuning process utilized:
- PEFT: Version 0.4.0.dev0 for parameter-efficient fine-tuning.
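A LoRA setup with PEFT typically looks like the sketch below. The card does not publish the actual LoRA hyperparameters, so `r`, `lora_alpha`, `lora_dropout`, and `target_modules` here are illustrative assumptions, not the values used for this model:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Base model for adaptation (illustrative; this card's adapters target Llama-2 13B).
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-hf")

# Hypothetical LoRA hyperparameters -- the card does not state the ones used.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights train
```

Because only the small adapter matrices receive gradients, fine-tuning fits in far less memory than full-parameter training.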
Use Cases
This model is well-suited for:
- Conversational AI: Engaging in dialogue, answering questions, and generating human-like text.
- Resource-constrained deployments: Its 4-bit quantization makes it a viable option for applications where memory and computational power are limited.
- Further fine-tuning: Can serve as a strong base model for domain-specific adaptations due to its LoRA fine-tuning.
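The resource-constrained claim can be made concrete with back-of-the-envelope arithmetic. The sketch below estimates weight storage only, ignoring quantization constants, the KV cache, and activations:

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GiB: params * bits / 8 bytes, in binary gigabytes."""
    return n_params * bits_per_param / 8 / 2**30

n = 13e9  # 13 billion parameters
fp16 = weight_memory_gib(n, 16)  # roughly 24 GiB in half precision
nf4 = weight_memory_gib(n, 4)    # roughly 6 GiB at 4 bits per weight
print(f"fp16 weights: {fp16:.1f} GiB, nf4 weights: {nf4:.1f} GiB")
```

The roughly 4x reduction is what moves a 13B model from multi-GPU territory onto a single consumer GPU.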