sminchoi/llama-2-7b-chat-hf_guanaco-llama2-1k_230913
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Architecture: Transformer

sminchoi/llama-2-7b-chat-hf_guanaco-llama2-1k_230913 is a 7-billion-parameter language model based on the Llama 2 architecture, fine-tuned on the Guanaco dataset. The model is designed for chat-based applications and supports a 4096-token context window. Fine-tuning used bitsandbytes 4-bit quantization, making the model suitable for efficient deployment in conversational AI systems.


Model Overview

sminchoi/llama-2-7b-chat-hf_guanaco-llama2-1k_230913 is a 7-billion-parameter model built on the Llama 2 architecture. It was fine-tuned specifically for chat applications using the Guanaco dataset. The model supports a context length of 4096 tokens, enabling it to handle moderately long dialogues.
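Llama 2 chat models generally expect prompts in the `[INST] ... [/INST]` format, with an optional `<<SYS>>` system block. A minimal helper, assuming this fine-tune kept the base Llama 2 chat format (the model card does not state the prompt template explicitly):

```python
def format_llama2_chat(user_message, system_prompt=None):
    """Build a single-turn Llama 2 chat prompt ([INST] ... [/INST])."""
    if system_prompt:
        # System instructions go inside a <<SYS>> block before the user turn.
        return (
            f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
            f"{user_message} [/INST]"
        )
    return f"[INST] {user_message} [/INST]"

prompt = format_llama2_chat("What is the capital of France?")
# → "[INST] What is the capital of France? [/INST]"
```

The formatted string is what you would pass to the tokenizer before generation; multi-turn dialogues chain additional `[INST] ... [/INST]` pairs with the model's previous replies in between.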

Training Details

The training process for this model incorporated bitsandbytes 4-bit quantization. Key quantization parameters included:

  • load_in_4bit: True
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_use_double_quant: False
  • bnb_4bit_compute_dtype: float16

This quantization strategy aims to reduce memory footprint and improve inference efficiency while maintaining performance. The training also utilized PEFT (Parameter-Efficient Fine-Tuning) version 0.4.0.

Potential Use Cases

Given its chat-oriented fine-tuning and efficient quantization, this model is well-suited for:

  • Conversational AI: Developing chatbots and virtual assistants.
  • Interactive applications: Powering dialogue systems where efficient inference is beneficial.
  • Resource-constrained environments: Deploying on hardware with limited memory due to its 4-bit quantization.