lvkaokao/llama2-7b-hf-chat-lora

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Architecture: Transformer · Cold

The lvkaokao/llama2-7b-hf-chat-lora model is a 7-billion-parameter language model based on Llama 2. It was fine-tuned with LoRA using 4-bit (nf4) quantization and double quantization for efficient deployment, and is designed for chat applications and conversational tasks.


Model Overview

The lvkaokao/llama2-7b-hf-chat-lora model is a 7-billion-parameter language model built on the Llama 2 architecture. It has been fine-tuned specifically for chat-based applications, making it well suited to conversational AI tasks.

Key Technical Details

  • Base Model: Llama 2
  • Parameter Count: 7 billion
  • Context Length: 4096 tokens
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Quantization: 4-bit (nf4) quantization via bitsandbytes, with double quantization enabled and a bfloat16 compute dtype. This configuration reduces the memory footprint and improves inference efficiency while maintaining performance (see the loading sketch after this list).
  • Framework: PEFT (Parameter-Efficient Fine-Tuning) version 0.4.0 was used during the training process.
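As a rough illustration, the sketch below loads the LoRA adapter on top of a 4-bit quantized Llama 2 base using the configuration described above. The base checkpoint name is an assumption and should be checked against the adapter's own config; the adapter repo id comes from this model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# 4-bit nf4 quantization with double quantization and bfloat16 compute dtype,
# matching the configuration listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base_id = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint; verify against the adapter config
adapter_id = "lvkaokao/llama2-7b-hf-chat-lora"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach the LoRA adapter on top of the quantized base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
```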

Intended Use Cases

This model is primarily intended for:

  • Developing conversational agents and chatbots (a minimal generation example follows this list).
  • Applications requiring efficient, Llama 2-based chat capabilities.
  • Scenarios where the reduced memory footprint from 4-bit quantization is critical for inference.
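
Continuing from the loading sketch above, a minimal generation call might look like the following. The prompt format is illustrative only, since the exact chat template used during fine-tuning is not documented here.

```python
# Hypothetical chat-style prompt; adjust to the template the adapter was trained with.
prompt = "### Human: Explain what LoRA fine-tuning is in one paragraph.\n### Assistant:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```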