HaroldB/LLama-2-7B

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Architecture: Transformer

HaroldB/LLama-2-7B is a 7-billion-parameter language model based on the Llama 2 architecture, fine-tuned with a 4-bit quantization configuration. The model is optimized for efficient deployment and inference, using bitsandbytes to reduce its memory footprint, and is suitable for general-purpose language tasks where computational resources are constrained.


HaroldB/LLama-2-7B Overview

This model is a 7-billion-parameter variant of the Llama 2 architecture, fine-tuned with an emphasis on efficient resource use. It was trained with bitsandbytes 4-bit quantization, using a bnb_4bit_quant_type of nf4 and a bnb_4bit_compute_dtype of float16.

Key Characteristics

  • Architecture: Llama 2 base model.
  • Parameter Count: 7 billion parameters.
  • Quantization: Trained with 4-bit quantization (load_in_4bit: True), enabling reduced memory consumption during inference.
  • Context Length: Supports a context window of 4096 tokens.
  • Framework: Utilizes PEFT (Parameter-Efficient Fine-Tuning) version 0.5.0.dev0.

Good For

  • Applications requiring a Llama 2-based model with a smaller memory footprint.
  • General language generation and understanding tasks where computational efficiency is prioritized.
  • Environments with limited GPU memory, benefiting from 4-bit quantization.
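As a back-of-envelope illustration of the last point, storing weights in 4 bits instead of float16 cuts the weights-only footprint roughly fourfold (this ignores activations, the KV cache, and quantization metadata overhead):

```python
PARAMS = 7_000_000_000  # 7B parameters

def weight_memory_gib(params: int, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return params * bits_per_param / 8 / 2**30

fp16 = weight_memory_gib(PARAMS, 16)  # ~13.0 GiB
nf4 = weight_memory_gib(PARAMS, 4)    # ~3.3 GiB
print(f"fp16: {fp16:.1f} GiB, nf4: {nf4:.1f} GiB")
```

In practice bitsandbytes stores per-block scaling factors alongside the 4-bit weights, so the real footprint sits slightly above the 4-bit figure, but still comfortably within consumer-GPU memory.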