Maxtra/llama-2-7b-frestival

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4K · Architecture: Transformer · Status: Cold

Maxtra/llama-2-7b-frestival is a language model based on Llama-2-7b, developed by Maxtra. It was trained with bitsandbytes 4-bit quantization, using the nf4 quantization type and a float16 compute dtype, and with PEFT 0.4.0 for parameter-efficient fine-tuning. It is intended for scenarios that require a Llama-2-7b base model with this pre-applied quantization configuration.


Maxtra/llama-2-7b-frestival Overview

This model is a variant of the Llama-2-7b architecture, developed by Maxtra. It was fine-tuned under a specific quantization configuration to reduce memory footprint and compute cost. The training process used bitsandbytes for 4-bit quantization, employing the nf4 quantization type with float16 as the compute dtype.

Key Training Details

  • Quantization: The model was trained with load_in_4bit: True and bnb_4bit_quant_type: nf4.
  • Compute Dtype: bnb_4bit_compute_dtype was set to float16.
  • Framework: PEFT version 0.4.0 was used during the training procedure.
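The training settings above correspond to a standard bitsandbytes configuration in the Hugging Face transformers API. As a sketch, loading the model with the same quantization parameters might look like the following (the repository ID comes from this card; availability of a CUDA GPU with bitsandbytes installed is assumed):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Mirror the card's training-time quantization settings:
# load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=float16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Maxtra/llama-2-7b-frestival",
    quantization_config=bnb_config,
    device_map="auto",  # place layers automatically across available devices
)
tokenizer = AutoTokenizer.from_pretrained("Maxtra/llama-2-7b-frestival")
```

Note that Llama-2 derivatives may require accepting the upstream license and authenticating with the Hub before `from_pretrained` can download the weights.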

Potential Use Cases

This model is suitable for developers looking to leverage a Llama-2-7b base model with pre-applied 4-bit quantization, which can be beneficial for:

  • Resource-constrained environments: The 4-bit quantization can reduce memory footprint.
  • Efficient deployment: Optimized for faster inference on compatible hardware.
  • Further fine-tuning: Provides a quantized base for additional domain-specific adaptations.
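For the further fine-tuning case, a minimal PEFT sketch on top of the 4-bit-loaded model could look like this. The LoRA hyperparameters (`r`, `lora_alpha`, `target_modules`, dropout) are illustrative assumptions, not values from this card; `model` is assumed to be the quantized model loaded as shown above:

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Prepare the 4-bit model for k-bit training (casts norms, enables grads on inputs)
model = prepare_model_for_kbit_training(model)

# Illustrative LoRA settings -- tune these for your domain and hardware
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections, a common choice for Llama
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```

Training only the low-rank adapters keeps the quantized base weights frozen, which is what makes domain-specific adaptation feasible in resource-constrained environments.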