henrikclh/llama-2-7b-Arch1
Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Ctx length: 4k · Architecture: Transformer · Cold
The henrikclh/llama-2-7b-Arch1 model is a 7 billion parameter language model based on the Llama 2 architecture. It was trained with 4-bit quantization via the bitsandbytes library, using nf4 quantization and a float16 compute dtype. This model suits general language generation tasks where efficient resource usage during training is a priority.
Model Overview
The henrikclh/llama-2-7b-Arch1 is a 7 billion parameter model built upon the Llama 2 architecture. Its training process leveraged bitsandbytes for efficient quantization, specifically 4-bit nf4 quantization with float16 computation, which reduces the memory footprint during training.
Key Training Details
- Quantization: utilizes `bitsandbytes` with `load_in_4bit: True` and `bnb_4bit_quant_type: nf4`.
- Compute dtype: `bnb_4bit_compute_dtype` was set to `float16`.
- Framework: trained with PEFT version 0.4.0.
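The quantization settings above can be sketched as a `BitsAndBytesConfig` for the Hugging Face `transformers` library. This is a minimal sketch, not the author's actual training script: only the parameter values (`load_in_4bit`, `nf4`, `float16`) come from the card; the loading code around them is an assumption.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Quantization config matching the values stated in the card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Hypothetical usage: load the model with this config applied.
# Requires the checkpoint weights and a GPU with bitsandbytes installed.
model = AutoModelForCausalLM.from_pretrained(
    "henrikclh/llama-2-7b-Arch1",
    quantization_config=bnb_config,
    device_map="auto",
)
```

With this config, weights are stored in 4-bit nf4 while matrix multiplications are performed in float16, which is what cuts memory use without changing the compute path.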
Good For
- Experimenting with Llama 2 based models that have undergone 4-bit quantization during training.
- Use cases where the training methodology, particularly the `bitsandbytes` configuration, is of interest.
- General language generation tasks, given its Llama 2 foundation.