henrikclh/llama-2-7b-Arch1

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Architecture: Transformer | Cold

henrikclh/llama-2-7b-Arch1 is a 7-billion-parameter language model based on the Llama 2 architecture. It was trained with 4-bit quantization via the bitsandbytes library, specifically nf4 quantization with a float16 compute dtype. The model is suitable for general language generation tasks where memory-efficient training is a priority.


Model Overview

henrikclh/llama-2-7b-Arch1 builds on the Llama 2 architecture with 7 billion parameters. Its training leveraged bitsandbytes for efficient quantization, using 4-bit nf4 quantization with float16 computation, which reduces the memory footprint of the training phase.

Key Training Details

  • Quantization: Utilizes bitsandbytes with load_in_4bit: True and bnb_4bit_quant_type: nf4.
  • Compute Dtype: bnb_4bit_compute_dtype was set to float16.
  • Framework: Trained with PEFT version 0.4.0.

Good For

  • Experimenting with Llama 2-based models that were trained under 4-bit quantization.
  • Use cases where the training methodology, particularly the bitsandbytes configuration, is of interest.
  • General language generation tasks, given its Llama 2 foundation.