sminchoi/Llama-2-13b-chat-hf_guanaco-llama2-1k_230914_A40

Text Generation · Concurrency cost: 1 · Model size: 13B · Quantization: FP8 · Context length: 4k · Architecture: Transformer · Cold start

The sminchoi/Llama-2-13b-chat-hf_guanaco-llama2-1k_230914_A40 model is a 13 billion parameter language model based on the Llama-2-13b-chat-hf architecture, fine-tuned by sminchoi. It was fine-tuned using 4-bit quantization with the bitsandbytes library, specifically employing the nf4 quantization type. Building on its Llama-2-chat foundation, the model is designed for chat-based applications and generating conversational responses.


Model Overview

The sminchoi/Llama-2-13b-chat-hf_guanaco-llama2-1k_230914_A40 is a 13 billion parameter language model built upon the Llama-2-13b-chat-hf architecture. This model has been fine-tuned by sminchoi, focusing on conversational capabilities.

Training Details

The model underwent training utilizing 4-bit quantization via the bitsandbytes library. Key quantization parameters include:

  • load_in_4bit: True
  • bnb_4bit_quant_type: nf4
  • bnb_4bit_compute_dtype: float16

This quantization approach reduces memory usage during training and inference while largely preserving output quality. Fine-tuning also used PEFT (Parameter-Efficient Fine-Tuning) version 0.6.0.dev0.
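As a minimal sketch, the quantization parameters above map directly onto a `BitsAndBytesConfig` in the Hugging Face `transformers` library. The helper function name (`load_model`) is illustrative, not part of the card; loading requires `transformers`, `bitsandbytes`, and a CUDA-capable GPU:

```python
# Quantization parameters exactly as listed on the card.
# BitsAndBytesConfig also accepts the compute dtype as a string ("float16").
quant_params = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_compute_dtype": "float16",
}


def load_model(model_id: str):
    """Illustrative helper: load the model with the card's 4-bit settings.

    Downloads the checkpoint on first use; requires transformers,
    bitsandbytes, and a CUDA GPU.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(**quant_params)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",  # place layers across available GPUs automatically
    )
    return tokenizer, model


# Usage (downloads the 13B checkpoint):
# tokenizer, model = load_model(
#     "sminchoi/Llama-2-13b-chat-hf_guanaco-llama2-1k_230914_A40"
# )
```

With nf4 quantization, weights are stored in 4 bits while matrix multiplications are computed in float16, which is what the `bnb_4bit_compute_dtype` setting controls.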

Intended Use

Given its Llama-2-chat foundation and fine-tuning, this model is primarily suited for:

  • Chat applications: Engaging in conversational dialogues.
  • Instruction following: Responding to user prompts and instructions in a chat format.
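Since the model inherits Llama-2-chat's training, prompts generally work best in the `[INST] ... [/INST]` turn format with an optional `<<SYS>>` system block. A small, hypothetical formatting helper (the function name and default system message are illustrative):

```python
def build_llama2_prompt(
    user_msg: str,
    system_msg: str = "You are a helpful assistant.",
) -> str:
    """Format a single-turn prompt in the Llama-2-chat convention.

    The tokenizer adds the leading <s> token, so it is omitted here.
    """
    return f"[INST] <<SYS>>\n{system_msg}\n<</SYS>>\n\n{user_msg} [/INST]"


prompt = build_llama2_prompt("What is 4-bit quantization?")
# The resulting string can be tokenized and passed to model.generate().
```

For multi-turn chat, each prior exchange is appended as `[INST] user [/INST] assistant` before the new `[INST]` turn, following the same convention.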