MRNH/llama-2-13b-chat-hf
MRNH/llama-2-13b-chat-hf is a 13 billion parameter Llama 2-based conversational language model developed by MRNH. It is fine-tuned specifically for chat applications and supports a 4096-token context length. Training used 4-bit quantization via bitsandbytes, which keeps memory requirements low for deployment while preserving interactive-dialogue quality.
Model Overview
MRNH/llama-2-13b-chat-hf is a 13 billion parameter language model built upon the Llama 2 architecture, specifically designed for chat-based interactions. It supports a context length of 4096 tokens, enabling it to handle moderately long conversational turns.
Training Details
This model was trained using bitsandbytes 4-bit quantization, specifically employing the nf4 quantization type with float16 compute dtype. This approach allows for efficient memory usage during training and inference, making it suitable for environments with resource constraints. The training process utilized PEFT version 0.5.0.
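As a sketch of what a comparable setup looks like (the model ID is taken from this card; the exact keyword arguments assume a reasonably recent `transformers` release with `bitsandbytes` support, and loading requires a CUDA GPU), the nf4/float16 configuration described above can be expressed with `BitsAndBytesConfig`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantization settings matching this card:
# nf4 quantization type with a float16 compute dtype.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model_id = "MRNH/llama-2-13b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # requires a CUDA GPU and the bitsandbytes package
)
```

With this configuration, weights are stored in 4-bit nf4 format while matrix multiplications are carried out in float16, which is what makes a 13B model fit on hardware with limited GPU memory.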
Key Characteristics
- Base Model: Llama 2
- Parameter Count: 13 billion
- Context Window: 4096 tokens
- Quantization: Trained with `bitsandbytes` 4-bit quantization (`nf4` type, `float16` compute dtype)
Use Cases
This model is well-suited for applications requiring:
- Conversational AI: Engaging in dialogue, answering questions, and generating human-like text in a chat format.
- Resource-Efficient Deployment: Its 4-bit quantization makes it a candidate for deployment on hardware with limited memory, while still offering the capabilities of a 13B parameter model.
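For conversational use, Llama 2 chat models generally expect prompts in the `[INST] ... [/INST]` instruction format, with an optional system prompt wrapped in `<<SYS>> ... <</SYS>>`. A minimal helper for building a single-turn prompt in that format (the function name is illustrative, not part of this model's API) might look like:

```python
def build_llama2_chat_prompt(system_prompt: str, user_message: str) -> str:
    """Format a single-turn prompt in the Llama 2 chat template.

    The instruction is wrapped in [INST] ... [/INST] markers, with the
    system prompt placed inside <<SYS>> ... <</SYS>> at the start.
    """
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

prompt = build_llama2_chat_prompt(
    "You are a helpful assistant.",
    "Explain 4-bit quantization in one sentence.",
)
print(prompt)
```

The resulting string can be tokenized and passed to the model's `generate` method; multi-turn conversations repeat the `[INST] ... [/INST]` pattern with prior model replies interleaved between turns.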