MRNH/llama-2-13b-chat-hf

Text Generation · Model Size: 13B · Quantization: FP8 · Context Length: 4k · Architecture: Transformer

MRNH/llama-2-13b-chat-hf is a 13-billion-parameter conversational language model based on Llama 2 and developed by MRNH. The model is fine-tuned specifically for chat applications and supports a 4096-token context length. Its training incorporated 4-bit quantization via bitsandbytes, making it memory-efficient to deploy while preserving performance for interactive dialogue.


Model Overview

MRNH/llama-2-13b-chat-hf is a 13-billion-parameter language model built on the Llama 2 architecture and designed specifically for chat-based interactions. It supports a context length of 4096 tokens, enabling it to handle moderately long conversational exchanges.

Training Details

This model was trained using bitsandbytes 4-bit quantization, with the nf4 quantization type and a float16 compute dtype. This approach reduces memory usage during training and inference, making the model suitable for resource-constrained environments. Fine-tuning used PEFT (parameter-efficient fine-tuning) version 0.5.0.

Key Characteristics

  • Base Model: Llama 2
  • Parameter Count: 13 billion
  • Context Window: 4096 tokens
  • Quantization: Trained with bitsandbytes 4-bit quantization (nf4 type, float16 compute dtype)

Use Cases

This model is well-suited for applications requiring:

  • Conversational AI: Engaging in dialogue, answering questions, and generating human-like text in a chat format.
  • Resource-Efficient Deployment: Its 4-bit quantization makes it a candidate for deployment on hardware with limited memory, while still offering the capabilities of a 13B parameter model.
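For chat use, Llama 2 chat models generally expect the `[INST] ... [/INST]` prompt format with an optional `<<SYS>>` system block. Assuming this fine-tune follows the standard Llama-2-chat convention (the card does not state otherwise; check the tokenizer's chat template to confirm), a single-turn prompt can be built like this:

```python
# Builds a single-turn prompt in the standard Llama 2 chat format:
# <s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]
# This assumes the fine-tune uses the base Llama-2-chat template.
def build_llama2_prompt(user_message: str, system_prompt: str = "") -> str:
    sys_block = ""
    if system_prompt:
        sys_block = f"<<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
    return f"<s>[INST] {sys_block}{user_message} [/INST]"

prompt = build_llama2_prompt(
    "What is 4-bit quantization?",
    system_prompt="You are a helpful assistant.",
)
```

The resulting string is passed to the tokenizer as the generation input; the model's reply is everything produced after the closing `[/INST]` tag.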