jsyeom/llama-2-13b-hf-smooth

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · Published: Mar 24, 2026 · Architecture: Transformer

jsyeom/llama-2-13b-hf-smooth is a 13-billion-parameter Llama 2-based causal language model that has undergone SmoothQuant smoothing without any quantization. Developed by jsyeom, it retains the full precision of the original meta-llama/Llama-2-13b-hf while rescaling its activations to migrate outliers into the weights. This preprocessing prepares the model for future quantization, making it particularly suitable for research and development in efficient inference. It keeps the original 4096-token context length, offering a smoothed foundation for downstream quantization experiments and general natural language processing tasks.


jsyeom/llama-2-13b-hf-smooth: A Smoothed Llama 2-13B Model

This model, developed by jsyeom, is a 13-billion-parameter variant of the meta-llama/Llama-2-13b-hf base model. Its distinguishing feature is the application of SmoothQuant smoothing to its internal activations. Crucially, no quantization has been applied; the model retains its full precision.

Key Characteristics

  • Base Model: meta-llama/Llama-2-13b-hf
  • Smoothing Technique: SmoothQuant smoothing applied to activations.
  • Smoothing Alpha: Configured with an alpha value of 0.85, which sets the migration strength — how much of the activation outlier magnitude is shifted into the weights.
  • Act Scales Source: Utilizes mit-han-lab/smoothquant-scales for activation scales.
  • Precision: Remains a full-precision model, as only smoothing, not quantization, has been performed.
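
The smoothing step behind this model can be sketched as follows. This is a minimal NumPy illustration of the SmoothQuant scale formula s_j = max|X_j|^α / max|W_j|^(1−α), not the actual conversion script; the function names and tensor shapes here are assumptions for illustration.

```python
import numpy as np

def smooth_scales(act_max, weight, alpha=0.85):
    """SmoothQuant per-input-channel scales: s_j = max|X_j|**a / max|W_j|**(1-a).

    act_max: per-channel max-abs activation from calibration, shape [in_features]
    weight:  linear-layer weight, shape [out_features, in_features]
    """
    w_max = np.maximum(np.abs(weight).max(axis=0), 1e-5)
    a_max = np.maximum(act_max, 1e-5)
    return a_max ** alpha / w_max ** (1 - alpha)

def smooth_linear(weight, act_max, alpha=0.85):
    """Migrate activation outliers into the weights: W' = W * s, X' = X / s."""
    s = smooth_scales(act_max, weight, alpha)
    return weight * s, s
```

Because X @ W.T == (X / s) @ (W * s).T, smoothing is mathematically a no-op at full precision — which is why this repository can remain unquantized. In practice the 1/s factor is folded into the preceding normalization layer rather than applied at runtime.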

Purpose and Potential Use Cases

This model is particularly relevant for researchers and developers exploring post-training quantization (PTQ) or quantization-aware training. By providing a smoothed version of Llama 2-13B, it offers a pre-processed foundation that can lead to better accuracy when subsequently quantized to lower bit-widths, because the activation outliers that make naive quantization lossy have already been migrated into the weights. It also allows experimentation with SmoothQuant's effects in full precision, before any quantization is applied, making it a useful intermediate artifact in an optimization pipeline for efficient deployment of large language models.
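
The payoff of smoothing shows up once activations are later quantized. The sketch below is an illustrative NumPy experiment (not this model's evaluation): it fake-quantizes activations to 8 bits with and without the outlier migration and compares each result against the full-precision matmul. Weight quantization is omitted for clarity; the layer sizes and outlier magnitude are invented for the demo.

```python
import numpy as np

def fake_quant(x, bits=8):
    """Symmetric per-tensor fake quantization: snap to an int grid and back."""
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 16))          # a toy linear layer
X = rng.normal(size=(32, 16))         # activations at inference time
X[:, 0] *= 80.0                       # one outlier channel, as seen in LLMs
ref = X @ W.T                         # full-precision reference output

# Naive A8: the outlier channel dominates the per-tensor scale, so all
# other channels are quantized very coarsely.
naive = fake_quant(X) @ W.T

# SmoothQuant (alpha = 0.85, as in this model): divide activations by s and
# multiply weights by s; the product is unchanged but X/s is easy to quantize.
alpha = 0.85
s = (np.abs(X).max(axis=0) ** alpha
     / np.maximum(np.abs(W).max(axis=0), 1e-5) ** (1 - alpha))
smoothed = fake_quant(X / s) @ (W * s).T

err_naive = np.abs(naive - ref).mean()
err_smooth = np.abs(smoothed - ref).mean()
```

In this setup the smoothed path yields a noticeably lower mean absolute error than the naive one, which is the effect that makes a pre-smoothed checkpoint like this a convenient starting point for W8A8-style PTQ.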