d-matrix/Llama3-8b

Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 8k · Published: Nov 11, 2024 · Architecture: Transformer

d-matrix/Llama3-8b is an 8-billion-parameter functional reference of the Llama 3 model, provided by d-Matrix. It ships with two configurations: a BASELINE that is functionally equivalent to the original model, and a BASIC version with all linear-algebraic operands quantized to MXINT8-64. The model is designed for evaluating the impact of d-Matrix's Dmx_Compressor on Llama 3's performance and functionality.


d-matrix/Llama3-8b Overview

d-matrix/Llama3-8b is an 8 billion parameter functional reference of the Llama 3 model, developed by d-Matrix. This model serves as a reference implementation to demonstrate and evaluate the capabilities of the d-Matrix Dmx_Compressor.

Key Configurations

The model is provided with two primary functional configurations:

  • BASELINE: This configuration is functionally equivalent to the original Llama 3 model, serving as an unquantized reference.
  • BASIC: In this configuration, all linear algebraic operands within the model are quantized to MXINT8-64, showcasing the effects of d-Matrix's compression technology.
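MXINT8-64 here denotes a microscaling-style block format: int8 element values that share one power-of-two scale per block of 64 values. The toy NumPy quantizer below is a sketch of that idea under those assumptions, not d-Matrix's actual implementation; the function name `quantize_mxint8` is ours.

```python
import numpy as np

def quantize_mxint8(x, block_size=64):
    """Toy MXINT8-style round trip: each block of `block_size` values
    shares one power-of-two scale; elements are stored as int8 codes.
    Returns the dequantized array for comparison against the input."""
    x = np.asarray(x, dtype=np.float64)
    pad = (-len(x)) % block_size
    blocks = np.pad(x, (0, pad)).reshape(-1, block_size)
    out = np.empty_like(blocks)
    for i, b in enumerate(blocks):
        amax = np.max(np.abs(b))
        if amax == 0.0:
            out[i] = 0.0
            continue
        # Shared power-of-two scale so the largest magnitude fits in [-127, 127].
        scale = 2.0 ** np.ceil(np.log2(amax / 127.0))
        q = np.clip(np.round(b / scale), -127, 127)  # int8 element codes
        out[i] = q * scale  # dequantize for error inspection
    return out.reshape(-1)[: len(x)]
```

Because the scale is shared across 64 elements, the worst-case rounding error of a block is half the block's scale, which is what makes the format cheap in hardware while staying close to per-tensor int8 accuracy.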

Usage and Evaluation

Developers can integrate this model with the d-Matrix Dmx_Compressor to transform and evaluate its performance. The provided example demonstrates how to load the model using dmx.compressor.modeling.DmxModel.from_torch and evaluate it with lm_eval on tasks like "wikitext", allowing for direct comparison between the baseline and quantized versions.
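The workflow above can be sketched as a small helper. This is a hedged sketch, assuming the `dmx-compressor` and `lm-eval` packages are installed; `evaluate_dmx_model` is a hypothetical name of ours, while `DmxModel.from_torch` and the "wikitext" task come from the card itself.

```python
# Hypothetical helper (the function name is ours) sketching the flow
# described above; assumes `dmx-compressor` and `lm-eval` are installed.
def evaluate_dmx_model(model_id: str = "d-matrix/Llama3-8b",
                       task: str = "wikitext"):
    """Wrap the reference checkpoint with Dmx_Compressor and score it
    with lm_eval, so baseline and quantized runs can be compared."""
    import lm_eval
    from dmx.compressor.modeling import DmxModel

    # Build an lm_eval harness model backed by the Hugging Face checkpoint.
    lm = lm_eval.api.registry.get_model("hf").create_from_arg_string(
        f"pretrained={model_id}", {}
    )
    # Transform the underlying torch module into a DmxModel, per the card.
    lm._model = DmxModel.from_torch(lm._model)
    # Evaluate; for "wikitext" the results include word-level perplexity.
    return lm_eval.simple_evaluate(model=lm, tasks=[task])
```

Running the helper once on the BASELINE configuration and once after applying the BASIC (MXINT8-64) transform gives a direct perplexity comparison between the two versions.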