kitft/Llama-3.3-70B-NLA-L53-av
The kitft/Llama-3.3-70B-NLA-L53-av model is the Activation Verbalizer (AV) component of a Natural Language Autoencoder (NLA) pair, fine-tuned from Meta's Llama-3.3-70B-Instruct. This 70-billion-parameter model maps hidden-state vectors to natural-language descriptions, serving as an interpretability tool for LLM activations. It is designed specifically for activation decoding and is not intended for general-purpose language generation. The model achieves an in-distribution fve_nrm of 0.80 on its training set.
Model Overview
kitft/Llama-3.3-70B-NLA-L53-av is the Activation Verbalizer (AV) component of a Natural Language Autoencoder (NLA) pair, derived from Meta's Llama-3.3-70B-Instruct. This 70-billion-parameter model is an interpretability tool that maps hidden-state vectors from the LLM's residual stream (specifically, the output of block 53) into natural-language descriptions.
It is intended to be used in conjunction with its paired Activation Reconstructor (AR) model, kitft/Llama-3.3-70B-NLA-L53-ar. Together, NLA pairs allow for the unsupervised explanation of LLM activations by converting internal representations into human-readable text and back.
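To make the AV's input concrete: it consumes a single residual-stream vector taken after decoder block 53 of the base model. Below is a minimal sketch, not the author's pipeline, of extracting such a vector with Hugging Face transformers; the prompt, token position, and dtype are arbitrary choices for illustration.

```python
# Minimal sketch: pull a block-53 residual-stream vector from the base model.
# The exact extraction hook used to train the NLA pair is not specified in this
# card; this simply reads the hidden state after decoder block 53.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Llama-3.3-70B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "The Eiffel Tower is located in"
inputs = tokenizer(prompt, return_tensors="pt").to(base.device)

with torch.no_grad():
    out = base(**inputs, output_hidden_states=True)

# hidden_states[0] is the embedding output, so index 53 is the residual stream
# after decoder block 53; here we take the vector at the last token position.
activation = out.hidden_states[53][0, -1]  # shape: (hidden_size,) = (8192,)
```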
Key Characteristics
- Purpose-built for Interpretability: This is not a general-purpose language model; fine-tuning has repurposed it entirely for activation decoding.
- NLA Component: Functions as the vector-to-text half of an NLA system, providing natural language descriptions of internal LLM states.
- Performance: Achieves an in-distribution fve_nrm of 0.80 on its training data (a 50/50 mix of WildChat and Ultra-FineWeb); one plausible reading of this metric is sketched after this list.
- Architecture: Fine-tuned from Llama-3.3-70B-Instruct, focusing on the residual stream output of block 53.
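This card does not define fve_nrm. A plausible reading, assumed here, is the fraction of variance explained between original block-53 activations and their AV→AR round-trip reconstructions, computed on norm-normalized vectors. The sketch below implements that interpretation and should be treated as an illustration, not the evaluation code behind the reported 0.80.

```python
import torch

def fve_nrm(original: torch.Tensor, reconstructed: torch.Tensor) -> float:
    """Fraction of variance explained on normalized activations (assumed metric).

    original, reconstructed: (num_vectors, hidden_size) batches of block-53
    activations and their AV -> AR round-trip reconstructions.
    """
    # Normalize each vector to unit norm before comparison; the "_nrm" suffix
    # is assumed to refer to this normalization.
    orig = original / original.norm(dim=-1, keepdim=True)
    recon = reconstructed / reconstructed.norm(dim=-1, keepdim=True)

    residual = (orig - recon).pow(2).sum()
    total = (orig - orig.mean(dim=0)).pow(2).sum()
    return float(1.0 - residual / total)
```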
Intended Use Case
This model is specifically for researchers and developers working on LLM interpretability, particularly those interested in understanding and explaining the internal activations of large language models. It provides a method to verbalize what a specific hidden-state vector "means" within the model's processing.
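Neither the prompt template nor the activation-injection mechanism the AV expects is documented in this card, so any usage sketch is speculative. Assuming the activation is passed as a single soft token prepended to a short instruction (purely an assumption for illustration), a call might look like the following.

```python
# Speculative sketch: verbalize a block-53 activation with the AV.
# The real prompt format and injection scheme for this model are not documented
# in this card; the soft-token splice below is an illustrative assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

av_id = "kitft/Llama-3.3-70B-NLA-L53-av"
tokenizer = AutoTokenizer.from_pretrained(av_id)
av = AutoModelForCausalLM.from_pretrained(
    av_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Stand-in for a real block-53 activation (see the extraction sketch above).
activation = torch.randn(8192, dtype=torch.bfloat16)

instruction = "Describe the concept represented by the preceding activation:"
tok = tokenizer(instruction, return_tensors="pt").to(av.device)
text_embeds = av.get_input_embeddings()(tok.input_ids)  # (1, seq_len, hidden)

# Prepend the activation as one soft token (hypothetical injection scheme).
soft_token = activation.to(av.device).view(1, 1, -1)
inputs_embeds = torch.cat([soft_token, text_embeds], dim=1)
attention_mask = torch.ones(
    inputs_embeds.shape[:2], dtype=torch.long, device=av.device
)

with torch.no_grad():
    generated = av.generate(
        inputs_embeds=inputs_embeds,
        attention_mask=attention_mask,
        max_new_tokens=64,
    )

# With inputs_embeds, generate() returns only the newly generated tokens.
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```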