PrunaAI/Llama-3.2-1b-Instruct-smashed

Text generation · Model size: 1B · Quantization: BF16 · Context length: 32k · Architecture: Transformer · Concurrency cost: 1

PrunaAI/Llama-3.2-1b-Instruct-smashed is a 1-billion-parameter instruction-tuned language model based on the Llama 3.2 architecture, developed by PrunaAI. The model has been optimized with the Pruna library, specifically through unstructured L1 pruning at 10% sparsity. It targets efficient deployment and inference in scenarios where a smaller, optimized model is beneficial, while retaining the full 32768-token context length.


Overview

PrunaAI/Llama-3.2-1b-Instruct-smashed is a 1-billion-parameter instruction-tuned model built on the Llama 3.2 architecture. Developed by PrunaAI, it is produced with the pruna library, an optimization framework for creating more efficient models. Its defining characteristic is unstructured L1 pruning at 10% sparsity: the 10% of weights with the smallest absolute values are zeroed out, which reduces the model's effective size and can improve inference speed, while the 32768-token context length is preserved.
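The pruning technique described above can be illustrated with PyTorch's built-in pruning utilities. This is a minimal sketch of unstructured L1 pruning at 10% sparsity on a toy linear layer, not the actual pruna pipeline used to produce this model; the layer shape is arbitrary and chosen for illustration.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy linear layer standing in for one transformer weight matrix.
layer = nn.Linear(64, 64)

# Unstructured L1 pruning: zero the 10% of weights with the smallest
# absolute value, mirroring the 10% sparsity described above.
prune.l1_unstructured(layer, name="weight", amount=0.1)

# Make the pruning permanent (removes the mask reparameterization).
prune.remove(layer, "weight")

sparsity = (layer.weight == 0).float().mean().item()
print(f"sparsity: {sparsity:.3f}")
```

After `prune.remove`, the zeros are baked directly into the weight tensor, so the layer behaves like an ordinary `nn.Linear` with a sparse weight matrix.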

Key Capabilities

  • Optimized Performance: Compressed with the pruna library, specifically through unstructured L1 pruning at 10% sparsity.
  • Efficient Deployment: Designed for scenarios that require smaller, more efficient models.
  • Instruction Following: As an instruction-tuned model, it can follow natural-language instructions and user commands.
  • Extended Context: Supports a 32768-token context window, allowing it to process long inputs.
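A pruned model like this one can typically be loaded and run like any other Hugging Face causal LM. The sketch below is a hypothetical usage example assuming the checkpoint follows the standard `transformers` layout and ships a chat template; it is not official usage documentation for this repository.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PrunaAI/Llama-3.2-1b-Instruct-smashed"

# BF16 matches the quantization listed for this model.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Instruction-tuned models expect chat-formatted input.
messages = [{"role": "user", "content": "Summarize what model pruning does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the pruning is unstructured, the checkpoint loads through the standard dense code path; no special sparse kernels are required to run it.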

Good For

  • Resource-Constrained Environments: Ideal for applications where computational resources or memory are limited.
  • Edge Device Deployment: Suitable for deployment on devices with less powerful hardware due to its optimized footprint.
  • Rapid Prototyping: Enables quicker iteration and testing with a smaller, faster model.
  • Applications requiring efficient Llama 3.2-based inference: Provides an optimized alternative to larger Llama 3.2 models for specific tasks.