kstecenko/Tinny-LLAMA2-extractor

Text Generation · Model Size: 1.1B · Quantization: BF16 · Context Length: 2k · Architecture: Transformer · Concurrency Cost: 1

The kstecenko/Tinny-LLAMA2-extractor is a model developed by kstecenko, fine-tuned with bitsandbytes 4-bit quantization using the nf4 quantization type and double quantization enabled. The model was adapted with PEFT 0.6.0.dev0 for parameter-efficient fine-tuning. Its defining characteristic is the application of these quantization techniques during training, making it suitable for scenarios where a reduced memory footprint and computational efficiency are critical.


Overview

The kstecenko/Tinny-LLAMA2-extractor is distinguished by a training procedure that relies heavily on bitsandbytes 4-bit quantization. This approach is designed to optimize the model's efficiency, likely targeting deployment in resource-constrained environments.

Key Training Details

  • Quantization: The model was trained with load_in_4bit: True, employing the nf4 quantization type and bnb_4bit_use_double_quant: True for enhanced precision within the 4-bit scheme.
  • Compute Data Type: bfloat16 was used as the compute data type during 4-bit quantization, balancing numerical stability with performance.
  • Framework: The training process leveraged PEFT (Parameter-Efficient Fine-Tuning) version 0.6.0.dev0, indicating a focus on efficient adaptation rather than full model retraining. These settings map directly onto the transformers and bitsandbytes APIs, as shown in the sketch after this list.
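The following is a minimal sketch of how these reported settings translate into a BitsAndBytesConfig, assuming the repository hosts full model weights loadable through AutoModelForCausalLM (adapter-only PEFT repos need the loading approach sketched in the next section):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Reproduce the quantization settings reported on the model card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # load_in_4bit: True
    bnb_4bit_quant_type="nf4",              # nf4 quantization type
    bnb_4bit_use_double_quant=True,         # double quantization enabled
    bnb_4bit_compute_dtype=torch.bfloat16,  # bfloat16 compute dtype
)

# Assumption: the repo contains full weights; if it stores only a PEFT
# adapter, load the base model here and attach the adapter with peft.
model = AutoModelForCausalLM.from_pretrained(
    "kstecenko/Tinny-LLAMA2-extractor",
    quantization_config=bnb_config,
    device_map="auto",
)
```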

Potential Use Cases

  • Resource-constrained environments: The 4-bit quantization makes this model potentially suitable for deployment on devices with limited memory or computational power.
  • Efficient fine-tuning: The use of PEFT suggests it's designed for scenarios where rapid and efficient adaptation to new tasks is desired without extensive computational overhead; a hedged loading sketch follows this list.
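Since PEFT repositories often contain only adapter weights, here is a minimal inference sketch assuming an adapter-only repo. AutoPeftModelForCausalLM (available since PEFT 0.6.0) resolves the base model from the adapter config; the prompt text and the presence of tokenizer files in the repo are assumptions:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Assumption: the repo stores a PEFT adapter; AutoPeftModelForCausalLM
# downloads the base model named in the adapter config and attaches it.
model = AutoPeftModelForCausalLM.from_pretrained(
    "kstecenko/Tinny-LLAMA2-extractor",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Assumption: tokenizer files are present in the repo (otherwise load the
# tokenizer from the base model named in adapter_config.json).
tokenizer = AutoTokenizer.from_pretrained("kstecenko/Tinny-LLAMA2-extractor")

# Hypothetical extraction prompt; the card does not document a prompt format.
prompt = "Extract the key fields from the following text: ..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```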