Overview
The arif-butt/tinyllama-peft-merged model is a 1.1-billion-parameter language model based on the TinyLlama architecture. It was fine-tuned with LoRA via PEFT (Parameter-Efficient Fine-Tuning), and the adapter weights were subsequently merged back into the base model, so inference requires no separate PEFT adapters. The weights are distributed in PyTorch safetensors format at FP16 precision, and the model supports a context length of 2048 tokens.
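Because the LoRA weights are already merged, the checkpoint loads like any other causal LM. A minimal loading sketch (the `device_map="auto"` placement assumes the `accelerate` package is installed; adjust as needed):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "arif-butt/tinyllama-peft-merged"

def load_model(model_id: str = MODEL_ID):
    """Load the merged checkpoint directly -- no PeftModel wrapper needed."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # weights are stored in FP16
        device_map="auto",          # assumes accelerate is installed
    )
    return tokenizer, model
```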
Key Capabilities
- Direct Inference: No PEFT setup is needed; simply load and use the model with the standard Hugging Face AutoModelForCausalLM and AutoTokenizer classes.
- Compact Size: At 1.1 billion parameters (roughly 2.2 GB in FP16), it offers a balance between performance and resource efficiency.
- Standard Prompt Format: Utilizes a clear "Q: ...\nA:" prompt structure for instruction-following tasks.
- Configurable Generation: Supports common generation parameters such as max_new_tokens, temperature, top_p, do_sample, and repetition_penalty.
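Putting the prompt format and generation parameters together, a sketch of an inference helper (the specific parameter values below are illustrative defaults, not settings taken from the model card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def build_prompt(question: str) -> str:
    """Wrap a question in the model's expected "Q: ...\nA:" format."""
    return f"Q: {question.strip()}\nA:"

def answer(question: str, tokenizer, model, **gen_kwargs) -> str:
    """Generate an answer; keyword arguments override the sample defaults."""
    prompt = build_prompt(question)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=gen_kwargs.get("max_new_tokens", 128),
        temperature=gen_kwargs.get("temperature", 0.7),
        top_p=gen_kwargs.get("top_p", 0.9),
        do_sample=gen_kwargs.get("do_sample", True),
        repetition_penalty=gen_kwargs.get("repetition_penalty", 1.1),
        pad_token_id=tokenizer.eos_token_id,
    )
    # Strip the prompt tokens, keep only the newly generated answer.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```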
Good For
- Resource-constrained environments: Its smaller size makes it suitable for deployment where computational resources are limited.
- General text generation: Capable of answering questions and generating coherent text based on provided prompts.
- Rapid prototyping: The merged nature simplifies deployment, allowing for quick integration into applications.