psychopenguin/indian_legal_llama3.2-3b-instruct
The psychopenguin/indian_legal_llama3.2-3b-instruct model is a 3.21 billion parameter instruction-tuned model based on Meta's Llama 3.2. It uses an optimized transformer architecture with Grouped-Query Attention (GQA) and a 32k-token context length. The model is optimized for multilingual dialogue use cases, including agentic retrieval and summarization, and supports English, Hindi, German, French, Italian, Portuguese, Spanish, and Thai.
Model Overview
This model is an instruction-tuned variant from Meta's Llama 3.2 collection, featuring 3.21 billion parameters and a 32,768-token context length. It is built on an optimized transformer architecture, incorporating Grouped-Query Attention (GQA) for enhanced inference scalability. The model has been fine-tuned using supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align with human preferences for helpfulness and safety, particularly in multilingual contexts.
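A minimal inference sketch with the `transformers` library, assuming the Hugging Face repo id matches the model title above; the system prompt and generation settings are illustrative, not an official recipe.

```python
MODEL_ID = "psychopenguin/indian_legal_llama3.2-3b-instruct"  # assumed repo id

def build_messages(question: str) -> list[dict]:
    """Compose a conversation in the standard chat messages format."""
    return [
        {"role": "system", "content": "You are a helpful multilingual assistant."},
        {"role": "user", "content": question},
    ]

def main() -> None:
    # Heavy imports kept local so the helper above stays importable without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # ~2 bytes/weight for the 3.21B parameters
        device_map="auto",           # spread layers over available GPU/CPU memory
    )
    # apply_chat_template inserts the Llama 3.2 special tokens and the
    # assistant header that cues the model to generate a reply.
    inputs = tokenizer.apply_chat_template(
        build_messages("Summarize the key features of this model."),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=256)
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))

if __name__ == "__main__":
    main()
```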
Key Capabilities
- Multilingual Dialogue: Optimized for conversational AI in English, Hindi, German, French, Italian, Portuguese, Spanish, and Thai.
- Agentic Applications: Designed for tasks like knowledge retrieval, summarization, and mobile AI-powered writing assistants.
- Quantization Support: Includes quantized versions (SpinQuant and QLoRA) for efficient deployment in constrained environments like mobile devices, with significantly faster decoding and a reduced memory footprint.
- Robust Safety Alignment: Incorporates comprehensive safety fine-tuning, including handling refusals and tone, and is intended to be deployed with additional system safeguards.
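As a rough illustration of why the quantized variants above matter on-device, here is a back-of-envelope estimate of weight storage at different precisions. This is a sketch only: it ignores activations, the KV cache, and per-group quantization overhead such as scales.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate dense-weight storage in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 3.21e9  # parameter count stated in this model card

bf16_gb = weight_memory_gb(N_PARAMS, 16)  # 16-bit baseline
int4_gb = weight_memory_gb(N_PARAMS, 4)   # rough 4-bit quantized footprint

print(f"bf16 ~= {bf16_gb:.1f} GB, 4-bit ~= {int4_gb:.1f} GB")
# bf16 ~= 6.4 GB, 4-bit ~= 1.6 GB
```

The roughly 4x reduction in weight memory is what makes a 3B-class model plausible on a phone-class device.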
Good For
- Developing assistant-like chat applications requiring multilingual support.
- Implementing agentic systems for information retrieval and summarization.
- Deploying LLM capabilities on devices with limited compute resources, leveraging its optimized quantized versions.
- Researching safety fine-tuning and robust model deployment strategies.
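For the constrained-deployment use case above, one common route is on-the-fly 4-bit loading via the `bitsandbytes` backend. This is a sketch under two assumptions: the repo id matches the model title, and you are loading the full-precision weights rather than the separate SpinQuant/QLoRA checkpoints the card mentions.

```python
def nf4_kwargs() -> dict:
    """Settings passed to transformers' BitsAndBytesConfig (NF4, double quant)."""
    return {
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",       # normal-float 4-bit data type
        "bnb_4bit_use_double_quant": True,  # quantize the quantization scales too
    }

def load_quantized(model_id: str = "psychopenguin/indian_legal_llama3.2-3b-instruct"):
    # Heavy imports kept local so nf4_kwargs stays importable without them.
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    return AutoModelForCausalLM.from_pretrained(
        model_id,  # assumed repo id, taken from the model title
        quantization_config=BitsAndBytesConfig(**nf4_kwargs()),
        device_map="auto",
    )
```

A GPU with bitsandbytes installed is required at load time; the returned model is used exactly like an unquantized one.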