hishab/titulm-llama-3.2-3b-v1.1

Text Generation | Concurrency Cost: 1 | Model Size: 3.2B | Quant: BF16 | Ctx Length: 32k | Published: Oct 6, 2024 | License: llama3.2 | Architecture: Transformer

hishab/titulm-llama-3.2-3b-v1.1 is a 3.2-billion-parameter model built on the Llama 3.2 architecture and continually pre-trained by Hishab on a large Bangla corpus. It is optimized for Bangla text generation and understanding, and it outperforms the base llama-3.2-3b model on Bangla evaluation benchmarks. Bangla (Bengali) is the primary language; English is retained as a secondary capability. The model is aimed at applications that need strong Bangla language processing.


Model Overview

hishab/titulm-llama-3.2-3b-v1.1 is a 3.2-billion-parameter language model developed by Hishab and based on the Meta Llama 3.2 architecture. It was continually pre-trained on a large, curated corpus of Bangla text (37 billion tokens drawn from 268 GB of data), which significantly improves its proficiency in Bengali. It supports a context length of 32,768 tokens and uses Grouped-Query Attention (GQA).
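As a rough sanity check on the figures above, the following sketch derives two back-of-envelope numbers: the average amount of raw text per training token, and the memory needed for the weights alone in BF16. It assumes decimal gigabytes (1 GB = 10^9 bytes) and 2 bytes per BF16 parameter. The constant names are illustrative, and the results ignore activations, the KV cache, and framework overhead.

```python
# Figures quoted in this model card.
CORPUS_BYTES = 268e9      # 268 GB of Bangla text (assumes 1 GB = 10**9 bytes)
CORPUS_TOKENS = 37e9      # 37 billion training tokens
NUM_PARAMS = 3.2e9        # 3.2 billion parameters
BYTES_PER_BF16 = 2        # bfloat16 stores each value in 16 bits

# Average amount of raw text consumed per training token.
bytes_per_token = CORPUS_BYTES / CORPUS_TOKENS

# Memory for the weights only (no activations, KV cache, or overhead).
weight_gb = NUM_PARAMS * BYTES_PER_BF16 / 1e9

print(f"{bytes_per_token:.2f} bytes of text per token")
print(f"{weight_gb:.1f} GB of weights in BF16")
```

Under these assumptions the corpus averages about 7.2 bytes of text per token and the weights occupy about 6.4 GB. Bangla characters take 3 bytes each in UTF-8, so if the 268 GB figure refers to UTF-8 text, each token covers only a few characters on average.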

Key Capabilities & Differentiators

  • Superior Bangla Language Performance: Outperforms the base llama-3.2-3b model across multiple Bangla benchmark datasets, including BoolQ BN, Commonsense QA BN, OpenBook QA BN, and PIQA BN, in both 0-shot and 5-shot settings.
  • Extensive Bangla Training Data: Trained on a diverse Bangla dataset comprising web documents, books, transcribed text, translated text, code-mixed text, transliterated text, and synthetic data.
  • Bilingual Support: Primarily focused on Bengali, with secondary capabilities in English.
  • Llama 3.2 Architecture: Benefits from the optimized transformer architecture of the Llama 3.2 family.
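The 0-shot and 5-shot settings mentioned above differ only in how many solved examples precede the question in the prompt. The sketch below shows one common way to assemble such a prompt. It is not the evaluation harness used for the reported results; the function name, the `Question:`/`Answer:` template, and the sample items are all hypothetical.

```python
from typing import List, Tuple


def build_k_shot_prompt(
    examples: List[Tuple[str, str]], question: str, k: int
) -> str:
    """Prefix `question` with the first `k` solved (question, answer) pairs."""
    if k < 0 or k > len(examples):
        raise ValueError("k must be between 0 and len(examples)")
    blocks = [f"Question: {q}\nAnswer: {a}" for q, a in examples[:k]]
    # The final block leaves the answer empty for the model to complete.
    blocks.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(blocks)


# Hypothetical yes/no items in the style of a BoolQ-like task.
demo = [
    ("Is Dhaka the capital of Bangladesh?", "yes"),
    ("Is the Padma a mountain?", "no"),
]

zero_shot = build_k_shot_prompt(demo, "Is Bangla written left to right?", k=0)
two_shot = build_k_shot_prompt(demo, "Is Bangla written left to right?", k=2)
```

With `k=0` the prompt contains only the open question; with `k=5` it would be preceded by five solved examples drawn from the benchmark's training or development split.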

Intended Use Cases

  • Bangla text generation
  • Bangla language understanding tasks
  • Bangla instruction fine-tuning tasks
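For the text generation use case, a minimal sketch using the Hugging Face `transformers` text-generation pipeline might look like the following. It assumes `transformers`, `torch`, and `accelerate` are installed and that the weights are accessible under the model ID shown on this page. The helper names `load_titulm` and `generate` are illustrative and not part of any library. Because the model is continually pre-trained rather than instruction-tuned, the prompt is written as text to be continued.

```python
from typing import Any


def load_titulm(model_id: str = "hishab/titulm-llama-3.2-3b-v1.1") -> Any:
    """Build a text-generation pipeline (downloads the weights on first call)."""
    # Imported lazily so this module can be imported without torch installed.
    import torch
    from transformers import pipeline

    return pipeline(
        "text-generation",
        model=model_id,
        torch_dtype=torch.bfloat16,  # matches the BF16 weights listed on this page
        device_map="auto",           # assumption: `accelerate` is installed
    )


def generate(generator: Any, prompt: str, max_new_tokens: int = 64) -> str:
    """Return the prompt plus its greedy continuation."""
    outputs = generator(prompt, max_new_tokens=max_new_tokens, do_sample=False)
    return outputs[0]["generated_text"]


# Example usage (requires the model weights and enough memory for them):
#   generator = load_titulm()
#   print(generate(generator, "বাংলাদেশের রাজধানী"))
```

Greedy decoding (`do_sample=False`) keeps the output deterministic for a quick check; sampling parameters can be passed instead for more varied text.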

The base llama-3.2-3b model generally scores higher on English benchmarks; hishab/titulm-llama-3.2-3b-v1.1 is instead optimized for applications that need strong performance in Bengali.