Overview
hishab/titulm-llama-3.2-1b-v1.0 is a 1.23 billion parameter model based on the Llama 3.2 architecture, continually pre-trained by Hishab. Its primary differentiator is its specialization in the Bangla language, achieved through extensive pretraining on a curated 268 GB Bangla text corpus totaling 6 billion tokens. This process significantly improves both its Bangla text generation quality and its understanding of the language.
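The model can be loaded like any other Llama-family checkpoint through the Hugging Face `transformers` library. The sketch below is a minimal, illustrative example; the prompt and generation parameters are assumptions, not recommended defaults from the model authors.

```python
# Minimal sketch: Bangla text generation with transformers.
# The sampling settings (temperature, max_new_tokens) are illustrative
# assumptions, not values recommended by the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "hishab/titulm-llama-3.2-1b-v1.0"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Continue a Bangla prompt with the TituLM model."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.7,
        )
    return tokenizer.decode(output[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Example Bangla prompt ("The capital of Bangladesh").
    print(generate("বাংলাদেশের রাজধানী"))
```

Because this is a base (not instruction-tuned) checkpoint, it is best prompted with text to be continued rather than with chat-style instructions.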
Key Capabilities
- Superior Bangla Text Generation: Optimized for producing fluent and contextually relevant Bangla text.
- Enhanced Bangla Language Understanding: Demonstrates improved performance on various Bangla evaluation benchmarks.
- Multilingual Support: Primarily supports Bengali, with secondary capabilities in English.
- Grouped-Query Attention (GQA): Utilizes GQA for improved inference scalability, a feature inherited from the Llama 3.2 family.
Good for
- Bangla Text Generation: Ideal for applications requiring natural and accurate text output in Bengali.
- Bangla Language Understanding Tasks: Suitable for tasks like question answering, summarization, and sentiment analysis in Bengali.
- Bangla Instruction Fine-tuning: Can be further fine-tuned for specific instruction-following tasks in the Bangla language.
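For the instruction fine-tuning use case above, a conventional supervised fine-tuning loop with `transformers.Trainer` applies. The sketch below is a hypothetical outline, assuming a user-supplied Bangla instruction dataset; the dataset columns, prompt template, and hyperparameters are all illustrative assumptions.

```python
# Hypothetical sketch: supervised fine-tuning on Bangla instruction data.
# The prompt template, dataset columns, and hyperparameters are assumptions
# for illustration; substitute your own data and settings.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_ID = "hishab/titulm-llama-3.2-1b-v1.0"


def format_example(instruction: str, response: str) -> str:
    """Join an instruction/response pair into one training string.

    This template is an illustrative assumption, not an official format.
    """
    return f"### নির্দেশ:\n{instruction}\n\n### উত্তর:\n{response}"


def finetune(dataset):
    """Run causal-LM fine-tuning on a pre-tokenized dataset."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
    args = TrainingArguments(
        output_dir="titulm-bangla-sft",  # illustrative path
        per_device_train_batch_size=4,
        num_train_epochs=1,
        learning_rate=2e-5,
    )
    trainer = Trainer(
        model=model,
        args=args,
        train_dataset=dataset,
        data_collator=collator,
    )
    trainer.train()
```

Parameter-efficient methods such as LoRA are a common alternative at this model size, but the full-parameter loop above is the simplest starting point.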
Performance Highlights
Compared to the base llama-3.2-1b model, titulm-llama-3.2-1b-v1.0 shows stronger performance on several Bangla benchmarks:
- Achieves 0.31 in Commonsense QA BN (5-shot) compared to 0.23.
- Scores 0.34 in OpenBook QA BN (5-shot) compared to 0.31.
- Reaches 0.57 in PIQA BN (5-shot) compared to 0.54.
Note that while the model excels in Bangla, its English benchmark scores are generally lower than those of the base llama-3.2-1b model, an expected trade-off of its Bangla-focused training.