Konkani Qwen2-1.5B: The First Dedicated Konkani LLM
nischay185/konkani-qwen2-1.5b is a 1.5-billion-parameter language model developed by Nischay Mandrekar, and the first open-source LLM trained specifically for the Konkani language (Goan dialect, Devanagari script). It builds on the Qwen2-1.5B base model and underwent a two-stage training process:
Training Details
- Continued Pre-Training (CPT): Utilized 300,000 chunks of pure Konkani text.
- Supervised Fine-Tuning (SFT): Trained on 13,417 curated Konkani instruction-response pairs across 12 categories, including chitchat, history, culture, food, poetry, and translation.
Key Capabilities
- Konkani Language Generation: Holds conversations and writes poetry and short stories in pure Konkani.
- Cultural Knowledge: Answers questions and describes aspects of Goan culture, history, festivals, and traditions.
- Translation: Capable of translating English/Hindi to Konkani.
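A minimal inference sketch for the capabilities above, using the standard Hugging Face transformers API. The instruction-style prompt wrapper is an assumption for illustration; the model's actual SFT chat template may differ, so check the model card before relying on it.

```python
# Minimal inference sketch for nischay185/konkani-qwen2-1.5b.
# The "### Instruction / ### Response" wrapper below is a hypothetical
# format, not the confirmed SFT template.
MODEL_ID = "nischay185/konkani-qwen2-1.5b"


def build_prompt(instruction: str) -> str:
    # Hypothetical instruction wrapper; the real fine-tuning template may differ.
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


if __name__ == "__main__":
    # Lazy import so the helper above stays usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    prompt = build_prompt("गोंयच्या शिगम्या उत्सवाविशीं सांग.")
    inputs = tokenizer(prompt, return_tensors="pt")
    # A generous max_new_tokens helps avoid the truncation noted under Limitations.
    output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

The same pattern works for translation prompts (e.g. an English sentence in the instruction asking for Konkani output).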
Good For
- Applications requiring natural language understanding and generation in the Konkani language.
- Educational tools focused on Konkani culture and language.
- Creative writing and content generation in Konkani.
Limitations
- Responses may occasionally truncate; adjusting max_new_tokens is recommended.
- May hallucinate names in identity-related responses.
- Best performance is observed with Goa/Konkani-specific topics.
- Not suitable for tasks requiring real-time information.
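To work around the truncation limitation above, generation settings can be tuned. The keys below are standard transformers `generate()` keyword arguments; the specific values are illustrative assumptions, not recommendations from the model author.

```python
# Illustrative generation settings to reduce mid-answer truncation.
# Keys are standard transformers generate() kwargs; values are assumptions to tune.
generation_kwargs = {
    "max_new_tokens": 512,      # larger token budget so answers can finish
    "do_sample": True,
    "temperature": 0.7,
    "top_p": 0.9,
    "repetition_penalty": 1.1,  # mild penalty against looping output
}

# Usage (sketch): model.generate(**inputs, **generation_kwargs)
print(generation_kwargs["max_new_tokens"])
```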