lballore/llimba-3b-instruct
LLiMba-3B-Instruct by lballore is a 3.1 billion parameter instruction-tuned causal language model, extended from Qwen2.5-3B-Instruct, specifically designed for fluent Sardinian (LSC) communication. It retains the base model's multilingual capabilities while excelling in Sardinian conversation, translation, and text analysis. This model is optimized for low-resource language preservation and research, running efficiently on a single consumer GPU.
Loading preview...
LLiMba-3B-Instruct: Sardinian-Capable LLM
LLiMba-3B-Instruct is a 3.1 billion parameter instruction-tuned model developed by lballore, building upon Qwen2.5-3B-Instruct. Its primary distinction is its fluent Sardinian (LSC) capabilities, making it, to the developer's knowledge, the first openly released LLM capable of holding conversations, translating, and analyzing text in Sardinian. This model was adapted using continued pretraining and supervised fine-tuning, a process achievable on a single 24GB consumer GPU.
Key Capabilities
- Sardinian Language Proficiency: Excels in conversational Sardinian, translation to/from Sardinian, and text analysis, supporting LSC (Limba Sarda Comuna) and accepting Logudorese and Campidanese input.
- Multilingual Support: Retains the extensive multilingual coverage of its Qwen2.5 base model, including English, Chinese, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, and Arabic.
- Translation Performance: Achieves significant improvements in Sardinian translation, with BLEU scores up to 28.47 (EN-SC) and 41.28 (SC-EN) on the FLORES-200 dataset, compared to the base model's low single-digit scores.
- Efficient Training: The entire adaptation pipeline, including continued pretraining and supervised fine-tuning, was completed on a single NVIDIA RTX 4090 GPU.
Good For
- Sardinian Language Preservation: A valuable tool for research, education, and personal use by speakers and learners of Sardinian.
- Translation Tasks: Effective for translating between Sardinian and other Romance languages or English.
- Conversational AI: Suitable for conversational practice and factual recall on Sardinian topics.
- Low-Resource Language NLP: Serves as a strong starting point for further research in Sardinian natural language processing.