FiLLM-POSDEPSUM: Filipino-Optimized NLP Model
FiLLM-POSDEPSUM is an 8.5-billion-parameter model in the FiLLM (Filipino-optimized Large Language Model) family, developed by Isaiah Job Cuenca Enriquez, Carlos Jude Maminta, and Deandre Nigel Corpuz Nuñez. It is fine-tuned from the SeaLLM-7B 2.5 model, which is itself based on Gemma 7B, using Low-Rank Adaptation (LoRA) for memory-efficient fine-tuning. The model is designed specifically for core Natural Language Processing (NLP) tasks in Filipino.
Key Capabilities
- Part-of-Speech (POS) Tagging: Identifies and labels the grammatical category of words in Filipino text.
- Dependency Parsing: Analyzes the grammatical structure of sentences by showing relationships between words.
- Text Summarization: Generates concise summaries of Filipino text.
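Each capability is typically driven by a task-specific natural-language prompt. The sketch below illustrates that pattern; the template wording and the `build_prompt` helper are assumptions for illustration, not the model's documented prompt format.

```python
# Hypothetical prompt templates for the three tasks. The exact wording
# FiLLM-POSDEPSUM expects is not specified in this card, so treat these
# strings as placeholders to adapt.
TEMPLATES = {
    "pos": "Tag each word in the following Filipino sentence with its part of speech: {text}",
    "dep": "Give the dependency parse of the following Filipino sentence: {text}",
    "sum": "Summarize the following Filipino text: {text}",
}

def build_prompt(task: str, text: str) -> str:
    """Fill in the template for the given task ('pos', 'dep', or 'sum')."""
    if task not in TEMPLATES:
        raise ValueError(f"unknown task: {task!r}")
    return TEMPLATES[task].format(text=text)
```

A call such as `build_prompt("sum", "...")` yields a single string that can be passed to any standard text-generation interface for the model.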
Training and Evaluation
The model was trained and evaluated on diverse Filipino datasets to ensure robust performance across its specialized tasks.

A related model, FiLLM-NER, handles Named Entity Recognition and is available separately. Note that the model may hallucinate, especially when a prompt does not end with a period.

For more detail, refer to the associated research paper: FiLLM - A Filipino-optimized Large Language Model based on Southeast Asia Large Language Model (SEALLM).
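Given the caveat about prompts that do not end with a period, callers may want to normalize prompts before inference. The helper below is a minimal sketch of such a guard; the function name is an assumption, not part of the model's API.

```python
def ensure_terminal_period(prompt: str) -> str:
    """Append a period when the prompt lacks sentence-final punctuation,
    which the model card suggests reduces hallucinations."""
    trimmed = prompt.rstrip()
    if trimmed and trimmed[-1] not in ".!?":
        return trimmed + "."
    return trimmed
```

Running every prompt through this guard before sending it to the model is a cheap way to avoid the degraded outputs described above.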
Good For
- Applications requiring grammatical analysis of Filipino text.
- Automated summarization of documents or articles in Filipino.
- Developing tools for Filipino language education or research.