jobenriquez/FiLLM-POSDEPSUM

Public · 8.5B params · FP8 · 8192 · Hugging Face

FiLLM-POSDEPSUM: Filipino-Optimized NLP Model

FiLLM-POSDEPSUM is an 8.5-billion-parameter model in the FiLLM (Filipino-optimized Large Language Model) family, developed by Isaiah Job Cuenca Enriquez, Carlos Jude Maminta, and Deandre Nigel Corpuz Nuñez. It is fine-tuned from SeaLLM-7B-v2.5, which is itself based on Gemma 7B, using Low-Rank Adaptation (LoRA) for memory-efficient fine-tuning. The model targets three core Natural Language Processing (NLP) tasks in the Filipino language.
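LoRA, mentioned above, freezes the pretrained weights and learns only a low-rank update on top of them. A minimal NumPy sketch of the idea (the dimensions, rank, and scaling below are illustrative, not the model's actual configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pretrained weight matrix (illustrative shape, not the real model's).
d_out, d_in, r = 64, 128, 8          # r is the LoRA rank, with r << min(d_out, d_in)
W = rng.standard_normal((d_out, d_in))

# Trainable low-rank factors: B starts at zero, so training begins exactly at W.
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))
alpha = 16                            # LoRA scaling hyperparameter

def adapted_forward(x):
    """Forward pass with the low-rank update W + (alpha / r) * B @ A applied."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B = 0, the adapted output equals the frozen model's output.
assert np.allclose(adapted_forward(x), W @ x)
```

The memory saving comes from training only A and B: `d_out * r + r * d_in` parameters per adapted matrix instead of `d_out * d_in`.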

Key Capabilities

  • Part-of-Speech (POS) Tagging: Identifies and labels the grammatical category of words in Filipino text.
  • Dependency Parsing: Analyzes the grammatical structure of sentences by showing relationships between words.
  • Text Summarization: Generates concise summaries of Filipino text.
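As a sketch of how POS-tagging output might be consumed downstream, assuming the model emits `word/TAG` pairs (a hypothetical output format chosen for illustration, not one documented by the authors):

```python
def parse_pos_output(text: str) -> list[tuple[str, str]]:
    """Parse 'word/TAG' pairs from a line of model output into (word, tag) tuples.

    Assumes a hypothetical 'word/TAG' format; adapt this to the model's
    actual output before using it in a pipeline.
    """
    pairs = []
    for token in text.split():
        word, sep, tag = token.rpartition("/")
        if not sep:  # no tag attached; keep the raw token with an unknown tag
            word, tag = tag, "X"
        pairs.append((word, tag))
    return pairs

# Example: a tagged Filipino sentence ("The child is reading a book.")
tagged = "Nagbabasa/VERB ang/DET bata/NOUN ng/ADP aklat/NOUN ./PUNCT"
print(parse_pos_output(tagged))
```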

Training and Evaluation

The model was trained and evaluated on diverse Filipino datasets to ensure robust performance across its specialized tasks. A related model, FiLLM-NER, handles Named Entity Recognition and is available separately. Users should be aware of potential hallucinations, especially when prompts do not end with a period. For more details, refer to the associated research paper: FiLLM - A Filipino-optimized Large Language Model based on Southeast Asia Large Language Model (SEALLM).
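Given the caveat above about prompts lacking a trailing period, a small prompt-hygiene helper (a hypothetical convenience function, not part of the released model) can normalize prompts before inference:

```python
def ensure_terminal_period(prompt: str) -> str:
    """Append a period if the prompt does not already end with terminal punctuation.

    The model card notes hallucinations are more likely when a prompt lacks a
    trailing period, so we normalize prompts before sending them to the model.
    """
    prompt = prompt.rstrip()
    if prompt and prompt[-1] not in ".!?":
        prompt += "."
    return prompt

print(ensure_terminal_period("Ibuod ang sumusunod na teksto"))
# -> "Ibuod ang sumusunod na teksto."
```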

Good For

  • Applications requiring grammatical analysis of Filipino text.
  • Automated summarization of documents or articles in Filipino.
  • Developing tools for Filipino language education or research.