guaran-ia/coreguapa-lm

Text Generation
Concurrency Cost: 1
Model Size: 9B
Quant: FP8
Ctx Length: 16k
Published: Jun 10, 2026
License: gpl-3.0
Architecture: Transformer
Open Weights
Cold

guaran-ia/coreguapa-lm is a 9-billion-parameter, Gemma2-based causal language model developed by guaran-ia and fine-tuned on the restricted COREGUAPA corpus of high-quality Guarani text. Its primary purpose is to compute perplexity scores for Guarani documents, which indicate text quality and similarity to the reference corpus. The model is designed for Guarani text validation and is not intended for generative tasks.


CoreguapaLM: Guarani Text Quality Validation Model

CoreguapaLM is a 9-billion-parameter model based on the Gemma2ForCausalLM architecture, fine-tuned by guaran-ia on the proprietary, high-quality COREGUAPA corpus. Unlike typical generative LLMs, its core function is to validate the quality of Guarani text by computing perplexity scores. Perplexity is the exponential of the average negative log-probability the model assigns to each token, so a lower score means the text is more predictable to the model and aligns more closely with the high-quality reference corpus it was fine-tuned on.
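To make the scoring concrete, here is a minimal sketch of how a perplexity value follows from per-token log-probabilities. It assumes the log-probabilities (natural log) have already been obtained from the model; the function name `perplexity` and the sample values are hypothetical and only illustrate the arithmetic, not the model's actual outputs.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Return exp(-mean(log p)) for a sequence of natural-log token probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical values: a predictable text and a surprising one.
predictable = [-0.1, -0.2, -0.1, -0.3]
surprising = [-2.5, -3.0, -1.8, -2.2]

# The more predictable text gets the lower perplexity.
print(perplexity(predictable) < perplexity(surprising))
```

Because the score is an average over tokens, documents of different lengths can be compared on the same scale.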

Key Capabilities

  • Guarani Text Perplexity Computation: Scores Guarani documents by perplexity, a measure of how predictable the text is to the model, as a proxy for text quality.
  • Foundation in High-Quality Data: Fine-tuned exclusively on a manually curated, restricted corpus of high-quality Guarani materials.
  • Non-Generative Design: Explicitly developed for analytical tasks, not for generating new text.
  • Architecture: Uses a Gemma2ForCausalLM base with 42 layers and a vocabulary size of 256,000, supporting a maximum context length of 8192 tokens.
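Documents longer than the maximum context length cannot be scored in a single pass. One common workaround, sketched below, is to score overlapping windows and count each token only once. This is an assumption about usage, not a procedure described by the model authors; `window_spans`, its `stride` default, and the tuple layout are hypothetical, and the 8192 limit is taken from the architecture bullet above.

```python
from typing import List, Tuple


def window_spans(
    n_tokens: int, max_len: int = 8192, stride: int = 4096
) -> List[Tuple[int, int, int]]:
    """Plan overlapping windows over a tokenized document.

    Each tuple is (start, end, score_from): the model sees tokens
    [start, end), but only tokens [score_from, end) contribute to the
    perplexity, so every token is scored exactly once.
    """
    if n_tokens < 0 or max_len <= 0 or not 0 < stride <= max_len:
        raise ValueError("invalid window parameters")
    spans = []
    prev_end = 0
    for start in range(0, n_tokens, stride):
        end = min(start + max_len, n_tokens)
        spans.append((start, end, prev_end))
        prev_end = end
        if end == n_tokens:
            break
    return spans


# Small numbers for readability: 10 tokens, windows of 4, stride of 2.
for span in window_spans(10, max_len=4, stride=2):
    print(span)
```

A smaller stride gives each scored token more left context at the cost of more forward passes.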

Good For

  • Linguistic Quality Assurance: Ideal for researchers, linguists, or developers needing to evaluate the quality and authenticity of Guarani text.
  • Corpus Analysis: Useful for identifying Guarani text segments that align with a high-quality linguistic standard.
  • Specialized Guarani Applications: Suited for applications where text validation and adherence to a specific linguistic standard are critical, rather than creative generation.
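For corpus analysis, perplexity scores are typically turned into a keep-or-discard decision with a cutoff. The sketch below assumes scores have already been computed per document; the function name `filter_by_perplexity`, the document identifiers, and the cutoff of 40.0 are all hypothetical, and a suitable threshold would need to be calibrated on known-good Guarani text.

```python
from typing import Dict, List


def filter_by_perplexity(scores: Dict[str, float], max_ppl: float) -> List[str]:
    """Return ids of documents at or below the cutoff, best (lowest) first."""
    kept = [doc_id for doc_id, ppl in scores.items() if ppl <= max_ppl]
    return sorted(kept, key=lambda doc_id: scores[doc_id])


# Hypothetical document ids and perplexity scores.
scores = {"doc_a": 12.0, "doc_b": 85.0, "doc_c": 30.0}
print(filter_by_perplexity(scores, max_ppl=40.0))
```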