luispoveda93/Gala-4-E4B-it-preview

VISIONConcurrency Cost:1Model Size:7.9BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Jun 21, 2026Architecture:Transformer Cold

Gala-4-E4B-it-preview by luispoveda93 is a 4 billion parameter Gemma-4-E4B-it variant, specifically fine-tuned for Catalan language tasks. This model excels in Catalan question answering, natural language inference, and instruction following, leveraging extensive Catalan datasets like Projecte AINA and Nobel. Designed for efficiency, it offers competitive performance at significantly reduced computational cost compared to larger models, making it suitable for consumer hardware and edge deployments.

Loading preview...

Gala-4-E4B-it-preview: An Efficient Catalan LLM

Gala-4-E4B-it-preview is a 4 billion parameter language model developed by luispoveda93, based on Google's Gemma-4-E4B-it. It has been extensively fine-tuned on comprehensive Catalan datasets, including Projecte AINA and Nobel, using LoRA for efficient adaptation. The model demonstrates strong capabilities across various Catalan NLP tasks, including question answering, natural language inference, and instruction following.

Key Capabilities & Performance

  • Catalan Language Proficiency: Optimized for a wide range of Catalan NLP tasks, evaluated across 14 benchmarks including reasoning, commonsense, causality, NLI, grammar, and paraphrase.
  • Efficiency-First Design: Achieves an Overall NPM of 36.71, delivering approximately 70-80% of the performance of larger 7B models like Salamandra-7B, but at roughly half the compute cost.
  • Resource-Friendly: Designed for deployment on consumer hardware (e.g., RTX 3060 8GB), offering ~2x faster inference speed and ~50% cheaper cloud deployment costs compared to larger alternatives.
  • Robust Fine-tuning: Trained for 10 epochs on high-quality Catalan datasets, with a context window of 8,192 tokens.

Ideal Use Cases

  • Catalan-specific Applications: Excellent for question answering, instruction following, and text generation in Catalan.
  • Cost-Sensitive Deployments: Suited for scenarios requiring efficient inference on consumer GPUs, edge devices, or low-cost cloud infrastructure.
  • Educational & Research: Valuable for developing and experimenting with Catalan language models and NLP research.