cuong1692001/gemma-3-4b-it_low

Vision · Concurrency Cost: 1 · Model Size: 4.3B · Quant: BF16 · Context Length: 32k · Published: Mar 17, 2026 · License: other · Architecture: Transformer

The cuong1692001/gemma-3-4b-it_low model is a 4.3-billion-parameter instruction-tuned variant of Google's gemma-3-4b-it, fine-tuned by cuong1692001 on a dataset named 'gemma-3-4b-it_low'. It is designed for general language understanding and generation tasks, building on the capabilities of the Gemma architecture.


Model Overview

This model, gemma-3-4b-it_low, is a fine-tuned iteration of the google/gemma-3-4b-it base model, developed by cuong1692001. It utilizes the Gemma architecture, a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models.

Key Characteristics

  • Base Model: Google's Gemma-3-4b-it, a 4.3 billion parameter instruction-tuned model.
  • Fine-tuning: Adapted using a specific dataset named gemma-3-4b-it_low.
  • Training Hyperparameters: Trained with a learning rate of 1.25e-06, a batch size of 1, and 5 epochs, utilizing a cosine learning rate scheduler.
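The training script itself is not published, so the following is only an illustrative sketch of the reported hyperparameters. The constant names and the `cosine_lr` helper are hypothetical; the function just shows what a standard cosine learning-rate schedule (decaying from the peak rate toward zero) looks like for these settings.

```python
import math

# Reported fine-tuning hyperparameters (names are illustrative, not from
# the actual training script, which is not published).
LEARNING_RATE = 1.25e-06
BATCH_SIZE = 1
NUM_EPOCHS = 5

def cosine_lr(step: int, total_steps: int,
              peak_lr: float = LEARNING_RATE, min_lr: float = 0.0) -> float:
    """Standard cosine decay from peak_lr at step 0 to min_lr at total_steps."""
    progress = step / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))
```

With this schedule the learning rate starts at the full 1.25e-06, passes through half that value at the midpoint of training, and decays smoothly to zero by the final step.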

Intended Use Cases

Given its instruction-tuned nature and fine-tuning on a specific dataset, this model is likely suitable for:

  • General-purpose conversational AI.
  • Text generation and summarization tasks.
  • Instruction following in various language-based applications.
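For the instruction-following use cases above, prompts generally need to follow the base model's chat format. A minimal prompt-construction sketch, assuming this fine-tune keeps the standard Gemma turn-marker format (this is an assumption; verify against the model's tokenizer chat template before relying on it):

```python
def build_gemma_prompt(user_message: str) -> str:
    """Wrap a single user message in Gemma-style turn delimiters.

    Assumes the base Gemma instruction format is unchanged by the
    fine-tune; check the tokenizer's chat template to confirm.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = build_gemma_prompt("Summarize this paragraph in one sentence.")
```

In practice you would pass `prompt` to the model's text-generation endpoint, or simply use the tokenizer's built-in chat template, which applies the same wrapping automatically.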

Limitations

As with many fine-tuned models, its performance and specific capabilities are highly dependent on the gemma-3-4b-it_low dataset used for training. Users should evaluate its performance on their specific tasks to understand its strengths and potential limitations.