Name: davron04/gemma-3-270m-uzen-base API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: davron04

Model Overview

davron04/gemma-3-270m-uzen-base is a fine-tuned variant of the Gemma-3 270M base model, developed by davron04. It was further trained on an unspecified dataset and reports a loss of 2.1987 and a perplexity of 9.0416 on its evaluation set. Training used a learning rate of 2e-05 and a total batch size of 256, and ran for 1 epoch with mixed-precision training.

Key Training Details

Base Model: Gemma-3 270M
Learning Rate: 2e-05
Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
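LR Scheduler: Inverse square root, with warmup reported as 0.01 (a fractional value, so presumably a ratio of total training steps rather than a step count)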
Epochs: 1
Batch Size: 2 (per device), 256 (total effective batch size)
Mixed Precision: Native AMP enabled
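
The settings above imply some arithmetic that the listing does not spell out. The following is a minimal sketch, not the author's training code. It derives the factor between the per-device and effective batch sizes and evaluates one common form of an inverse-square-root schedule (linear warmup, then decay proportional to 1/sqrt(step)). The function names and the warmup length of 100 steps are hypothetical; the listing only reports a warmup value of 0.01 and does not give the number of devices, gradient-accumulation steps, or total steps.

```python
import math

PER_DEVICE_BATCH = 2   # from the listing
EFFECTIVE_BATCH = 256  # from the listing
BASE_LR = 2e-05        # from the listing


def accumulation_factor(per_device: int, effective: int) -> int:
    """Per-device batches that make up one optimizer step.

    Equals (number of devices) x (gradient-accumulation steps); the
    listing does not say how the factor is split between the two.
    """
    if per_device < 1 or effective % per_device != 0:
        raise ValueError("effective batch must be a multiple of per-device batch")
    return effective // per_device


def inverse_sqrt_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Linear warmup to base_lr, then decay proportional to 1/sqrt(step).

    Assumed schedule shape; the exact variant used in training is not
    documented.
    """
    if warmup_steps < 1:
        raise ValueError("warmup_steps must be at least 1")
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * math.sqrt(warmup_steps / step)


print(accumulation_factor(PER_DEVICE_BATCH, EFFECTIVE_BATCH))  # 128

# Hypothetical warmup length, chosen only for illustration.
WARMUP_STEPS = 100
for step in (0, 50, 100, 400):
    print(step, inverse_sqrt_lr(step, BASE_LR, WARMUP_STEPS))
```

With these assumed values the rate rises linearly to 2e-05 at step 100 and has halved again by step 400, since sqrt(100/400) = 0.5.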

Performance Metrics

The model reports a final validation loss of 2.1987 and a perplexity of 9.0416 on its evaluation set. Both measure how well it predicts held-out text, and lower is better for each.
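
Perplexity is conventionally the exponential of the mean per-token cross-entropy loss, measured in nats. The sketch below illustrates that standard conversion; it is not the evaluation code behind this model, and the function names are hypothetical. Applied to the reported loss it gives roughly 9.01 rather than the reported 9.0416, and the source does not explain the small gap.

```python
import math


def perplexity_from_loss(mean_nll: float) -> float:
    """Perplexity for a mean per-token negative log-likelihood in nats."""
    return math.exp(mean_nll)


def loss_from_perplexity(perplexity: float) -> float:
    """Inverse conversion: mean loss in nats implied by a perplexity."""
    return math.log(perplexity)


REPORTED_LOSS = 2.1987        # from the listing
REPORTED_PERPLEXITY = 9.0416  # from the listing

print(round(perplexity_from_loss(REPORTED_LOSS), 2))        # 9.01
print(round(loss_from_perplexity(REPORTED_PERPLEXITY), 3))  # 2.202
```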

Intended Uses & Limitations

Specific intended uses and limitations are not documented in the provided information. As a fine-tuned base model, it is generally suitable for language understanding and generation tasks, and it can serve as a starting point for further domain-specific adaptation or research.
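
For readers who want to try the model as a starting point, the following is a minimal sketch of plain text completion with the Hugging Face `transformers` library. It rests on assumptions the listing does not confirm: that the weights are published on the Hugging Face Hub under the same identifier, that they load through `AutoModelForCausalLM`, and that the installed `transformers` and PyTorch versions support Gemma 3. The helper name `complete` and the default of 40 new tokens are hypothetical.

```python
MODEL_ID = "davron04/gemma-3-270m-uzen-base"  # identifier from this page


def complete(prompt: str, max_new_tokens: int = 40) -> str:
    """Continue `prompt` with the model (plain completion, no chat template)."""
    # Imported lazily so the module can be loaded without the dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository is reachable on the Hugging Face Hub.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `complete("some prompt text")` downloads the weights on first use. Because this is a base model, the output is a continuation of the prompt rather than an answer to an instruction.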
