Name: THChou1220/gemma-4-e4b-kinetics54K_FFT API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: THChou1220

Model Overview

THChou1220/gemma-4-e4b-kinetics54K_FFT is a 7.9 billion parameter model based on google/gemma-4-e4b-it. It was fully fine-tuned (all weights updated, rather than trained through LoRA adapters) on a dataset of AI-generated video data. Training aimed to improve video understanding and instruction following.

Key Training Details

Dataset: Trained on bear7011/gemma-4-e4b-kinetics_54K, comprising 54,618 video instruction examples.
Methodology: Full fine-tuning in bfloat16 precision for 1 epoch, totaling 1,366 global steps.
Hardware & Optimization: Trained on 4 GPUs using DeepSpeed ZeRO-3, with optimizer state and parameters offloaded to CPU to reduce GPU memory use.
Configuration: AdamW optimizer with a learning rate of 5e-6, applied to the main model as well as the projector and image encoder components. Gradient checkpointing was enabled, and the maximum sequence length was 3072.
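
The settings above can be sketched as a DeepSpeed configuration, together with a quick consistency check on the step count. This is a minimal illustration, not the author's actual training configuration: the dictionary uses standard DeepSpeed config keys, but the `pin_memory` choices, the `steps_per_epoch` helper, and the assumption that no examples were dropped or filtered are all hypothetical. Gradient checkpointing and the 3072 sequence length are normally set in the training script rather than in this file, so they are omitted here.

```python
import json
import math

# Hypothetical DeepSpeed config mirroring the listed settings
# (ZeRO-3, CPU offload of optimizer state and parameters, bf16, AdamW at 5e-6).
# The real config used for this model is not published in the listing.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},  # pin_memory is an assumption
        "offload_param": {"device": "cpu", "pin_memory": True},      # pin_memory is an assumption
    },
    "optimizer": {"type": "AdamW", "params": {"lr": 5e-6}},
}

# Figures taken from the training details above.
NUM_EXAMPLES = 54_618
GLOBAL_STEPS = 1_366
NUM_GPUS = 4


def steps_per_epoch(num_examples: int, effective_batch: int) -> int:
    """Optimizer steps in one epoch, assuming the last partial batch is kept."""
    return math.ceil(num_examples / effective_batch)


# Effective (global) batch sizes that reproduce the reported step count.
candidates = [
    b for b in range(1, 257)
    if steps_per_epoch(NUM_EXAMPLES, b) == GLOBAL_STEPS
]

print(json.dumps(ds_config, indent=2))
print(candidates)  # [40]
```

Under these assumptions, the only effective batch size consistent with 1,366 steps over 54,618 examples is 40, which would correspond to 10 examples per GPU per optimizer step across 4 GPUs (for instance, through a combination of per-device batch size and gradient accumulation). The listing does not state the batch size, so treat this as an inference rather than a documented value.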

Primary Use Case

This model is intended for applications that process and respond to instructions about video content. Because it was fine-tuned on video-derived instruction data, candidate tasks include video analysis, recognizing actions in videos, and generating text from video prompts.
