THChou1220/gemma4-e4b-webvid4K_FT

VISION | Concurrency Cost: 1 | Model Size: 7.9B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: May 25, 2026 | Architecture: Transformer | Cold

THChou1220/gemma4-e4b-webvid4K_FT is a 7.9 billion parameter model, a full fine-tune of Google's gemma-4-e4b-it. It is tuned for processing and understanding AI-generated video data, using 3,941 video instruction examples derived from WebVid. It targets tasks that require comprehension of video content, such as video analysis and generation of content from video input.

Model Overview

The model is a full fine-tune of google/gemma-4-e4b-it with 7.9 billion parameters. What distinguishes it from the base model is its specialized training on AI-generated video data, which targets understanding and following video-related instructions.

Key Training Details

The model was fully fine-tuned, without LoRA, to adapt it to video content. Key aspects of its training include:

  • Dataset: Trained on bear7011/gemma-4-e4b-webvid-4K, comprising 3,941 unique video instruction examples.
  • Precision: Training was conducted using bfloat16 precision.
  • Hardware: 4 GPUs with DeepSpeed ZeRO-3, with optimizer state and parameters offloaded to CPU.
  • Epochs & Steps: Trained for 1 epoch, completing 124 global steps.
  • Optimization: AdamW optimizer with a learning rate of 5e-6 and a cosine LR scheduler.
  • Sequence Length: Training used a maximum sequence length of 2304 tokens, separate from the 32k context length listed in the header.
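
The training details above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's actual training script: the `"auto"` values assume the Hugging Face Trainer's DeepSpeed integration, `pin_memory` is an assumed setting, the cosine schedule assumes no warmup and decay to zero (neither is stated in the card), and the helper names `consistent_global_batch_sizes` and `cosine_lr` are hypothetical. The batch-size helper only checks which global batch sizes are arithmetically consistent with 3,941 examples and 124 steps in one epoch.

```python
import json
import math

# Figures stated in the model card.
NUM_EXAMPLES = 3941
GLOBAL_STEPS = 124
NUM_GPUS = 4
PEAK_LR = 5e-6

# DeepSpeed ZeRO-3 with optimizer state and parameters offloaded to CPU.
# Assumptions: pin_memory=True, and "auto" values that are resolved by the
# Hugging Face Trainer integration rather than by DeepSpeed on its own.
ds_config = {
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
    },
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
}


def consistent_global_batch_sizes(num_examples, steps, max_batch=512):
    """Global batch sizes for which one epoch takes exactly `steps` updates.

    Assumes the final partial batch is kept (ceiling division).
    """
    return [
        b for b in range(1, max_batch + 1)
        if math.ceil(num_examples / b) == steps
    ]


def cosine_lr(step, total_steps=GLOBAL_STEPS, peak=PEAK_LR):
    """Cosine decay from `peak` to zero, assuming no warmup phase."""
    return 0.5 * peak * (1.0 + math.cos(math.pi * step / total_steps))


if __name__ == "__main__":
    print(json.dumps(ds_config, indent=2))
    print(consistent_global_batch_sizes(NUM_EXAMPLES, GLOBAL_STEPS))
    print(cosine_lr(0), cosine_lr(GLOBAL_STEPS // 2), cosine_lr(GLOBAL_STEPS))
```

Under the ceiling-division assumption, the only global batch size consistent with the reported figures is 32, which would correspond to 8 samples per GPU per optimizer step across 4 GPUs. How that splits between micro-batch size and gradient accumulation is not stated in the card.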

Primary Use Case

This model is intended for applications that need to understand video data, particularly AI-generated content, and to generate output based on it. Given its fine-tuning on video instruction examples, candidate tasks include video summarization, content generation from video prompts, and other video-centric applications.