WenhaoWang/AutoT2VPrompt

7B
FP8
4096
License: cc-by-nc-4.0

AutoT2VPrompt: Automatic Text-to-Video Prompt Completion

AutoT2VPrompt is presented as the first model designed specifically to complete text-to-video prompts automatically. Developed by Wenhao Wang and fine-tuned from Mistral-7B-v0.1, this 7-billion-parameter model expands a minimal input into detailed video prompts.

Key Capabilities

  • Automatic Prompt Generation: Given a short text input (e.g., "An underwater world"), the model expands it into multiple comprehensive text-to-video prompts.
  • Diverse Outputs: Capable of generating a specified number of different prompts (e.g., 10) from the same input, offering users a variety of creative options.
  • Customizable Generation: Parameters like max_length, temperature, top_k, and num_return_sequences can be adjusted to control the length, randomness, and quantity of generated prompts.
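
The capabilities above can be sketched with the Hugging Face transformers text-generation pipeline. This is a minimal sketch, not the author's reference code: the helper names (build_generation_kwargs, complete_prompt, load_generator) are hypothetical, the default parameter values are illustrative assumptions, and loading the model this way assumes the repository is served as a standard causal language model.

```python
from typing import Any, Callable, Dict, List

# Repository name as listed at the top of this page.
MODEL_ID = "WenhaoWang/AutoT2VPrompt"


def build_generation_kwargs(num_prompts: int = 10,
                            max_length: int = 77,
                            temperature: float = 0.9,
                            top_k: int = 8) -> Dict[str, Any]:
    """Collect the sampling parameters named in the list above.

    The default values are illustrative assumptions, not documented settings.
    """
    if num_prompts < 1:
        raise ValueError("num_prompts must be at least 1")
    return {
        "max_length": max_length,            # upper bound on tokens (input + completion)
        "temperature": temperature,          # higher values give more random output
        "top_k": top_k,                      # sample only from the k most likely tokens
        "num_return_sequences": num_prompts, # how many prompts to return
        "do_sample": True,                   # sampling is needed for several distinct outputs
    }


def complete_prompt(generator: Callable[..., List[Dict[str, str]]],
                    seed_text: str,
                    **overrides: Any) -> List[str]:
    """Expand a short seed text into several text-to-video prompts.

    `generator` is any callable with the text-generation pipeline interface:
    it takes the input text plus keyword arguments and returns a list of
    dicts that each carry a "generated_text" entry.
    """
    outputs = generator(seed_text, **build_generation_kwargs(**overrides))
    return [item["generated_text"].strip() for item in outputs]


def load_generator() -> Callable[..., List[Dict[str, str]]]:
    """Build the real pipeline (hypothetical helper).

    The import is deferred because loading a 7B model is slow and
    typically needs a GPU with enough memory.
    """
    from transformers import pipeline
    return pipeline("text-generation", model=MODEL_ID)
```

Calling complete_prompt(load_generator(), "An underwater world", num_prompts=3) would then return three sampled completions of the seed text. Keeping the generator as a parameter means the parameter handling can be exercised with a stand-in callable before the full model is downloaded.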

Good for

  • Content Creators: Quickly generating diverse and detailed prompts for text-to-video diffusion models.
  • Video Production Workflows: Accelerating the ideation phase for video projects by providing multiple creative directions.
  • Experimentation: Exploring various visual concepts for a given textual idea without extensive manual prompt engineering.

This model was fine-tuned on the VidProM dataset using 8 A100 GPUs, which specializes it for text-to-video prompt generation.