WenhaoWang/AutoT2VPrompt
WenhaoWang/AutoT2VPrompt is a 7-billion-parameter language model, fine-tuned from Mistral-7B-v0.1 on the VidProM dataset by Wenhao Wang, designed for automatic text-to-video prompt completion. It takes a few words as input and generates multiple full text-to-video prompts. Its primary use case is helping users produce detailed and varied prompts for text-to-video diffusion models, which shortens the prompt-writing step of the creative process.
AutoT2VPrompt: Automatic Text-to-Video Prompt Completion
AutoT2VPrompt is the first model specifically designed for automatically completing text-to-video prompts. Developed by Wenhao Wang and fine-tuned from Mistral-7B-v0.1, this 7-billion parameter model streamlines the process of generating detailed video prompts from minimal input.
Key Capabilities
- Automatic Prompt Generation: Given a short text input (e.g., "An underwater world"), the model expands it into multiple comprehensive text-to-video prompts.
- Diverse Outputs: Capable of generating a specified number of different prompts (e.g., 10) from the same input, offering users a variety of creative options.
- Customizable Generation: Parameters like max_length, temperature, top_k, and num_return_sequences can be adjusted to control the length, randomness, and quantity of generated prompts.
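The capabilities above can be sketched with the Hugging Face transformers text-generation pipeline. This is a minimal illustration, not the author's reference code: the helper names generation_kwargs and complete_prompt are hypothetical, and the default values for max_length, temperature, and top_k are assumptions chosen for the example. Only the input "An underwater world" and the count of 10 prompts come from the description above.

```python
def generation_kwargs(num_prompts=10, max_length=100, temperature=0.7, top_k=50):
    """Collect the sampling settings named above into one dict.

    The defaults for max_length, temperature, and top_k are illustrative
    assumptions, not values documented for this model.
    """
    if num_prompts < 1:
        raise ValueError("num_prompts must be at least 1")
    return {
        "max_length": max_length,             # upper bound on total tokens (input + completion)
        "do_sample": True,                    # sampling is required to get distinct sequences
        "temperature": temperature,           # higher values give more random completions
        "top_k": top_k,                       # sample only from the k most likely tokens
        "num_return_sequences": num_prompts,  # how many prompts to return per input
    }


def complete_prompt(seed_text, **overrides):
    """Expand a short input into several full text-to-video prompts (hypothetical helper)."""
    # Imported lazily so generation_kwargs can be used without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-generation", model="WenhaoWang/AutoT2VPrompt")
    outputs = pipe(seed_text, **generation_kwargs(**overrides))
    return [item["generated_text"] for item in outputs]


if __name__ == "__main__":
    # Downloads the 7B checkpoint on first run; a GPU is strongly recommended.
    for prompt in complete_prompt("An underwater world", num_prompts=10):
        print(prompt)
```

Raising temperature or top_k should make the returned prompts more varied, while lowering them keeps completions closer to the most likely continuation.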
Good for
- Content Creators: Quickly generating diverse and detailed prompts for text-to-video diffusion models.
- Video Production Workflows: Accelerating the ideation phase for video projects by providing multiple creative directions.
- Experimentation: Exploring various visual concepts for a given textual idea without extensive manual prompt engineering.
This model was fine-tuned on the VidProM dataset using 8 A100 GPUs, which specializes it for text-to-video prompt generation.