MeiGen-AI/GenEvolve
GenEvolve by MeiGen-AI is an 8-billion-parameter agent policy, fine-tuned from Qwen3-VL-8B-Instruct, designed for self-evolving image generation. It orchestrates tools such as web search, image search, and internal knowledge queries to produce prompt-reference programs for downstream image generators. Because it emits structured inputs rather than images, it can serve complex visual requests across different reference-conditioned image generation models.
GenEvolve: Self-Evolving Image Generation Agent
GenEvolve is an 8-billion-parameter agent policy built on the Qwen3-VL-8B-Instruct backbone. It does not generate images itself. Instead, it acts as an orchestrator: it produces a (gen_prompt, reference_images) program that can drive any reference-conditioned downstream image generator. This lets it draw on external tools and internal knowledge before the generator is ever called.
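To make the (gen_prompt, reference_images) program concrete, here is a minimal sketch of how such a program might be represented and handed to a generator. The names `GenerationProgram`, `Generator`, `run_program`, and `dummy_generator` are hypothetical and chosen for illustration; they are not part of the GenEvolve release, and the real output format may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GenerationProgram:
    """Hypothetical container for the (gen_prompt, reference_images) pair."""
    gen_prompt: str
    reference_images: List[str] = field(default_factory=list)  # paths or URLs


# Assumption: any reference-conditioned generator can be modeled as a callable
# that takes a prompt plus reference images and returns some output handle.
Generator = Callable[[str, List[str]], str]


def run_program(program: GenerationProgram, generator: Generator) -> str:
    """Pass the program to a downstream generator unchanged."""
    return generator(program.gen_prompt, program.reference_images)


def dummy_generator(prompt: str, refs: List[str]) -> str:
    """Stand-in backend; a real one would call an image model here."""
    return f"image({prompt!r}, refs={len(refs)})"


program = GenerationProgram(
    gen_prompt="A red panda reading a newspaper, styled after the reference",
    reference_images=["refs/red_panda.jpg"],
)
result = run_program(program, dummy_generator)
```

Because the policy only emits the program, swapping the generator in this sketch means passing a different callable to `run_program`; the program itself does not change.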
Key Capabilities
- Tool-Orchestrated Trajectories: The agent calls tools such as search, image_search, and query_knowledge (8 distinct generation skills) to gather information before formulating the final image generation program.
- Self-Evolution with Visual Experience Distillation: GenEvolve continuously improves through a self-evolution mechanism that distills best-vs-worst trajectory pairs into the deployed student policy, enhancing performance without requiring runtime memory at inference.
- Generator-Transferable: The same trained GenEvolve policy demonstrates robust performance across different image generators, including open-source options like Qwen-Image-Edit and proprietary models like Nano Banana Pro, showcasing its adaptability.
- Enhanced Knowledge Anchoring: Benchmarks like GenEvolve-Bench and WISE demonstrate GenEvolve's superior ability to anchor generated images to specific knowledge and maintain high quality compared to raw generators and other search-based agents.
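The tool-orchestrated trajectory described in the first capability can be pictured as a loop that alternates between policy decisions and tool calls until the policy emits a final program. The sketch below is illustrative only: the tool names come from the list above, but the stub implementations, the `scripted_policy` stand-in for the model, the `"finish"` action, and the skill name `"composition"` are all assumptions.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool stubs; real tools would query a search backend or skill store.
def search(query: str) -> str:
    return f"text results for {query}"

def image_search(query: str) -> str:
    return f"refs/{query.replace(' ', '_')}.jpg"

def query_knowledge(skill: str) -> str:
    return f"guidance for skill {skill}"

TOOLS: Dict[str, Callable[[str], str]] = {
    "search": search,
    "image_search": image_search,
    "query_knowledge": query_knowledge,
}

Step = Tuple[str, str, str]   # (tool name, argument, observation)
Action = Tuple[str, str]      # (tool name or "finish", argument)


def scripted_policy(request: str, history: List[Step]) -> Action:
    """Stand-in for the model: follows a fixed plan, then finishes."""
    plan = [
        ("search", request),
        ("image_search", request),
        ("query_knowledge", "composition"),  # hypothetical skill name
    ]
    if len(history) < len(plan):
        return plan[len(history)]
    # Final action: the argument is the generation prompt.
    return ("finish", f"{request}, following {history[-1][2]}")


def run_trajectory(request: str, policy, max_steps: int = 8):
    """Run tools until the policy finishes; return (gen_prompt, reference_images)."""
    history: List[Step] = []
    for _ in range(max_steps):
        name, arg = policy(request, history)
        if name == "finish":
            refs = [obs for tool, _, obs in history if tool == "image_search"]
            return arg, refs
        history.append((name, arg, TOOLS[name](arg)))
    raise RuntimeError("policy did not finish within max_steps")


gen_prompt, reference_images = run_trajectory("Eiffel Tower at dusk", scripted_policy)
```

In a real deployment the policy would be the fine-tuned model choosing each action from the request and the accumulated observations, rather than a fixed plan.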
Good For
- Research in Agentic Image Generation: Ideal for exploring tool-using image-generation agents, agentic prompt-program synthesis, and self-distillation techniques.
- Complex Visual Request Fulfillment: Suited for scenarios requiring detailed, knowledge-anchored image generation by leveraging external information and structured prompts.
- Driving Diverse Image Generators: Can be used as a front-end orchestrator for various reference-conditioned image generation models, providing consistent and high-quality inputs.
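For readers interested in the self-distillation angle, the best-vs-worst trajectory pairing mentioned under Key Capabilities might be sketched as follows. This is a guess at the general shape, not the authors' pipeline: the `Trajectory` fields, the scalar `score`, the grouping by request, and the `min_gap` filter are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Trajectory:
    request: str      # the user's visual request
    gen_prompt: str   # final prompt produced by this rollout
    score: float      # hypothetical scalar quality score for the resulting image


def best_worst_pairs(
    trajectories: List[Trajectory], min_gap: float = 0.0
) -> List[Tuple[Trajectory, Trajectory]]:
    """Group rollouts by request and keep the (best, worst) pair of each group.

    Groups whose score gap is not larger than min_gap are skipped, which also
    drops requests that only have a single rollout.
    """
    by_request: Dict[str, List[Trajectory]] = {}
    for t in trajectories:
        by_request.setdefault(t.request, []).append(t)

    pairs = []
    for group in by_request.values():
        best = max(group, key=lambda t: t.score)
        worst = min(group, key=lambda t: t.score)
        if best.score - worst.score > min_gap:
            pairs.append((best, worst))
    return pairs


rollouts = [
    Trajectory("a lighthouse", "prompt A", 0.9),
    Trajectory("a lighthouse", "prompt B", 0.4),
    Trajectory("a lighthouse", "prompt C", 0.6),
    Trajectory("a teapot", "prompt D", 0.7),
]
pairs = best_worst_pairs(rollouts)
```

The resulting pairs could then feed whatever distillation objective is used to update the student policy; that step is outside the scope of this sketch.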