Chenyang Qi, 戚晨洋

I am a Research Scientist at Google, working on the Veo project. Currently, I am focusing on real-time, interactive, high-resolution video generation.

Prior to Google, I was a Research Scientist at Tencent, training the diffusion-autoregressive model Hunyuan-image 3.0.

My research lies in multimodal generative AI, especially image and video synthesis. I am interested in building AI systems that simulate our dynamic visual world for creative control and bring real-time, interactive experiences to people.

I received my Ph.D. from the Hong Kong University of Science and Technology, supervised by Prof. Qifeng Chen. In 2020, I received my Bachelor's Degree in Automation from Zhejiang University with a National Scholarship from Chu Kochen Honors College.

Our team is recruiting talented interns for image and video generation!
Drop me an email if you are interested in collaboration or internships!

Email  /  CV  /  Google Scholar  /  Github


  • Feb 2026: Tea-adapter (video control) is accepted by CVPR 2026! See you in Denver!
  • Feb 2026: Follow-Your-Motion (motion transfer) is accepted by ICLR 2026!
  • Sep 2025: Hunyuan-image 3.0 is open-sourced.
  • Feb 2025: One paper about image editing and reasoning is accepted by CVPR 2025.
  • Oct 2024: I received my Ph.D. from HKUST!
  • Sep 2024: One paper on cross-domain image denoising is accepted by NeurIPS 2024.
  • June 2024: SPIRE is accepted by ECCV 2024.
  • July 2023: FateZero is accepted by ICCV 2023 as an Oral presentation.
  • Feb 2023: Two papers are accepted by CVPR 2023!

Research projects and products
Research Scientist, Google, Mountain View
2026 - Present
I am working on the Veo project, a high-resolution video generation model.
Research Scientist, Tencent
2024 - 2025
For the diffusion-autoregressive model Hunyuan-image 3.0, I was responsible for the semantic encoder, text-to-image pre-training, and identity-preserving instruction editing.
Publications

I am fortunate to collaborate with talented students and researchers around the world.
We work on multimodal generative models for image and video synthesis.
Hover your mouse over the image box below to view more results.

Instruction-based Image Editing with Planning, Reasoning, and Generation
Liya Ji, Chenyang Qi, Qifeng Chen

CVPR, 2025

SPIRE: Semantic Prompt-Driven Image Restoration

ECCV, 2024
project page

Adaptive Domain Learning for Cross-domain Image Denoising
Zian Qian, Chenyang Qi, Ka Lung Law, Hao Fu, Chenyang Lei, Qifeng Chen

NeurIPS, 2024

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing
Chenyang Qi, Xiaodong Cun, Yong Zhang,
Chenyang Lei, Xintao Wang, Ying Shan, Qifeng Chen
ICCV Oral, 2023
arxiv / code / project page

Editing videos via a pretrained Stable Diffusion model, without training.
(e.g., replace the jeep with a Porsche; add a Van Gogh style to the sunflower)

Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts
Yue Ma*, Yingqing He*, Hongfa Wang, Andong Wang, Chenyang Qi, Chengfei Cai, Xiu Li, Zhifeng Li, Heung-Yeung Shum, Wei Liu, Qifeng Chen
arXiv, 2024
Project page / arXiv / Github

AnimateZero: Video Diffusion Models are Zero-Shot Image Animators
Jiwen Yu, Xiaodong Cun, Chenyang Qi, Yong Zhang, Xintao Wang, Ying Shan, Jian Zhang

arXiv, 2023
MagicStick🪄: Controllable Video Editing via Control Handle Transformations
Yue Ma, Xiaodong Cun, Yingqing He, Chenyang Qi, Xintao Wang, Ying Shan, Xiu Li, Qifeng Chen

arXiv, 2023
Inserting Anybody in Diffusion Models via Celeb Basis
Ge Yuan, Xiaodong Cun, Yong Zhang, Maomao Li, Chenyang Qi, Xintao Wang, Ying Shan, Huicheng Zheng
NeurIPS, 2023
project page / code

Face identity customization in diffusion model

Real-time 6K Image Rescaling with Rate-distortion Optimization
Chenyang Qi*, Xin Yang*, Ka Leong Cheng, Ying-Cong Chen, Qifeng Chen
CVPR, 2023
arxiv / code

Image upscaling with learnable frequency-domain quantization, achieving real-time 6K speed with strong rate-distortion performance.

MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
Bowen Zhang*, Chenyang Qi*, Pan Zhang, Bo Zhang,
HsiangTao Wu, Dong Chen, Qifeng Chen, Yong Wang, Fang Wen
CVPR, 2023
arxiv / code / project page

Identity-preserving talking head generation utilizing dense landmarks and
spatial-temporal enhancement with GAN priors.
(e.g., make Marilyn Monroe speak, following the motions of another person in the driving video)

Real-time Streaming Video Denoising with Bidirectional Buffers
Chenyang Qi*, Junming Chen*, Xin Yang, Qifeng Chen
ACM Multimedia, 2022
arxiv / code / project page

An extremely efficient (700× speedup) buffer-based framework for online video denoising.

Shape from Polarization for Complex Scenes in the Wild
Chenyang Lei*, Chenyang Qi*, Jiaxin Xie*, Na Fan, Vladlen Koltun, Qifeng Chen
CVPR, 2022
arxiv / code / project page

Scene-level normal estimation from a single polarization image using physics-based priors.

Internships
Research Intern, Adobe Research and FireFly, San Jose
Jan, 2024 - April, 2024
with Taesung Park, Jimei Yang and Eli Shechtman
Student Researcher, Google Research, Mountain View
July, 2023 - November, 2023
with Zhengzhong Tu, Keren Ye, Mauricio Delbracio, Hossein Talebi and Peyman Milanfar
Research Intern, Tencent AI Lab, Shenzhen
Jan, 2023 - June, 2023
with Xiaodong Cun, Yong Zhang, Xintao Wang, and Ying Shan
Research Intern, Microsoft Research Asia, Beijing
June, 2022 - December, 2022
with Bo Zhang, Dong Chen, and Fang Wen
Services
  • Reviewer: CVPR, ICCV, ECCV, NeurIPS, ICLR, IJCAI, ACM MM
Honors and Awards
  • Oral presentation in ICCV 2023
  • National Scholarship, 2018 (Chu Kochen Honors College, Zhejiang University)

Thanks to Dr. Jon Barron for sharing the source code of his personal page.