Pattern & Recognition Lab

Chenyang Si

Associate Professor
School of Intelligence Science and Technology
Nanjing University

Nanjing University Suzhou Campus, No. 1520 Taihu Avenue, Huqiu District, Suzhou, Jiangsu
50+
Papers
7K+
Citations
5+
Members
Chenyang Si

About

Chenyang Si is a Tenure-Track Associate Professor at the School of Intelligence Science and Technology, Nanjing University. Prior to this, he was a Research Fellow at Nanyang Technological University (NTU), Singapore, working with Prof. Ziwei Liu, and before that a Research Scientist at the Sea AI Lab of Sea Group. He received his Ph.D. in 2021 from the Institute of Automation, Chinese Academy of Sciences (CASIA), supervised by Prof. Tieniu Tan and co-supervised by Prof. Liang Wang and Prof. Wei Wang.

His research interests span visual understanding and generation, including fundamental architectures for computer vision, video understanding, generative models, video and image generation, as well as acceleration and optimization of generative models.

Research Interests

Video Generation · Diffusion Models · Visual Understanding · Efficient AI · Image Editing & Morphing · Evaluation Benchmarks
We are actively recruiting! Openings for 2027 Ph.D. / Master's students and undergraduate / graduate Research Assistants.
Join Us

Academic Services

Conference Area Chair
BMVC 2024 · BMVC 2025 · CVPR 2026
Conference Reviewer
CVPR ICCV NeurIPS ECCV AAAI ICLR
Journal Reviewer
TPAMI TIP TMM TCSVT IJCV

Research Directions

Video Generation

We study generative models for high-quality and controllable video synthesis, including diffusion-based video models, consistency models, and efficient video generation architectures.

Generative Models

We explore diffusion models and AIGC for image and multimodal generation, covering image editing, morphing, controllable generation, and 3D-aware content creation.

Visual Understanding

We develop fundamental vision architectures (e.g., MetaFormer) and methods for action recognition, skeleton-based understanding, and vision-language navigation.

Efficient AI

We investigate training-free and training-based acceleration methods for large generative models, reducing inference cost while maintaining generation quality.

Evaluation & Benchmarks

We design comprehensive evaluation benchmarks for video generative models (e.g., VBench) to enable systematic and fair assessment of generation quality.

Recent News

2025.07
DCM (Dual-Expert Consistency Model for Efficient Video Generation) and TACA (Cross-Modal Interaction in Multimodal Diffusion Transformers) are accepted by ICCV 2025. Congratulations!
2025.07
FreeMorph (Tuning-Free Generalized Image Morphing with Diffusion Model) is accepted by ICCV 2025. Congratulations!
2025.05
DaS (Diffusion as Shader: 3D-aware Video Diffusion) is accepted by SIGGRAPH 2025 Conference Track and selected in the SIGGRAPH Video Trailer!
2025.01
FasterCache (Training-Free Video Diffusion Model Acceleration) is accepted by ICLR 2025.
2025.01
RepVideo (Rethinking Cross-Layer Representation for Video Generation) is released on arXiv.
2024.09
MAN (Momentum Auxiliary Network for Supervised Local Learning) is accepted by ECCV 2024 as an Oral presentation. Congratulations!
2024.07
FreeInit, HPFF are accepted by ECCV 2024. Congratulations!
2024.02
FreeU is accepted by CVPR 2024 as an Oral presentation, and VBench is accepted as a Highlight. Congratulations!
2024.01
PRLab is established at the School of Intelligence Science and Technology, Nanjing University. Welcome to join us!

Lab Members

Chenyang Si

Principal Investigator · Associate Professor

Prof. Chenyang Si is a Tenure-Track Associate Professor at the School of Intelligence Science and Technology, Nanjing University. He was a Research Fellow at NTU and a Research Scientist at Sea AI Lab. He received his Ph.D. from CASIA in 2021. His research interests span visual understanding and generation, including video generation, diffusion models, and generative model acceleration.

Video Generation · Diffusion Models · Visual Understanding · Efficient AI
Former Members

Publications

RepVideo
RepVideo: Rethinking Cross-Layer Representation for Video Generation
Chenyang Si, Weichen Fan, Zhengyao Lv, Ziqi Huang, Yu Qiao, Ziwei Liu
arXiv 2025
Vchitect-2.0
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models
Weichen Fan, Chenyang Si, Junhao Song, Zhenyu Yang, Yinan He, Long Zhuo, Ziqi Huang, Ziyue Dong, Jingwen He, Dongwei Pan, Yi Wang, Yuming Jiang, Yaohui Wang, Peng Gao, Xinyuan Chen, Hengjie Li, Dahua Lin, Yu Qiao, Ziwei Liu
arXiv 2025
GOOD
GOOD: Training-Free Guided Diffusion Sampling for Out-of-Distribution Detection
Xin Gao, Jiyao Liu, Guanghao Li, Yueming Lyu, Jianxiong Gao, Weichen Yu, Ningsheng Xu, Liang Wang, Caifeng Shan, Ziwei Liu, Chenyang Si
NeurIPS 2025
DCM
DCM: Dual-Expert Consistency Model for Efficient and High-Quality Video Generation
Zhengyao Lv, Chenyang Si, Tianlin Pan, Zhaoxi Chen, Kwan-Yee K. Wong, Yu Qiao, Ziwei Liu
ICCV 2025
TACA
TACA: Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
Zhengyao Lv, Tianlin Pan, Chenyang Si, Zhaoxi Chen, Wangmeng Zuo, Ziwei Liu, Kwan-Yee K. Wong
ICCV 2025
FreeMorph
FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model
Yukang Cao, Chenyang Si, Jinghao Wang, Ziwei Liu
ICCV 2025
DaS
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control
Zekai Gu, Rui Yan, Jiahao Lu, Peng Li, Zhiyang Dou, Chenyang Si, Zhen Dong, Qifeng Liu, Cheng Lin, Ziwei Liu, Wenping Wang, Yuan Liu
SIGGRAPH 2025 Video Trailer
VBench++
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models
Ziqi Huang, Fan Zhang, Xiaojie Xu, Yinan He, Jiashuo Yu, Ziyue Dong, Qianli Ma, Nattapol Chanpaisit, Chenyang Si, Yuming Jiang, Yaohui Wang, Xinyuan Chen, Ying-Cong Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu
arXiv 2025
FasterCache
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lv, Chenyang Si, Junhao Song, Zhenyu Yang, Yu Qiao, Ziwei Liu, Kwan-Yee K. Wong
ICLR 2025
FreeInit
FreeInit: Bridging Initialization Gap in Video Diffusion Models
Tianxing Wu, Chenyang Si, Yuming Jiang, Ziqi Huang, Ziwei Liu
ECCV 2024
MAN
Momentum Auxiliary Network for Supervised Local Learning
Junhao Su, Changpeng Cai, Feiyu Zhu, Chenghao He, Xiaojie Xu, Dongzhi Guan, Chenyang Si
ECCV 2024 Oral
HPFF
HPFF: Hierarchical Locally Supervised Learning with Patch Feature Fusion
Junhao Su, Chenghao He, Feiyu Zhu, Xiaojie Xu, Dongzhi Guan, Chenyang Si
ECCV 2024
FreeU
FreeU: Free Lunch in Diffusion U-Net
Chenyang Si, Ziqi Huang, Yuming Jiang, Ziwei Liu
CVPR 2024 Oral
VBench
VBench: Comprehensive Benchmark Suite for Video Generative Models
Ziqi Huang, Yinan He, Jiashuo Yu, Fan Zhang, Chenyang Si, Yuming Jiang, Yuanhan Zhang, Tianxing Wu, Qingyang Jin, Nattapol Chanpaisit, Yaohui Wang, Xinyuan Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu
CVPR 2024 Highlight
VideoBooth
VideoBooth: Diffusion-based Video Generation with Image Prompts
Yuming Jiang, Tianxing Wu, Shuai Yang, Chenyang Si, Dahua Lin, Yu Qiao, Chen Change Loy, Ziwei Liu
CVPR 2024
ROVI
Towards Language-Driven Video Inpainting via Multimodal Large Language Models
Jianzong Wu, Xiangtai Li, Chenyang Si, Shangchen Zhou, Jingkang Yang, Jiangning Zhang, Yining Li, Kai Chen, Yunhai Tong, Ziwei Liu, Chen Change Loy
CVPR 2024
AugLocal
Scaling Supervised Local Learning with Augmented Auxiliary Networks
Chenxiang Ma, Jibin Wu, Chenyang Si, KC Tan
ICLR 2024
LaVie
LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
Yaohui Wang, Xinyuan Chen, Xin Ma, Shangchen Zhou, Ziqi Huang, Yi Wang, Ceyuan Yang, Yinan He, Jiashuo Yu, Peiqing Yang, Yuwei Guo, Tianxing Wu, Chenyang Si, Yuming Jiang, Cunjian Chen, Chen Change Loy, Bo Dai, Dahua Lin, Yu Qiao, Ziwei Liu
IJCV 2024
FDA
Frequency-Enhanced Data Augmentation for Vision-and-Language Navigation
Keji He, Chenyang Si, Zhihe Lu, Yan Huang, Liang Wang, Xinchao Wang
NeurIPS 2023
FSAR
FSAR: Federated Skeleton-based Action Recognition with Adaptive Topology Structure and Knowledge Distillation
Jingwen Guo, Hong Liu, Shitong Sun, Tianyu Guo, Min Zhang, Chenyang Si
ICCV 2023
SemanticPrompt
Semantic Prompt for Few-Shot Learning
Wentao Chen*, Chenyang Si*, Zhang Zhang, Liang Wang, Zilei Wang, Tieniu Tan (Equal contribution)
CVPR 2023
MetaFormer
MetaFormer Baselines for Vision
Weihao Yu, Chenyang Si, Pan Zhou, Mi Luo, Yichen Zhou, Jiashi Feng, Shuicheng Yan, Xinchao Wang
TPAMI 2023
FedZSL
Federated Zero-Shot Learning with Mid-Level Semantic Knowledge Transfer
Shitong Sun, Chenyang Si, Shaogang Gong, Guile Wu
Pattern Recognition 2023
iFormer
Inception Transformer
Chenyang Si, Weihao Yu, Pan Zhou, Yichen Zhou, Xinchao Wang, Shuicheng Yan
NeurIPS 2022 Oral
Mugs
Mugs: A Multi-Granular Self-Supervised Learning Framework
Pan Zhou, Yichen Zhou, Chenyang Si, Weihao Yu, Teck Khim Ng, Shuicheng Yan
NeurIPS Workshop 2022
CRRL
Contrast-Reconstruction Representation Learning for Self-supervised Skeleton-based Action Recognition
Peng Wang, Jun Wen, Chenyang Si, Yuntao Qian, Liang Wang
TIP 2022
MetaFormer
MetaFormer is Actually What You Need for Vision
Weihao Yu, Mi Luo, Pan Zhou, Chenyang Si, Yichen Zhou, Xinchao Wang, Jiashi Feng, Shuicheng Yan
CVPR 2022 Oral
BNTA
Generalizable Person Re-Identification via Self-Supervised Batch Norm Test-Time Adaption
Ke Han, Chenyang Si, Yan Huang, Liang Wang, Tieniu Tan
AAAI 2022
PDA
Few-Shot Learning with Part Discovery and Augmentation from Unlabeled Images
Wentao Chen, Chenyang Si, Wei Wang, Liang Wang, Zilei Wang, Tieniu Tan
IJCAI 2021
ASSL
Adversarial Self-Supervised Learning for Semi-Supervised 3D Action Recognition
Chenyang Si, Xuecheng Nie, Wei Wang, Liang Wang, Tieniu Tan, Jiashi Feng
ECCV 2020
HSPTSL
Skeleton-Based Action Recognition with Hierarchical Spatial Reasoning and Temporal Stack Learning Network
Chenyang Si, Ya Jing, Wei Wang, Liang Wang, Tieniu Tan
Pattern Recognition 2020
PMA
Pose-Guided Multi-Granularity Attention Network for Text-Based Person Search
Ya Jing, Chenyang Si, Junbo Wang, Wei Wang, Liang Wang, Tieniu Tan
AAAI 2020 Oral
AGCLSTM
An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition
Chenyang Si, Wentao Chen, Wei Wang, Liang Wang, Tieniu Tan
CVPR 2019
SRTSL
Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning
Chenyang Si, Ya Jing, Wei Wang, Liang Wang, Tieniu Tan
ECCV 2018
humanGen
Multistage Adversarial Losses for Pose-Based Human Image Synthesis
Chenyang Si, Wei Wang, Liang Wang, Tieniu Tan
CVPR 2018 Spotlight
PSNR
Pose-Based Two-Stream Relational Networks for Action Recognition in Videos
Wei Wang, Jinjin Zhang, Chenyang Si, Liang Wang
Tech Report 2018

Join Us

We are actively seeking highly motivated students and researchers to join PRLab at Nanjing University. Our lab focuses on cutting-edge research in visual understanding and generation, with a particular emphasis on video generation and diffusion models. If you are interested in applying, please refer to this Zhihu Post and fill out the Google Form.

Ph.D. Students
  • Recruiting for 2027 Fall (post-master's or direct-entry Ph.D.)
  • Strong CS / AI background
  • Solid Python and PyTorch skills
  • Passion for computer vision research
  • Good English communication
Master's Students
  • Recruiting for 2027 Fall (recommended exemption or national entrance exam)
  • Undergraduate in CS or related fields
  • Strong math foundation
  • Deep learning framework experience
  • Collaborative and self-driven
Research Interns
  • Undergraduate / visiting students
  • At least 3–6 months commitment
  • Prior research experience preferred
  • Both remote and on-site positions welcome
  • Future Ph.D. applicants encouraged

Interested in joining PRLab?

Please send your CV and a brief research statement. We look forward to hearing from you.

chenyang.si@nju.edu.cn