About Me

Ruihao Xia is a PhD student at East China University of Science and Technology (ECUST), specializing in computer vision and deep learning. His research centers on 3D AI Generation, Event-based Vision, Cross-Modality Domain Adaptation, and Semantic Segmentation. He received his B.S. in Mechanical Engineering, also from ECUST, where he earned excellent grades and developed a passion for computer vision.

Education

Visiting PhD Student in the School of Computing and Information Systems
Singapore Management University (SMU)
2025.01 - Present (2025.10)

PhD in Control Science and Engineering
East China University of Science and Technology (ECUST)
2021 - Present (2026.09)

B.S. in Mechanical Engineering
East China University of Science and Technology (ECUST)
2017 - 2021

Research Interests

  • 3D AI Generation
  • Event-based Vision
  • Cross-Modality Domain Adaptation
  • Semantic Segmentation

Honors and Awards

  • 2021 Shanghai Excellent Graduates
  • 2020 National Undergraduate Smart Car Competition 2nd Prize
  • 2019-2020 National Scholarship
  • 2019 Shanghai Undergraduate Creative Robot Competition 2nd Prize
  • 2018-2019 National Scholarship

Work Experience

Algorithm Research Intern
- Imaging Algorithm Research Department, Quality Enhancement Center
2024.05 - 2024.09
Conducted frontier research on image matting algorithms for mobile imaging. Addressed the limited generalization of interactive matting by proposing the COCO-Matting dataset and the SEMat framework; the related work has been submitted to IEEE TCSVT.
Algorithm Research Intern
- Central Research Institute, Advanced Computing and Storage Laboratory
2024.09 - 2024.12
Conducted frontier research on scene understanding algorithms based on event cameras. Addressed the underutilization of visual foundation models in event-based vision by proposing the TGVFM framework; the related work has been submitted to IEEE TCSVT.

Research & Publications (First Author)

Towards Scalable and Consistent 3D Editing

Ruihao Xia, Yang Tang*, Pan Zhou*

Under Review 2025

We introduce 3DEditVerse, the largest paired 3D editing benchmark, and propose 3DEditFormer, a mask-free transformer enabling precise, consistent, and scalable 3D edits.

Paper | Code | Project Page
3D Editing | 3D Generation
Unsupervised Modality Adaptation with Text-to-Image Diffusion Models for Semantic Segmentation

Ruihao Xia, Yu Liang, Peng-Tao Jiang, Hao Zhang, Bo Li*, Yang Tang*, Pan Zhou

Neural Information Processing Systems (NeurIPS) 2024

We propose MADM, a diffusion-based framework that leverages text-to-image pre-trained models with pseudo-label stabilization and latent label regression, achieving SoTA semantic segmentation adaptation across image, depth, infrared, and event modalities.
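
For readers outside the area, the general self-training idea behind pseudo-label stabilization can be illustrated with a minimal, generic PyTorch sketch: a teacher predicts on unlabeled target images, low-confidence pixels are masked out, and the student trains on the surviving pseudo-labels together with labeled source data. This is a standard UDA baseline shown for illustration only, not MADM itself; the student/teacher models, confidence threshold, and equal loss weighting are assumptions.

    # Generic confidence-thresholded self-training step for UDA segmentation
    # (illustrative sketch, not the MADM implementation).
    import torch
    import torch.nn.functional as F

    def self_training_loss(student, teacher, src_img, src_label, tgt_img,
                           conf_thresh=0.9, ignore_index=255):
        # Supervised loss on labeled source-domain images.
        src_logits = student(src_img)                        # (B, C, H, W)
        loss_src = F.cross_entropy(src_logits, src_label, ignore_index=ignore_index)

        # Pseudo-labels from the frozen teacher on unlabeled target images.
        with torch.no_grad():
            tgt_prob = torch.softmax(teacher(tgt_img), dim=1)
            conf, pseudo = tgt_prob.max(dim=1)               # per-pixel confidence and class
            pseudo[conf < conf_thresh] = ignore_index        # drop unreliable pixels

        # Unsupervised loss on confident target pixels only.
        loss_tgt = F.cross_entropy(student(tgt_img), pseudo, ignore_index=ignore_index)
        return loss_src + loss_tgt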

Paper | Code
Cross-Modality Domain Adaptation | Semantic Segmentation
CMDA: Cross-Modality Domain Adaptation for Nighttime Semantic Segmentation

Ruihao Xia, Chaoqiang Zhao, Meng Zheng, Ziyan Wu, Qiyu Sun, Yang Tang*

International Conference on Computer Vision (ICCV) 2023

We propose CMDA, a cross-modality domain adaptation framework that leverages both images and events with daytime labels, introducing the first image-event nighttime segmentation dataset for evaluation.

Paper | Code
Domain Adaptation | Semantic Segmentation | Event-based Vision
Towards Natural Image Matting in the Wild via Real-Scenario Prior

Ruihao Xia, Yu Liang, Peng-Tao Jiang*, Hao Zhang, Qianru Sun, Yang Tang*, Bo Li, Pan Zhou

Submitted to IEEE TCSVT 2025

We introduce COCO-Matting and SEMat, a dataset-method pair that leverages real-world human mattes and a feature/matte-aligned transformer-decoder design with trimap-based regularization.
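
As a rough illustration of what trimap-based regularization can mean in a matting objective, the minimal PyTorch sketch below supervises alpha in the unknown band of a trimap and pushes it toward 0/1 in the certain regions. It is a generic example under an assumed trimap convention (0 = background, 1 = foreground, other values = unknown), not the SEMat loss; the weighting is arbitrary.

    # Generic trimap-regularized L1 matting loss (illustrative sketch,
    # not the SEMat objective). Assumes trimap values: 0 = background,
    # 1 = foreground, anything else = unknown band.
    import torch

    def trimap_regularized_l1(alpha_pred, alpha_gt, trimap, reg_weight=0.5):
        fg = (trimap == 1.0).float()      # certain foreground
        bg = (trimap == 0.0).float()      # certain background
        unk = 1.0 - fg - bg               # unknown transition band
        eps = 1e-6

        # Standard L1 supervision restricted to the unknown band.
        loss_unk = (unk * (alpha_pred - alpha_gt).abs()).sum() / (unk.sum() + eps)
        # Regularization: alpha should be 1 on foreground, 0 on background.
        loss_reg = ((fg * (1.0 - alpha_pred).abs()).sum()
                    + (bg * alpha_pred.abs()).sum()) / (fg.sum() + bg.sum() + eps)
        return loss_unk + reg_weight * loss_reg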

Paper | Code
Image Matting | Image Segmentation
Temporal-Guided Visual Foundation Models for Event-Based Vision

Ruihao Xia, Junhong Cai, Luziwei Leng*, Ran Cheng, Yang Tang*, Pan Zhou

Submitted to IEEE TCSVT 2025

We present TGVFM, a temporal-guided framework that integrates pretrained Visual Foundation Models with novel spatiotemporal attention blocks, achieving SoTA gains in event-based semantic segmentation, depth estimation, and object detection.
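
One generic way to add temporal guidance on top of per-frame features from a frozen visual foundation model is to let each spatial location attend over a short temporal window, as in the minimal PyTorch sketch below. This is an illustrative module, not the TGVFM blocks; the (B, T, C, H, W) layout, pre-norm residual design, and head count are assumptions (C must be divisible by the number of heads).

    # Generic temporal self-attention over per-frame feature maps
    # (illustrative sketch, not the TGVFM spatiotemporal blocks).
    import torch
    import torch.nn as nn

    class TemporalAttention(nn.Module):
        def __init__(self, dim, num_heads=4):
            super().__init__()
            self.norm = nn.LayerNorm(dim)
            self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

        def forward(self, feats):
            # feats: (B, T, C, H, W) features for T consecutive event frames.
            b, t, c, h, w = feats.shape
            x = feats.permute(0, 3, 4, 1, 2).reshape(b * h * w, t, c)  # time tokens per location
            y = self.norm(x)
            y, _ = self.attn(y, y, y)        # attend across the temporal window
            x = x + y                        # residual connection
            return x.reshape(b, h, w, t, c).permute(0, 3, 4, 1, 2)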

Event-based Vision | Visual Foundation Models
Modality Translation and Fusion for Event-based Semantic Segmentation

Ruihao Xia, Chaoqiang Zhao, Qiyu Sun, Shuang Cao, Yang Tang*

IFAC Control Engineering Practice (CEP) 2023

We propose MTF, a modality translation and fusion framework that distills complementary cross-modality knowledge from image-based teachers to event-based networks, achieving SoTA semantic segmentation in low-light conditions.
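
A minimal sketch of cross-modality knowledge distillation makes the idea concrete: a frozen image-based teacher produces softened per-pixel class distributions that an event-based student matches with a KL term. The PyTorch code below is a textbook distillation loss shown for illustration, not the MTF objective; the temperature and reduction are assumptions.

    # Generic cross-modality distillation loss (illustrative sketch, not
    # the MTF objective): soften teacher and student segmentation logits
    # and match them with KL divergence, scaled by T^2 as is standard.
    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        t = temperature
        log_p_student = F.log_softmax(student_logits / t, dim=1)
        p_teacher = F.softmax(teacher_logits.detach() / t, dim=1)
        return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)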

Paper
Event-based Vision | Semantic Segmentation