🧑 About me
I am currently a first-year Ph.D. student at Nanyang Technological University, working under the supervision of Prof. Ziwei Wang in the PINE Lab. I received my Master's degree from Tsinghua University, where I was advised by Prof. Haoqian Wang. Prior to that, I received my B.Eng. degree in Electronic and Information Engineering from Tongji University.
My current research focuses on Embodied AI, with particular emphasis on world models and vision-language models (VLMs) for robotics.
github / google scholar / cv
🔥 News
- 2026.03: 🎉🎉 1 paper accepted to CVPR 2026 !!!
- 2024.12: 🎉🎉 1 paper accepted to AAAI 2025 !!!
💻 Experience
- 09/2025~03/2026, I worked at ByteDance Seed Robotics.
- 05/2023~12/2024, I was a full-time intern at Tencent Robotics X.
📝 Publications

UniPR: Unified Object-level Real-to-Sim Perception and Reconstruction from a Single Stereo Pair
Chuanrui Zhang*, Yingshuang Zou*, Zhengxian Wu, Yonggen Ling†✉, Yuxiao Yang, Ziwei Wang✉
Website Paper Code
- We present UniPR, an end-to-end stereo framework that unifies 6D pose estimation and metric-scale 3D shape reconstruction, achieving up to 100× faster generation and 3× better shape-proportion accuracy for real-to-sim robotic manipulation.

LMGait: Language-Guided and Motion-Aware Gait Representation for Generalizable Recognition
Zhengxian Wu*, Chuanrui Zhang*, Shenao Jiang*, Hangrui Xu, Zirui Liao, Luyuan Zhang, Huaqiu Li, Peng Jiao, Haoqian Wang
Website Paper Code
- We present LMGait, a Language-Guided and Motion-Aware Gait Representation.

MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction
Yingshuang Zou*, Yikang Ding*, Chuanrui Zhang, Xiaoyang Lyu, Feiyang Tan, Xiaojuan Qi, Haoqian Wang†
Website / Paper / Code
- We present MuDG, a controllable Multi-modal Diffusion model with Gaussian Splatting (GS) for Urban Scene Reconstruction.

TranSplat: Generalizable 3D Gaussian Splatting from Sparse Multi-View Images with Transformers
Chuanrui Zhang*, Yingshuang Zou*, Zhuoling Li, Minmin Yi, Haoqian Wang†
Website Paper Code
- We present TranSplat, a transformer-based approach for generalizable 3D Gaussian splatting from sparse multi-view images.

DAGait: Generalized Skeleton-Guided Data Alignment for Gait Recognition
Zhengxian Wu*, Chuanrui Zhang*, Hangrui Xu, Peng Jiao, Haoqian Wang
Website Paper Code
- We present DAGait, a universal data alignment strategy for gait recognition, to alleviate spatiotemporal distribution inconsistencies.

Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images
Chuanrui Zhang*, Yonggen Ling*†, Minglei Lu, Minghan Qin, Haoqian Wang†
Website Paper Code
- We present CODERS, a one-stage approach for Category-level Object Detection, pose Estimation and Reconstruction from Stereo images.
📄 Preprint Papers

iMoWM: Taming Interactive Multi-Modal World Model for Robotic Manipulation
Chuanrui Zhang, Zhengxian Wu, Guanxing Lu, Yansong Tang, Ziwei Wang
Website Paper Code Coming
- We present iMoWM, an interactive multi-modal world model for robotic manipulation.

VoxelFormer: Bird’s-Eye-View Feature Generation based on Dual-view Attention for Multi-view 3D Object Detection
Zhuoling Li*, Chuanrui Zhang*, Wei-Chiu Ma, Yipin Zhou, Linyan Huang, Haoqian Wang†, SerNam Lim, Hengshuang Zhao†
Paper Code
- We introduce Dual-View Attention, a more efficient cross-attention mechanism for multi-view 3D object detection.
🏆 Honors and Awards
- Scholarship, Tsinghua University, 2023.
- Excellent Graduates, Shanghai, 2022.
- Scholarship, Tongji University, 2019-2022.
- National Scholarship, 2018.
