Hi there! I am a postdoctoral fellow at UC Berkeley working with Prof. Masayoshi Tomizuka, a distinguished member of the National Academy of Engineering, and was a visiting scholar at MIT working with Prof. Joshua Tenenbaum. Before that, I received my PhD from the University of Hong Kong, advised by Prof. Ping Luo, and my B.S. from Renmin University of China under the supervision of Prof. Zhiwu Lu. My research interests lie at the intersection of embodied AI, robotics, and vision. I am open to research discussions and collaborations; feel free to get in touch!
myding at berkeley dot edu [Google Scholar]
I will be joining the Department of Computer Science at UNC-Chapel Hill as an assistant professor. I am actively seeking highly motivated students for PhD (Spring/Fall 2025) and intern positions. If you’re interested, please fill out this form before emailing me. Thanks!
Research Highlights
My long-term research goal is to build embodied agents that can reason about and interact effectively with the physical world.
- Embodied AI: robot learning, LLM planners, physical simulation, commonsense reasoning
- General-purpose Models: vision-language foundation models, self-supervised learning, 3D vision
CoRL22: Embodied Concept Learner
NeurIPS23 Dataset: Physion++
NeurIPS23: EmbodiedGPT
ICML23: AdaptDiffuser
ICRA24: RT-X
ICLR24: Tree-Planner
Preprint: PhyGrasp
Preprint: Long-Horizon Tasks with LLM
CVPR23: EC for Embodied Control
NeurIPS22: ComPhy Benchmark
NeurIPS21: Reasoning with DiffPhysics
CVPR24: SkillDiffuser
Preprint: RoboScript
ICML24: RoboCodeX
Preprint: LanguageMPC
CVPR20: Depth-Guided 3D Det
ECCV22: DaViT
(* Equal contribution. † Corresponding author.)
- Human-oriented Representation Learning for Robotic Manipulation
- Mingxiao Huo, Mingyu Ding†, Chenfeng Xu, Thomas Tian, Xinghao Zhu, Yao Mu, Lingfeng Sun, Masayoshi Tomizuka, Wei Zhan
RSS 2024
- SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution
- Zhixuan Liang, Yao Mu, Hengbo Ma, Masayoshi Tomizuka, Mingyu Ding†, Ping Luo
CVPR 2024
[paper] [project]
- Open X-Embodiment: Robotic Learning Datasets and RT-X Models
- Open X-Embodiment Collaboration: Google, Mingyu Ding, et al.
ICRA 2024
(Best Paper Award) [paper] [code] [project]
- UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling
- Haoyu Lu, Yuqi Huo, Guoxing Yang, Zhiwu Lu, Wei Zhan, Masayoshi Tomizuka, Mingyu Ding†
ICLR 2024
[paper] [code]
- VDT: General-purpose Video Diffusion Transformers via Mask Modeling
- Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding†
ICLR 2024
[paper] [project]
- EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought
- Yao Mu, Qinglong Zhang, Mengkang Hu, Wenhai Wang, Mingyu Ding†, Jun Jin, Bin Wang, Jifeng Dai, Yu Qiao, Ping Luo
NeurIPS 2023
(Spotlight) [paper] [project]
- Towards Free Data Selection with General-Purpose Models
- Yichen Xie, Mingyu Ding†, Masayoshi Tomizuka, Wei Zhan
NeurIPS 2023
[paper] [code]
- Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties
- Hsiao-Yu Tung*, Mingyu Ding*, Zhenfang Chen, Daniel M. Bear, Chuang Gan, Joshua B. Tenenbaum, Daniel L. K. Yamins, Judith Fan, Kevin A. Smith
NeurIPS 2023 Datasets and Benchmarks Track
[paper] [project]
- Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention
- Mingyu Ding, Yikang Shen, Lijie Fan, Zhenfang Chen, Zitian Chen, Ping Luo, Joshua B. Tenenbaum, Chuang Gan
CVPR 2023
[paper] [code]
- Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following
- Mingyu Ding, Yan Xu, Zhenfang Chen, David Daniel Cox, Ping Luo, Joshua B. Tenenbaum, Chuang Gan
CoRL 2022
[paper] [code] [project]
- DaViT: Dual Attention Vision Transformers
- Learning Versatile Neural Architectures by Propagating Network Codes
- Mingyu Ding, Yuqi Huo, Haoyu Lu, Linjie Yang, Zhe Wang, Zhiwu Lu, Jingdong Wang, Ping Luo
ICLR 2022
[paper] [code] [project]
- Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
- Mingyu Ding, Zhenfang Chen, Tao Du, Ping Luo, Joshua B. Tenenbaum, Chuang Gan
NeurIPS 2021
[paper] [code] [project]
- HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers
- Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, Ping Luo
CVPR 2021
(Oral) [paper] [code]
- Learning Depth-Guided Convolutions for Monocular 3D Object Detection
- Mingyu Ding, Yuqi Huo, Hongwei Yi, Zhe Wang, Jianping Shi, Zhiwu Lu, Ping Luo
CVPR 2020
[paper] [code]
- Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow
- Mingyu Ding, Zhe Wang, Bolei Zhou, Jianping Shi, Zhiwu Lu, Ping Luo
AAAI 2020
[paper]
- CamNet: Coarse-to-Fine Retrieval for Camera Re-localization
- Mingyu Ding, Zhe Wang, Jiankai Sun, Jianping Shi, Ping Luo
ICCV 2019
[paper] [code]
Selected Honors
- 2024 ICRA Best Paper Award
- 2023 ME Rising Stars, UC Berkeley (30 awardees worldwide in 2023)
- 2023 CVPR Doctoral Consortium (53 awardees worldwide in 2023)
- 2022 WAIC Rising Stars (15 awardees globally each year)
- 2021 Baidu Fellowship (10 awardees globally each year)
- 2020 Microsoft Fellowship Nomination Award (15 nominations in Asia)
- 2018 Best Student Paper Runner-up Award at the 25th International Conference on Neural Information Processing (ICONIP)
- M. Braun Postgraduate Prize, University of Hong Kong
- Outstanding Graduate of Beijing
- National Scholarship
Activities
- Conference Reviewer for ICML, ICLR, NeurIPS, CVPR, CoRL, ICRA, IROS, ICCV, ECCV, AAAI, WACV, ACMMM, ACC, IV, etc.
- Journal Reviewer for TPAMI, TMLR, IJCV, TIP, TCSVT, TMM, TOMM, TITS, TIV, RA-L, ACM CSUR, Neurocomputing, etc.