(* Equal contribution. † Corresponding author.)
- ReBot: Scaling Robot Learning with Real-to-Sim-to-Real Robotic Video Synthesis
- Fang Yu, Yue Yang, Xinghao Zhu, Kaiyuan Zheng, Gedas Bertasius, Daniel Szafir, Mingyu Ding†
IROS 2025
- P2 Explore: Efficient Exploration in Unknown Clustered Environment with Floor Plan Prediction
- Kun Song, Gaoming Chen, Masayoshi Tomizuka, Wei Zhan, Zhenhua Xiong, Mingyu Ding†
IROS 2025
- PhyGrasp: Generalizing Robotic Grasping with Physics-informed Large Multimodal Models
- Dingkun Guo, Yuqi Xiang, Shuqi Zhao, Xinghao Zhu, Masayoshi Tomizuka, Mingyu Ding†, Wei Zhan
IROS 2025
- Language-Driven Policy Distillation for Cooperative Multi-Agent Reinforcement Learning
- Jiaqi Liu, Chengkai Xu, Peng Hang, Jian Sun, Wei Zhan, Masayoshi Tomizuka, Mingyu Ding†
IROS & RA-L 2025
- BOSS: Benchmark for Observation Space Shift in Long-Horizon Task
- Yue Yang, Linfeng Zhao, Mingyu Ding, Gedas Bertasius, Daniel Szafir
RA-L 2025
- Moto: Latent Motion Token as the Bridging Language for Robot Manipulation
- Yi Chen, Yuying Ge, Yizhuo Li, Yixiao Ge, Mingyu Ding, Ying Shan, Xihui Liu
ICCV 2025
(Oral)
- DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation
- Zhixuan Liang, Yao Mu, Yixiao Wang, Tianxing Chen, Wenqi Shao, Wei Zhan, Masayoshi Tomizuka, Ping Luo, Mingyu Ding†
CVPR 2025
- CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamical 3D Gaussians
- Chongjian Ge, Chenfeng Xu, Yuanfeng Ji, Chensheng Peng, Masayoshi Tomizuka, Ping Luo, Mingyu Ding†, Varun Jampani, Wei Zhan
CVPR 2025
- RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins
- Yao Mu, Tianxing Chen, Zanxin Chen, Shijia Peng, Zhiqian Lan, Zeyu Gao, Zhixuan Liang, Qiaojun Yu, Yude Zou, Mingkun Xu, Lunkai Lin, Zhiqiang Xie, Mingyu Ding, Ping Luo
CVPR 2025
(Highlight)
- WOMD-Reasoning: A Large-Scale Dataset for Interaction Reasoning in Driving
- Yiheng Li, Cunxin Fan, Chongjian Ge, Zhihao Zhao, Chenran Li, Chenfeng Xu, Huaxiu Yao, Masayoshi Tomizuka, Bolei Zhou, Chen Tang, Mingyu Ding†, Wei Zhan
ICML 2025
- Sel4FT: Annotation Selection for Pretraining-Finetuning With Distribution Shift
- Han Lu, Yichen Xie, Mingyu Ding, Wei Zhan, Xiaokang Yang, Masayoshi Tomizuka, Junchi Yan
TPAMI 2025
- Compositional Physical Reasoning of Objects and Events from Videos
- Zhenfang Chen, Shilong Dong, Kexin Yi, Yunzhu Li, Mingyu Ding, Antonio Torralba, Joshua B Tenenbaum, Chuang Gan
TPAMI 2025
- A Survey of Reasoning with Foundation Models: Concepts, Methodologies, and Outlook
- Jiankai Sun, Chuanyang Zheng, Enze Xie, Zhengying Liu, Ruihang Chu, Jianing Qiu, Jiaqi Xu, Mingyu Ding, et al.
ACM CSUR 2025
- Physics-Aware Robotic Palletization with Online Masking Inference
- Tianqi Zhang, Zheng Wu, Yuxin Chen, Yixiao Wang, Boyuan Liang, Scott Moura, Masayoshi Tomizuka, Mingyu Ding†, Wei Zhan
ICRA 2025
(Best Paper Award in Automation)
- TrajSSL: Trajectory-Enhanced Semi-Supervised 3D Object Detection
- Philip Jacobson, Yichen Xie, Mingyu Ding, Chenfeng Xu, Masayoshi Tomizuka, Wei Zhan, Ming Wu
ICRA 2025
- Embodiment-Agnostic Action Planning via Object-Part Scene Flow
- Weiliang Tang, Jia-Hui Pan, Wei Zhan, Jianshu Zhou, Huaxiu Yao, Yun-Hui Liu, Masayoshi Tomizuka, Mingyu Ding†, Chi-Wing Fu
ICRA 2025
- X-Drive: Cross-modality Consistent Multi-Sensor Data Synthesis for Driving Scenarios
- Yichen Xie, Chenfeng Xu, Chensheng Peng, Shuqi Zhao, Nhat Ho, Alex T. Pham, Mingyu Ding†, Wei Zhan, Masayoshi Tomizuka
ICLR 2025
- MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large VLMs
- Peng Xia, Siwei Han, Shi Qiu, Yiyang Zhou, Zhaoyang Wang, Wenhao Zheng, Zhaorun Chen, Chenhang Cui, Mingyu Ding, Linjie Li, Lijuan Wang, Huaxiu Yao
ICLR 2025
(Oral)
- What can Foundation Models’ Embeddings do?
- Xueyan Zou, Linjie Li, Jianfeng Wang, Jianwei Yang, Mingyu Ding, Junyi Wei, Zhengyuan Yang, Feng Li, Hao Zhang, Shilong Liu, Arul Aravinthan, Yong Jae Lee, Lijuan Wang
NeurIPS 2024
- MoLE: Human-centric Text-to-image Diffusion with Mixture of Low-rank Experts
- Jie Zhu, Yixiong Chen, Mingyu Ding, Ping Luo, Leye Wang, Jingdong Wang
NeurIPS 2024
- Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning
- Yixiao Wang, Yifei Zhang, Mingxiao Huo, Thomas Tian, Xiang Zhang, Yichen Xie, Chenfeng Xu, Pengliang Ji, Wei Zhan, Mingyu Ding†, Masayoshi Tomizuka
CoRL 2024
- Q-SLAM: Quadric Representations for Monocular SLAM
- Chensheng Peng, Chenfeng Xu, Yue Wang, Mingyu Ding, Heng Yang, Masayoshi Tomizuka, Kurt Keutzer, Marco Pavone, Wei Zhan
CoRL 2024
- Human-oriented Representation Learning for Robotic Manipulation
- Mingxiao Huo, Mingyu Ding†, Chenfeng Xu, Thomas Tian, Xinghao Zhu, Yao Mu, Lingfeng Sun, Masayoshi Tomizuka, Wei Zhan
RSS 2024
- Pre-training on Synthetic Driving Data for Trajectory Prediction
- Yiheng Li, Zhihao Zhao, Chenfeng Xu, Chen Tang, Chenran Li, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan
IROS 2024
- Deep Sequence LiDAR Odometry Based on Inconsistent Spatio-temporal Propagation
- Huixin Zhang, Guangming Wang, Xinrui Wu, Chenfeng Xu, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan, Hesheng Wang
IROS 2024
(Oral)
- RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis
- Yao Mu, Junting Chen, Qinglong Zhang, Shoufa Chen, et al., Jifeng Dai, Yu Qiao, Mingyu Ding†, Ping Luo
ICML 2024
- SkillDiffuser: Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution
- Zhixuan Liang, Yao Mu, Hengbo Ma, Masayoshi Tomizuka, Mingyu Ding†, Ping Luo
CVPR 2024
- Open X-Embodiment: Robotic Learning Datasets and RT-X Models
- Open X-Embodiment Collaboration: Google, Mingyu Ding, et al.
ICRA 2024
(Best Paper Award)
- Tree-Planner: Efficient Close-loop Task Planning with Large Language Models
- Mengkang Hu, Yao Mu, Chelsey Yu, Mingyu Ding†, Shiguang Wu, Wenqi Shao, Qiguang Chen, Bin Wang, Yu Qiao, Ping Luo
ICLR 2024
- UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling
- Haoyu Lu, Yuqi Huo, Guoxing Yang, Zhiwu Lu, Wei Zhan, Masayoshi Tomizuka, Mingyu Ding†
ICLR 2024
- VDT: General-purpose Video Diffusion Transformers via Mask Modeling
- Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding†
ICLR 2024
- DrPlanner: Diagnosis and Repair of Motion Planners for Automated Vehicles Using LLMs
- Yuanfei Lin, Chenran Li, Mingyu Ding, Masayoshi Tomizuka, Wei Zhan, Matthias Althoff
RA-L 2024
- RoadBEV: Road Surface Reconstruction in Bird’s Eye View
- Tong Zhao, Lei Yang, Yichen Xie, Mingyu Ding, Masayoshi Tomizuka, Yintao Wei
T-ITS 2024
- Towards Interactive and Cooperative Driving Automation: A LLM-Driven Decision-Making Framework
- Shiyu Fang, Jiaqi Liu, Mingyu Ding, Yiming Cui, Chen Lv, Peng Hang
T-VT 2024
- A Road Surface Reconstruction Dataset and Benchmark for Safe Autonomous Driving
- Tong Zhao, Chenfeng Xu, Mingyu Ding†, Masayoshi Tomizuka, Wei Zhan, Yintao Wei
Scientific Data 2024
- EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought
- Yao Mu, Qinglong Zhang, Mengkang Hu, Wenhai Wang, Mingyu Ding†, Jun Jin, Bin Wang, Jifeng Dai, Yu Qiao, Ping Luo
NeurIPS 2023
(Spotlight)
- Towards Free Data Selection with General-Purpose Models
- Yichen Xie, Mingyu Ding†, Masayoshi Tomizuka, Wei Zhan
NeurIPS 2023
- Doubly-Robust Self-Training
- Banghua Zhu, Mingyu Ding, Philip Jacobson, Ming Wu, Wei Zhan, Michael Jordan, Jiantao Jiao
NeurIPS 2023
- Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties
- Hsiao-Yu Tung*, Mingyu Ding*, Zhenfang Chen, Daniel M. Bear, Chuang Gan, Joshua B. Tenenbaum, Daniel L. K. Yamins, Judith Fan, Kevin A. Smith
NeurIPS Datasets and Benchmarks Track 2023 & CogSci/VSS 2023
- AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
- Zhixuan Liang, Yao Mu, Mingyu Ding, Fei Ni, Masayoshi Tomizuka, Ping Luo
ICML 2023
(Oral)
- Quadric Representations for LiDAR Odometry, Mapping and Localization
- Chao Xia, Chenfeng Xu, Patrick Rim, Mingyu Ding, Nanning Zheng, Kurt Keutzer, Masayoshi Tomizuka, Wei Zhan
RA-L 2023
- NeRF-Loc: Transformer-Based Object Localization Within Neural Radiance Fields
- Jiankai Sun, Yan Xu, Mingyu Ding, Hongwei Yi, Jingdong Wang, Liangjun Zhang, Mac Schwager
ICRA & RA-L 2023
- TextPSG: Panoptic Scene Graph Generation from Textual Descriptions
- Chengyang Zhao, Yikang Shen, Zhenfang Chen, Mingyu Ding, Chuang Gan
ICCV 2023
- Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention
- Mingyu Ding, Yikang Shen, Lijie Fan, Zhenfang Chen, Zitian Chen, Ping Luo, Joshua B. Tenenbaum, Chuang Gan
CVPR 2023
- EC^2: Emergent Communication for Embodied Control
- Yao Mu, Shunyu Yao, Mingyu Ding, Ping Luo, Chuang Gan
CVPR 2023
- Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners
- Zitian Chen, Yikang Shen, Mingyu Ding, Zhenfang Chen, Hengshuang Zhao, Erik Learned-Miller, Chuang Gan
CVPR 2023
- Context Autoencoder for Self-Supervised Representation Learning
- Xiaokang Chen, Mingyu Ding, Xiaodi Wang, Ying Xin, Shentong Mo, Yunhao Wang, Shumin Han, Ping Luo, Gang Zeng, Jingdong Wang
IJCV 2023
- Understanding Self-Supervised Pretraining with Part-Aware Representation Learning
- Jie Zhu*, Jiyang Qi*, Mingyu Ding*, Xiaokang Chen, Ping Luo, Xinggang Wang, Wenyu Liu, Leye Wang, Jingdong Wang
TMLR 2023
- Planning with Large Language Models for Code Generation
- Shun Zhang, Zhenfang Chen, Yikang Shen, Mingyu Ding, Joshua B. Tenenbaum, Chuang Gan
ICLR 2023
- LGDN: Language-Guided Denoising Network for Video-Language Modeling
- Haoyu Lu, Mingyu Ding, Nanyi Fei, Yuqi Huo, Zhiwu Lu
NeurIPS 2022
(Spotlight)
- Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following
- Mingyu Ding, Yan Xu, Zhenfang Chen, David Daniel Cox, Ping Luo, Joshua B. Tenenbaum, Chuang Gan
CoRL 2022
- DaViT: Dual Attention Vision Transformers
- Mingyu Ding, Bin Xiao, Noel Codella, Ping Luo, Jingdong Wang, Lu Yuan
ECCV 2022
- CtrlFormer: Learning Transferable State Representation for Visual Control via Transformer
- Yao Mu, Shoufa Chen, Mingyu Ding, Jianyu Chen, Runjian Chen, Ping Luo
ICML 2022
(Spotlight)
- ComPhy: Compositional Physical Reasoning of Objects and Events from Videos
- Zhenfang Chen, Kexin Yi, Yunzhu Li, Mingyu Ding, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan
ICLR 2022
- Learning Versatile Neural Architectures by Propagating Network Codes
- Mingyu Ding, Yuqi Huo, Haoyu Lu, Linjie Yang, Zhe Wang, Zhiwu Lu, Jingdong Wang, Ping Luo
ICLR 2022
- Compressed Video Contrastive Learning
- Yuqi Huo*, Mingyu Ding*, Haoyu Lu, Nanyi Fei, Zhiwu Lu, Ji-Rong Wen, Ping Luo
NeurIPS 2021
- Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
- Mingyu Ding, Zhenfang Chen, Tao Du, Ping Luo, Joshua B. Tenenbaum, Chuang Gan
NeurIPS 2021
- Self-Supervised Video Representation Learning with Constrained Spatiotemporal Jigsaw
- Yuqi Huo, Mingyu Ding, Haoyu Lu, Ziyuan Huang, Mingqian Tang, Zhiwu Lu, Tao Xiang
IJCAI 2021
- HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers
- Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, Ping Luo
CVPR 2021
(Oral)
- L2M-GAN: Learning to Manipulate Latent Space Semantics for Facial Attribute Editing
- Guoxing Yang, Nanyi Fei, Mingyu Ding, Guangzhen Liu, Zhiwu Lu, Tao Xiang
CVPR 2021
(Oral)
- PolarMask++: Enhanced Polar Representation for Single-Shot Instance Segmentation and Beyond
- Enze Xie, Wenhai Wang, Mingyu Ding, Ruimao Zhang, Ping Luo
TPAMI 2021
- IEPT: Instance-Level and Episode-Level Pretext Tasks for Few-Shot Learning
- Manli Zhang, Jianhong Zhang, Zhiwu Lu, Tao Xiang, Mingyu Ding, Songfang Huang
ICLR 2021
- A Global Occlusion-Aware Approach to Self-Supervised Monocular Visual Odometry
- Yao Lu*, Xiaoli Xu*, Mingyu Ding*, Zhiwu Lu, Tao Xiang
AAAI 2021
- Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation
- Hongwei Yi, Zizhuang Wei, Mingyu Ding, Runze Zhang, Yisong Chen, Guoping Wang, Yu-Wing Tai
ECCV 2020
- Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking
- Jianfeng Yan, Zizhuang Wei, Hongwei Yi, Mingyu Ding, Runze Zhang, Yisong Chen, Guoping Wang, Yu-Wing Tai
ECCV 2020
(Spotlight)
- Segmenting Transparent Objects in the Wild
- Enze Xie, Wenjia Wang, Wenhai Wang, Mingyu Ding, Chunhua Shen, Ping Luo
ECCV 2020
- Learning Depth-Guided Convolutions for Monocular 3D Object Detection
- Mingyu Ding, Yuqi Huo, Hongwei Yi, Zhe Wang, Jianping Shi, Zhiwu Lu, Ping Luo
CVPR 2020
- SegVoxelNet: Exploring Semantic Context and Depth-aware Features for 3D Vehicle Detection from Point Cloud
- Hongwei Yi, Shaoshuai Shi, Mingyu Ding, Jiankai Sun, Kui Xu, Zhe Wang, Sheng Li, Guoping Wang
ICRA 2020
- Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow
- Mingyu Ding, Zhe Wang, Bolei Zhou, Jianping Shi, Zhiwu Lu, Ping Luo
AAAI 2020
- CamNet: Coarse-to-Fine Retrieval for Camera Re-localization
- Mingyu Ding, Zhe Wang, Jiankai Sun, Jianping Shi, Ping Luo
ICCV 2019
- Face-Focused Cross-Stream Network for Deception Detection in Videos
- Mingyu Ding*, An Zhao*, Zhiwu Lu, Tao Xiang, Ji-Rong Wen
CVPR 2019
- Zero-Shot Learning with Superclasses
- Yuqi Huo, Mingyu Ding, An Zhao, Jun Hu, Ji-Rong Wen, Zhiwu Lu
ICONIP 2018
(Best Student Paper Runner-up)
- Domain-Invariant Projection Learning for Zero-Shot Recognition
- An Zhao*, Mingyu Ding*, Jiechao Guan*, Zhiwu Lu, Tao Xiang, Ji-Rong Wen
NeurIPS 2018