(* Equal contribution. † Corresponding author.)
- PhyGrasp: Generalizing Robotic Grasping with Physics-informed Large Multimodal Models
- Dingkun Guo, Yuqi Xiang, Shuqi Zhao, Xinghao Zhu, Masayoshi Tomizuka, Mingyu Ding†, Wei Zhan
arXiv
- RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation
- Junting Chen, Yao Mu, Qiaojun Yu, Tianming Wei, et al., Yu Qiao, Huazhe Xu, Mingyu Ding†, Ping Luo
arXiv
- Generalizable Long-Horizon Manipulations with Large Language Models
- Haoyu Zhou, Mingyu Ding†, Weikun Peng, Masayoshi Tomizuka, Lin Shao, Chuang Gan
arXiv
- LanguageMPC: Large Language Models as Decision Makers for Autonomous Driving
- Hao Sha, Yao Mu, Yuxuan Jiang, Li Chen, Chenfeng Xu, Ping Luo, Eben Li, Masayoshi Tomizuka, Wei Zhan, Mingyu Ding†
arXiv
- Human-oriented Representation Learning for Robotic Manipulation
- Mingxiao Huo, Mingyu Ding†, Chenfeng Xu, Thomas Tian, Xinghao Zhu, Yao Mu, Lingfeng Sun, Masayoshi Tomizuka, Wei Zhan
RSS 2024
- RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis
- Yao Mu, Junting Chen, Qinglong Zhang, Shoufa Chen, et al., Jifeng Dai, Yu Qiao, Mingyu Ding†, Ping Luo
ICML 2024
- SkillDiffuser: Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution
- Zhixuan Liang, Yao Mu, Hengbo Ma, Masayoshi Tomizuka, Mingyu Ding†, Ping Luo
CVPR 2024
- Open X-Embodiment: Robotic Learning Datasets and RT-X Models
- Open X-Embodiment Collaboration: Google, Mingyu Ding, et al.
ICRA 2024
(Finalists for Best Conference Paper, Best Student Paper, and Best Manipulation Paper Awards)
- Tree-Planner: Efficient Close-loop Task Planning with Large Language Models
- Mengkang Hu, Yao Mu, Chelsey Yu, Mingyu Ding†, Shiguang Wu, Wenqi Shao, Qiguang Chen, Bin Wang, Yu Qiao, Ping Luo
ICLR 2024
- UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling
- Haoyu Lu, Yuqi Huo, Guoxing Yang, Zhiwu Lu, Wei Zhan, Masayoshi Tomizuka, Mingyu Ding†
ICLR 2024
- VDT: General-purpose Video Diffusion Transformers via Mask Modeling
- Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding†
ICLR 2024
- EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought
- Yao Mu, Qinglong Zhang, Mengkang Hu, Wenhai Wang, Mingyu Ding†, Jun Jin, Bin Wang, Jifeng Dai, Yu Qiao, Ping Luo
NeurIPS 2023
(Spotlight)
- Towards Free Data Selection with General-Purpose Models
- Yichen Xie, Mingyu Ding†, Masayoshi Tomizuka, Wei Zhan
NeurIPS 2023
- Doubly-Robust Self-Training
- Banghua Zhu, Mingyu Ding, Philip Jacobson, Ming Wu, Wei Zhan, Michael Jordan, Jiantao Jiao
NeurIPS 2023
- Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties
- Hsiao-Yu Tung*, Mingyu Ding*, Zhenfang Chen, Daniel M. Bear, Chuang Gan, Joshua B. Tenenbaum, Daniel L. K. Yamins, Judith Fan, Kevin A. Smith
NeurIPS 2023 (Datasets and Benchmarks Track) & CogSci/VSS 2023
- AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
- Zhixuan Liang, Yao Mu, Mingyu Ding, Fei Ni, Masayoshi Tomizuka, Ping Luo
ICML 2023
(Oral)
- Quadric Representations for LiDAR Odometry, Mapping and Localization
- Chao Xia, Chenfeng Xu, Patrick Rim, Mingyu Ding, Nanning Zheng, Kurt Keutzer, Masayoshi Tomizuka, Wei Zhan
RA-L 2023
- NeRF-Loc: Transformer-Based Object Localization Within Neural Radiance Fields
- Jiankai Sun, Yan Xu, Mingyu Ding, Hongwei Yi, Jingdong Wang, Liangjun Zhang, Mac Schwager
RA-L 2023
- TextPSG: Panoptic Scene Graph Generation from Textual Descriptions
- Chengyang Zhao, Yikang Shen, Zhenfang Chen, Mingyu Ding, Chuang Gan
ICCV 2023
- Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention
- Mingyu Ding, Yikang Shen, Lijie Fan, Zhenfang Chen, Zitian Chen, Ping Luo, Joshua B. Tenenbaum, Chuang Gan
CVPR 2023
- EC²: Emergent Communication for Embodied Control
- Yao Mu, Shunyu Yao, Mingyu Ding, Ping Luo, Chuang Gan
CVPR 2023
- Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners
- Zitian Chen, Yikang Shen, Mingyu Ding, Zhenfang Chen, Hengshuang Zhao, Erik Learned-Miller, Chuang Gan
CVPR 2023
- Context Autoencoder for Self-Supervised Representation Learning
- Xiaokang Chen, Mingyu Ding, Xiaodi Wang, Ying Xin, Shentong Mo, Yunhao Wang, Shumin Han, Ping Luo, Gang Zeng, Jingdong Wang
IJCV 2023
- Understanding Self-Supervised Pretraining with Part-Aware Representation Learning
- Jie Zhu*, Jiyang Qi*, Mingyu Ding*, Xiaokang Chen, Ping Luo, Xinggang Wang, Wenyu Liu, Leye Wang, Jingdong Wang
TMLR 2023
- Planning with Large Language Models for Code Generation
- Shun Zhang, Zhenfang Chen, Yikang Shen, Mingyu Ding, Joshua B. Tenenbaum, Chuang Gan
ICLR 2023
- LGDN: Language-Guided Denoising Network for Video-Language Modeling
- Haoyu Lu, Mingyu Ding, Nanyi Fei, Yuqi Huo, Zhiwu Lu
NeurIPS 2022
(Spotlight)
- Embodied Concept Learner: Self-supervised Learning of Concepts and Mapping through Instruction Following
- Mingyu Ding, Yan Xu, Zhenfang Chen, David Daniel Cox, Ping Luo, Joshua B. Tenenbaum, Chuang Gan
CoRL 2022
- DaViT: Dual Attention Vision Transformers
- Mingyu Ding, Bin Xiao, Noel Codella, Ping Luo, Jingdong Wang, Lu Yuan
ECCV 2022
- CtrlFormer: Learning Transferable State Representation for Visual Control via Transformer
- Yao Mu, Shoufa Chen, Mingyu Ding, Jianyu Chen, Runjian Chen, Ping Luo
ICML 2022
(Spotlight)
- ComPhy: Compositional Physical Reasoning of Objects and Events from Videos
- Zhenfang Chen, Kexin Yi, Yunzhu Li, Mingyu Ding, Antonio Torralba, Joshua B. Tenenbaum, Chuang Gan
ICLR 2022
- Learning Versatile Neural Architectures by Propagating Network Codes
- Mingyu Ding, Yuqi Huo, Haoyu Lu, Linjie Yang, Zhe Wang, Zhiwu Lu, Jingdong Wang, Ping Luo
ICLR 2022
- Compressed Video Contrastive Learning
- Yuqi Huo*, Mingyu Ding*, Haoyu Lu, Nanyi Fei, Zhiwu Lu, Ji-Rong Wen, Ping Luo
NeurIPS 2021
- Dynamic Visual Reasoning by Learning Differentiable Physics Models from Video and Language
- Mingyu Ding, Zhenfang Chen, Tao Du, Ping Luo, Joshua B. Tenenbaum, Chuang Gan
NeurIPS 2021
- Self-Supervised Video Representation Learning with Constrained Spatiotemporal Jigsaw
- Yuqi Huo, Mingyu Ding, Haoyu Lu, Ziyuan Huang, Mingqian Tang, Zhiwu Lu, Tao Xiang
IJCAI 2021
- HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers
- Mingyu Ding, Xiaochen Lian, Linjie Yang, Peng Wang, Xiaojie Jin, Zhiwu Lu, Ping Luo
CVPR 2021
(Oral)
- L2M-GAN: Learning to Manipulate Latent Space Semantics for Facial Attribute Editing
- Guoxing Yang, Nanyi Fei, Mingyu Ding, Guangzhen Liu, Zhiwu Lu, Tao Xiang
CVPR 2021
(Oral)
- PolarMask++: Enhanced Polar Representation for Single-Shot Instance Segmentation and Beyond
- Enze Xie, Wenhai Wang, Mingyu Ding, Ruimao Zhang, Ping Luo
TPAMI 2021
- IEPT: Instance-Level and Episode-Level Pretext Tasks for Few-Shot Learning
- Manli Zhang, Jianhong Zhang, Zhiwu Lu, Tao Xiang, Mingyu Ding, Songfang Huang
ICLR 2021
- A Global Occlusion-Aware Approach to Self-Supervised Monocular Visual Odometry
- Yao Lu*, Xiaoli Xu*, Mingyu Ding*, Zhiwu Lu, Tao Xiang
AAAI 2021
- Pyramid Multi-view Stereo Net with Self-adaptive View Aggregation
- Hongwei Yi, Zizhuang Wei, Mingyu Ding, Runze Zhang, Yisong Chen, Guoping Wang, Yu-Wing Tai
ECCV 2020
- Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking
- Jianfeng Yan, Zizhuang Wei, Hongwei Yi, Mingyu Ding, Runze Zhang, Yisong Chen, Guoping Wang, Yu-Wing Tai
ECCV 2020
(Spotlight)
- Segmenting Transparent Objects in the Wild
- Enze Xie, Wenjia Wang, Wenhai Wang, Mingyu Ding, Chunhua Shen, Ping Luo
ECCV 2020
- Learning Depth-Guided Convolutions for Monocular 3D Object Detection
- Mingyu Ding, Yuqi Huo, Hongwei Yi, Zhe Wang, Jianping Shi, Zhiwu Lu, Ping Luo
CVPR 2020
- SegVoxelNet: Exploring Semantic Context and Depth-aware Features for 3D Vehicle Detection from Point Cloud
- Hongwei Yi, Shaoshuai Shi, Mingyu Ding, Jiankai Sun, Kui Xu, Zhe Wang, Sheng Li, Guoping Wang
ICRA 2020
- Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow
- Mingyu Ding, Zhe Wang, Bolei Zhou, Jianping Shi, Zhiwu Lu, Ping Luo
AAAI 2020
- CamNet: Coarse-to-Fine Retrieval for Camera Re-localization
- Mingyu Ding, Zhe Wang, Jiankai Sun, Jianping Shi, Ping Luo
ICCV 2019
- Face-Focused Cross-Stream Network for Deception Detection in Videos
- Mingyu Ding*, An Zhao*, Zhiwu Lu, Tao Xiang, Ji-Rong Wen
CVPR 2019
- Zero-Shot Learning with Superclasses
- Yuqi Huo, Mingyu Ding, An Zhao, Jun Hu, Ji-Rong Wen, Zhiwu Lu
ICONIP 2018
(Best Student Paper Runner-up)
- Domain-Invariant Projection Learning for Zero-Shot Recognition
- An Zhao*, Mingyu Ding*, Jiechao Guan*, Zhiwu Lu, Tao Xiang, Ji-Rong Wen
NeurIPS 2018