Zhuo Xu

I'm a Research Scientist at Google DeepMind, where I work on robotics and foundation models. I received my Ph.D. from UC Berkeley, advised by Prof. Masayoshi Tomizuka, and my B.E. from Tsinghua University.

Email     Google Scholar     LinkedIn     Twitter

News

[Feb 2025] I am serving as an Area Chair for CoRL 2025; consider submitting your work there this April.
[Dec 2024] I am organizing the 1st What Bimanuals Can Do (WBCD) Competition at ICRA 2025.
[Dec 2024] I am co-organizing the 2nd Earth Rover Challenge (ERC) Competition at ICRA 2025.
[Oct 2024] We successfully held the 1st Earth Rover Challenge (ERC) at IROS 2024; check out our report here.
[Jul 2024] Mobility VLA was covered by tech media outlets including Medium, WIRED, TechCrunch, and 36Kr (CN).

Selected Publications

Autonomous Mario Kart in the Wild: Lessons Learned from The Earth Rover Challenge at IROS 2024

Xuesu Xiao, Jie Tan, Michael Cho, David Hsu, Dhruv Shah, Joanne Truong, Ted Xiao, Naoki Yokoyama, Wenhao Yu, Tingnan Zhang, Zhuo Xu, Santiago Pravisani, Niresh Dravin, Mohammad Alshamsi, Yeon-Kyu Lee, Jung-Tak Kim, Seung-Woo Seo, Joel Loo, Zishuo Wang, Nielsen Cugito, Yuwei Zeng, Tianle Shen, Arthur Zhang, Zichao Hu, Dongmyeong Lee, Taijing Chen, Michael Munje, Luisa Mao, Hochul Hwang, Peter Stone, Joydeep Biswas
Preprint

Imagined Potential Games: A Framework for Simulating, Learning and Evaluating Interactive Behaviors

Lingfeng Sun, Yixiao Wang, Pin-Yun Hung, Changhao Wang, Xiang Zhang, Zhuo Xu, Masayoshi Tomizuka
Preprint

Vision Language Models are In-Context Value Learners

Yecheng Jason Ma, Joey Hejna, Ayzaan Wahid, Chuyuan Fu, Dhruv Shah, Jacky Liang, Zhuo Xu, Sean Kirmani, Peng Xu, Danny Driess, Ted Xiao, Jonathan Tompson, Osbert Bastani, Dinesh Jayaraman, Wenhao Yu, Tingnan Zhang, Dorsa Sadigh, Fei Xia
International Conference on Learning Representations (ICLR) 2025

Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs

Hao-Tien Lewis Chiang*, Zhuo Xu*, Zipeng Fu*, Mithun George Jacob, Tingnan Zhang, Tsang-Wei Edward Lee, Wenhao Yu, Connor Schenck, David Rendleman, Dhruv Shah, Fei Xia, Jasmine Hsu, Jonathan Hoech, Pete Florence, Sean Kirmani, Sumeet Singh, Vikas Sindhwani, Carolina Parada*, Chelsea Finn*, Peng Xu*, Sergey Levine*, Jie Tan*
Conference on Robot Learning (CoRL) 2024

PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs

Soroush Nasiriany, Fei Xia, Wenhao Yu, Ted Xiao, Jacky Liang, Ishita Dasgupta, Annie Xie, Danny Driess, Ayzaan Wahid, Zhuo Xu, Quan Vuong, Tingnan Zhang, Tsang-Wei Edward Lee, Kuang-Huei Lee, Peng Xu, Sean Kirmani, Yuke Zhu, Andy Zeng, Karol Hausman, Nicolas Heess, Chelsea Finn, Sergey Levine, Brian Ichter
International Conference on Machine Learning (ICML) 2024

SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

Boyuan Chen*, Zhuo Xu*, Sean Kirmani, Brian Ichter, Danny Driess, Pete Florence, Dorsa Sadigh, Leonidas Guibas, Fei Xia
Conference on Computer Vision and Pattern Recognition (CVPR) 2024

Multi-Agent Trajectory Generation with Diverse Contexts

Zhuo Xu, Rui Zhou, Yida Yin, Huidong Gao, Masayoshi Tomizuka, Jiachen Li
International Conference on Robotics and Automation (ICRA) 2024

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Open X-Embodiment Collaboration led by Google DeepMind
International Conference on Robotics and Automation (ICRA) 2024
★ Best Paper Award ★

AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents

Michael Ahn, Debidatta Dwibedi, Chelsea Finn, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Karol Hausman, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Sean Kirmani, Isabel Leal, Edward Lee, Sergey Levine, Yao Lu, Sharath Maddineni, Kanishka Rao, Dorsa Sadigh, Pannag Sanketi, Pierre Sermanet, Quan Vuong, Stefan Welker, Fei Xia, Ted Xiao, Peng Xu, Steve Xu, Zhuo Xu
Google DeepMind Technical Report 2024

Distributed Multi-agent Interaction Generation with Imagined Potential Games

Lingfeng Sun, Pin-Yun Hung, Changhao Wang, Masayoshi Tomizuka, Zhuo Xu
American Control Conference (ACC) 2024

Generative Expressive Robot Behaviors using Large Language Models

Karthik Mahadevan, Jonathan Chien, Noah Brown, Zhuo Xu, Carolina Parada, Fei Xia, Andy Zeng, Leila Takayama, Dorsa Sadigh
International Conference on Human-Robot Interaction (HRI) 2024
★ Best Paper Award ★

RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory Sketches

Jiayuan Gu, Sean Kirmani, Paul Wohlhart, Yao Lu, Montserrat Gonzalez Arenas, Kanishka Rao, Wenhao Yu, Chuyuan Fu, Keerthana Gopalakrishnan, Zhuo Xu, Priya Sundaresan, Peng Xu, Hao Su, Karol Hausman, Chelsea Finn, Quan Vuong, Ted Xiao
International Conference on Learning Representations (ICLR) 2024

Online Learning Based Mobile Robot Controller Adaptation for Slip Reduction

Huidong Gao, Rui Zhou, Masayoshi Tomizuka, Zhuo Xu
International Federation of Automatic Control (IFAC) World Congress 2023

Reinforcement Learning Based Online Parameter Adaptation for Model Predictive Tracking Control under Slippery Condition

Huidong Gao, Rui Zhou, Masayoshi Tomizuka, Zhuo Xu
American Control Conference (ACC) 2022

Grouptron: Dynamic Multi-Scale Graph Convolutional Networks for Group-Aware Dense Crowd Trajectory Forecasting

Rui Zhou, Hongyu Zhou, Huidong Gao, Masayoshi Tomizuka, Jiachen Li, Zhuo Xu
International Conference on Robotics and Automation (ICRA) 2022

COCOI: Contact-Aware Online Context Inference for Generalizable Non-Planar Pushing

Zhuo Xu, Wenhao Yu, Alexander Herzog, Wenlong Lu, Chuyuan Fu, Masayoshi Tomizuka, Yunfei Bai, C Karen Liu, Daniel Ho
International Conference on Intelligent Robots and Systems (IROS) 2021

RetinaGAN: An Object-Aware Approach to Sim-to-Real Transfer

Daniel Ho, Kanishka Rao, Zhuo Xu, Eric Jang, Mohi Khansari, Yunfei Bai
International Conference on Robotics and Automation (ICRA) 2021

Template credit to Andy and Jon.