Building and Working in Environments for Embodied AI

A CVPR 2022 Tutorial, June 20

Introduction

In recent years, there has been growing interest in embodied AI research within computer vision. Multiple embodied AI workshops and challenges have taken place in the research community, including Generalizable Policy Learning in the Physical World at ICLR 2022, OCRTOC: Open Cloud Robot Table Organization Challenge at IROS 2020, Habitat: Embodied Agents Challenge and Workshop at CVPR 2019, and the Embodied AI Workshop at CVPR 2020 and 2021. Computer vision is now an essential module in embodied AI research, but the field still lacks a basic tutorial that helps researchers, especially those from vision and machine learning backgrounds, get started.

In particular, much of the impressive progress in embodied AI has been made in virtual environments, which are powered by recent advances in physical simulation and rendering technologies. These platforms allow the study of many vision-robotics problems that previously could not be studied at scale in the real world. Faster speed, easier parallelization, simpler data collection, and lower cost allow embodied AI research in simulation to build larger communities, with diverse researcher backgrounds, improved code sharing, and standard benchmarks. However, virtual environments come with their own issues, such as simulation parameters and domain gaps, which are worth noting when building and using them.

Our tutorial aims to provide a getting-started guide for computer vision researchers who want to study vision problems on embodied agents in these environments, and to highlight common issues encountered when using them. The tutorial will focus on the principles shared across platforms and teach concepts using multiple simulation environments.

Syllabus

The course will cover the following units:

Overview of Embodied AI

Embodied AI involves a wide range of topics. Here we provide a broad overview of the embodied AI field, including the following components.

The Basic Frameworks and techniques for Embodied AI

The details of how the visual system and the control and actuation system are connected are often unclear to researchers in the vision community. Here we introduce common frameworks for composing a system with both components.
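
As a rough illustration of how a visual module and a control module can be composed, the sketch below wires them into a simple perceive-then-act loop. It is a toy example written for this page, not the tutorial's code: `ToyEnv`, `perceive`, `control`, and `run_episode` are all hypothetical names, and the one-dimensional "image" stands in for a real rendered observation.

```python
import numpy as np


class ToyEnv:
    """Hypothetical 1D world: agent and target sit on a line of `width` pixels."""

    def __init__(self, width=32, agent=3, target=25):
        self.width, self.agent, self.target = width, agent, target

    def render(self):
        # Two-channel "image": channel 0 marks the agent, channel 1 the target.
        img = np.zeros((2, self.width), dtype=np.float32)
        img[0, self.agent] = 1.0
        img[1, self.target] = 1.0
        return img

    def step(self, action):
        # The actuator moves at most one pixel per step and stays in bounds.
        move = int(np.clip(action, -1, 1))
        self.agent = int(np.clip(self.agent + move, 0, self.width - 1))
        return self.render()


def perceive(img):
    """Vision module: recover agent and target positions from the image."""
    return int(np.argmax(img[0])), int(np.argmax(img[1]))


def control(agent_pos, target_pos):
    """Control module: command one pixel of motion toward the target."""
    return int(np.sign(target_pos - agent_pos))


def run_episode(env, max_steps=100):
    """Alternate perception and control until the target is reached."""
    obs = env.render()
    for t in range(max_steps):
        agent_pos, target_pos = perceive(obs)
        if agent_pos == target_pos:
            return t  # number of control steps taken
        obs = env.step(control(agent_pos, target_pos))
    return None  # target not reached within max_steps


if __name__ == "__main__":
    print(run_episode(ToyEnv()))
```

The point of the structure is that the controller never reads the simulator state directly; it only sees what the vision module extracts from the observation, which is where perception errors would enter a real system.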

When vision researchers use embodied AI environments, they also need a basic knowledge of how the simulator works. This allows them to understand the capabilities and limitations of the simulation, so that they can make full use of these environments and ensure the simulation is correct. We provide a summary of the key parameters, and guidance on how to debug issues independently.
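
To make the role of one such parameter concrete, the sketch below shows how the simulation timestep alone can decide whether a stiff contact-like spring behaves or blows up. It is a minimal, self-contained example under stated assumptions: the semi-implicit Euler integrator, the stiffness value, and the function name `simulate_spring` are chosen for illustration and are not taken from any particular simulator.

```python
def simulate_spring(dt, stiffness=1.0e4, mass=1.0, duration=1.0):
    """Integrate an undamped mass-spring system with semi-implicit Euler.

    Returns the largest |position| seen, starting from x = 1, v = 0.
    All parameter values are illustrative assumptions.
    """
    x, v = 1.0, 0.0
    max_abs_x = abs(x)
    for _ in range(int(round(duration / dt))):
        v += dt * (-stiffness / mass) * x  # velocity update from spring force
        x += dt * v                        # position update with new velocity
        max_abs_x = max(max_abs_x, abs(x))
    return max_abs_x


if __name__ == "__main__":
    # Natural frequency is sqrt(stiffness / mass) = 100 rad/s here.
    print("dt = 0.001:", simulate_spring(0.001))  # stays close to 1
    print("dt = 0.05: ", simulate_spring(0.05))   # grows without bound
```

For this integrator the motion stays bounded only while the timestep times the natural frequency is below 2, so a stiffer object or a larger timestep can turn a correct scene into an exploding one without any change to the scene itself.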

Design Choices in Modern Embodied AI Environments

Building an embodied AI environment takes much more than knowing the underlying simulation technologies. To study vision problems under useful setups and at the proper level of abstraction, we introduce the common design choices. We will also explain the choices made in common embodied AI challenges so that attendees can quickly start working on them.

Experiences and Practices to Debug Simulators

Virtual environments are not perfect, and vision researchers who are new to them often have difficulty using them correctly. Our team has gained extensive experience from the feedback of the user community of SAPIEN, a widely used simulator that supports the ManiSkill Embodied AI Challenge. We will share this experience.

Real World Robotics and Sim2Real

Sim2Real is a topic that users of simulation environments ask about very often. In this section, we demonstrate through case studies how sim2real domain gaps arise in vision and robot control, and share our experience deploying policies trained in simulation to the real world.

Embodied AI Tasks in ManiSkill and Visual Learning Challenges

Goal: we will summarize our findings through hosting the ManiSkill challenge, including

Material

Section | Slides | Video
Overview of Embodied AI | PDF, Google Slides | YouTube
The Basic Frameworks and techniques for Embodied AI | PDF, Google Slides | YouTube
Design Choices in Embodied AI Environments | PDF, Google Slides | YouTube
Experience and Practices to Debug Simulators | PDF, Google Slides | YouTube
Real World Robotics and Sim2Real | PDF, Google Slides | YouTube
Embodied AI Tasks in ManiSkill and Visual Learning Challenges | PDF, Google Slides | YouTube
Code: https://github.com/haosulab/cvpr-tutorial-2022

Schedule

Start | End | Section | Speaker
13:00 | 13:45 | Overview of Embodied AI | Zhiwei Jia (video)
13:45 | 14:30 | The Basic Frameworks and techniques for Embodied AI | Fanbo Xiang (in person)
14:30 | 15:15 | Design Choices in Embodied AI Environments | Jiayuan Gu (video)
15:15 | 15:30 | Break |
15:30 | 16:15 | Experience and Practices to Debug Simulators | Fanbo Xiang (in person)
16:15 | 16:35 | Real World Robotics and Sim2Real | Rui Chen (video)
16:35 | 17:00 | Embodied AI Tasks in ManiSkill and Visual Learning Challenges | Fanbo Xiang (in person)

Organizers and Speakers

listed alphabetically
