ManiSkill-ViTac: Vision-based-Tactile Manipulation Skill Learning Challenge 2024


The ManiSkill-ViTac Challenge 2024 has closed. The winners are listed below. The award ceremony was held at the 5th ViTac Workshop at ICRA 2024.

The challenge will be concluded at the 5th Annual Embodied AI Workshop at CVPR 2024.

For the upcoming ManiSkill-ViTac Challenge 2025, you can subscribe to the mailing list for news and notifications. You can also join the Discord to contact us.


Vision-based tactile sensing has made great progress in recent years. It can provide contact information that is important for manipulation skill learning. However, many research labs, particularly those lacking hardware expertise, find it difficult to build physical experimental platforms, which keeps them from concentrating their research efforts on manipulation skill learning.

The ManiSkill-ViTac Challenge aims to provide a standardized benchmarking platform for evaluating the performance of vision-based tactile manipulation skill learning in real-world robot applications. This challenge focuses specifically on assessing skill learning; therefore, to enable fair comparison between approaches, it uses a unified tactile sensor type and simulator across all participants.

To the best of our knowledge, this is the first open challenge in this field. We hope it brings together researchers worldwide from tactile sensing and policy learning, and helps nurture the development of advanced manipulation skill learning frameworks based on tactile sensing.

The ManiSkill-ViTac Challenge features:

Getting Started

To get started submitting to the challenge, follow these steps:

1. Register a team to make submissions by sending us an email in the format below. We recommend using education emails for registration. A team shall have no more than one leader and one instructor; the maximum number of additional team members is four.

Email format:

                Team Name: [Team Name]
                Team Leader: [Name] [Institute] [Email]
                (Optional) Instructor: [Name] [Institute] [Email]
                Team Members:
                1. [Name] [Institute] [Email (optional)]
                2. [Name] [Institute] [Email (optional)]
                3. [Name] [Institute] [Email (optional)]
                4. [Name] [Institute] [Email (optional)]

2. The codebase for this challenge is hosted on GitHub. The repository contains information about the competition environments and the submission format.

3. Check out the leaderboard and make your own submission.


Join our Discord to contact us. You may also reach us by email.


The challenge includes two tasks: peg insertion and lock opening. Similar to the ManiSkill challenge, we focus on evaluating the policy's generalizability. Each task includes multiple objects of different shapes (e.g., the peg can be a cuboid or a trapezoid, and keys can have two or three teeth), which are unknown during evaluation. The policy therefore needs to generalize to different pegs and keys.

Peg insertion
Lock opening

Challenge Format

The challenge consists of two phases: Phase 1 (simulation) and Phase 2 (real-world).

Phase 1

The simulation environments are built on our tactile sensor simulator, which uses the Finite Element Method (FEM) for physics simulation and Incremental Potential Contact (IPC) as the contact model.

Phase 2

The top N participants' algorithms will be evaluated on a physical robot. We use the GelSight Mini, a commercially available vision-based tactile sensor, in our challenge. The input and output formats of the real environment will be the same as in simulation.

Each team will have three chances to upload their code for evaluation. Full evaluation logs, including raw observations and ground-truth states, will be provided to the participants so that they can fine-tune their algorithms.


Participating entries will be evaluated on K initial states, each repeated T times. An episode is successful if the peg is in the hole (peg insertion) or all the pins in the lock are lifted (lock opening). On the real robot, a torque sensor monitors the gripper; if at any point the torque exceeds a threshold, the episode fails immediately. This yields an array of success rates, and we use each submission's average rank across the K initial states as the final metric for comparing submissions.
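The average-rank metric can be sketched as follows. This is a hypothetical illustration, not the official evaluation script: the variable names and the random episode data are ours, and ties are assumed to share the average of the tied ranks.

```python
import numpy as np
from scipy.stats import rankdata

K, T, N_SUBMISSIONS = 5, 3, 4          # K initial states, T repeats each
rng = np.random.default_rng(0)

# episodes[s, k, t] = 1 if submission s succeeded on repeat t of state k
episodes = rng.integers(0, 2, size=(N_SUBMISSIONS, K, T))

# Success rate of each submission on each initial state (averaged over T)
success_rates = episodes.mean(axis=2)  # shape (N_SUBMISSIONS, K)

# For each initial state, rank the submissions by success rate
# (rank 1 = best; ties get the average of the tied ranks)
ranks = np.stack(
    [rankdata(-success_rates[:, k], method="average") for k in range(K)],
    axis=1,
)

# Final metric: average rank over the K initial states (lower is better)
final_metric = ranks.mean(axis=1)
```

Ranking per initial state, rather than averaging raw success rates, keeps a single easy or hard state from dominating the comparison.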

For Phase 1, participants run the evaluation script on their own computers and submit the log files to the challenge committee.

We reserve the right to use additional metrics to choose winners in case of statistically insignificant differences.


1st Prize:
  • Team Illusion, Institute: GuangXi University

2nd Prize:
  • Team Luban, Institute: Zhejiang University
  • Team TouchSight Innovators, Institute: Tongji University, King's College London

3rd Prize:
  • Team SHT, Institute: ShanghaiTech University
  • Team Power *Star, Institute: Nanyang Technological University, I2R, A*STAR
  • Team GXU-ICMPE, Institute: GuangXi University
  • Team SSR-Tac, Institute: Tsinghua-Berkeley Shenzhen Institute, Tsinghua University

Schedule





    If you use our work in your research, please cite the following:

                  @article{chen_sim2real_tro,
                  author={Chen, Weihang and Xu, Jing and Xiang, Fanbo and Yuan, Xiaodi and Su, Hao and Chen, Rui},
                  journal={IEEE Transactions on Robotics},
                  title={General-Purpose Sim2Real Protocol for Learning Contact-Rich Manipulation With Marker-Based Visuotactile Sensors}
                  }