Email: arpit15945 [at] gmail [dot] com

Professional Experience

Robotics Engineer, Compute Acceleration Group
2024-Present
Work on GPU-first reinforcement learning algorithms and accelerated machine learning inference
Rendering Research Internship, Intel
Summer 2023
Worked on speeding up physics-based rendering algorithms using reinforcement learning
AI and Robotics R&D Engineer, Nvidia
2018-2019
Worked on sim-to-real and reinforcement learning for robotic manipulation and legged locomotion

Research Publications

In-Vivo Skin 3-D Surface Reconstruction and Wrinkle Depth Estimation using Handheld High-Resolution Tactile Sensing
Advanced Healthcare Materials (Under Submission)
Abstract

Three-dimensional (3-D) skin surface reconstruction offers promise for objective and quantitative dermatological assessment, but no portable, high-resolution device has been validated and used for depth reconstruction across various body locations. We present a compact 3-D skin reconstruction probe based on GelSight tactile imaging with a custom elastic gel and a learning-based reconstruction algorithm for micron-level wrinkle height estimation. Our sensor, integrated into a handheld probe with force sensing for consistent contact, achieves a mean absolute error of 12.55 microns on wrinkle-like test objects. In a study with 15 participants without skin disorders, we provide the first validated wrinkle depth metrics across multiple body regions. We further demonstrate statistically significant reductions in wrinkle height at three locations following over-the-counter moisturizer application. Our work offers a validated tool for clinical and cosmetic skin analysis, with potential applications in diagnosis, treatment monitoring, and skincare efficacy evaluation.
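
The headline accuracy figure above is a mean absolute error over wrinkle-height estimates. Below is a minimal sketch of how such a metric can be computed from a predicted and a reference depth map; the array names, units, and masking step are illustrative assumptions, not the paper's code.

    import numpy as np

    def wrinkle_height_mae(pred_depth_um, ref_depth_um, valid_mask=None):
        """Mean absolute error (in microns) between a predicted and a
        reference depth map, optionally restricted to valid pixels."""
        pred = np.asarray(pred_depth_um, dtype=np.float64)
        ref = np.asarray(ref_depth_um, dtype=np.float64)
        if valid_mask is None:
            valid_mask = np.ones_like(pred, dtype=bool)
        return float(np.mean(np.abs(pred[valid_mask] - ref[valid_mask])))

    # Illustrative usage with synthetic 256x256 depth maps (in microns).
    ref = np.random.uniform(0.0, 200.0, size=(256, 256))
    pred = ref + np.random.normal(0.0, 15.0, size=ref.shape)
    print(f"MAE: {wrinkle_height_mae(pred, ref):.2f} um")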

PneuGelSight: Soft Robotic Vision-Based Proprioception and Tactile Sensing
International Journal of Robotics Research 2026
Abstract

Soft pneumatic robot manipulators are popular in industrial and human-interactive applications due to their compliance and flexibility. However, deploying them in real-world scenarios requires advanced sensing for tactile feedback and proprioception. Our work presents a novel vision-based approach for sensorizing soft robots. We demonstrate our approach on PneuGelSight, a pioneering pneumatic manipulator featuring high-resolution proprioception and tactile sensing via an embedded camera. To optimize the sensor's performance, we introduce a comprehensive pipeline that accurately simulates its optical and dynamic properties, facilitating zero-shot knowledge transfer from simulation to real-world applications. PneuGelSight and our sim-to-real pipeline provide a novel, easily implementable, and robust sensing methodology for soft robots, paving the way for the development of more advanced soft robots with enhanced sensory capabilities.

A modularized design approach for GelSight family of vision-based tactile sensors
International Journal of Robotics Research 2025
Abstract

The GelSight family of vision-based tactile sensors has proven to be effective for multiple robot perception and manipulation tasks. These sensors are based on an internal optical system and an embedded camera to capture the deformation of the soft sensor surface, inferring the high-resolution geometry of the objects in contact. However, customizing the sensors for different robot hands requires a tedious trial-and-error process to redesign the optical system. In this paper, we formulate the GelSight sensor design process as a systematic and objective-driven design problem and perform the design optimization with a physically accurate optical simulation. The method is based on modularizing and parameterizing the sensor’s optical components and designing four generalizable objective functions to evaluate the sensor. We implement the method with an interactive and easy-to-use toolbox called OptiSense Studio. With the toolbox, non-sensor experts can quickly optimize their sensor design in both forward and inverse ways following our predefined modules and steps. We demonstrate our system with four different GelSight sensors by quickly optimizing their initial design in simulation and transferring it to the real sensors.
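
A minimal sketch of the objective-driven design idea, assuming a toy parameterization of the optical modules and a placeholder objective; the actual modules, objective functions, and optimizer live in OptiSense Studio and are not reproduced here.

    import numpy as np
    from dataclasses import dataclass

    @dataclass
    class SensorDesign:
        led_angle_deg: float      # illumination module parameter (illustrative)
        gel_thickness_mm: float   # elastomer module parameter (illustrative)
        camera_fov_deg: float     # imaging module parameter (illustrative)

    def design_objective(d: SensorDesign) -> float:
        """Placeholder for the simulation-based objective functions; lower is better."""
        return (abs(d.led_angle_deg - 70) / 70
                + abs(d.gel_thickness_mm - 3) / 3
                + abs(d.camera_fov_deg - 90) / 90)

    # Simple random search over the parameterized design space.
    rng = np.random.default_rng(0)
    candidates = [SensorDesign(rng.uniform(30, 90), rng.uniform(1, 6), rng.uniform(60, 120))
                  for _ in range(200)]
    best = min(candidates, key=design_objective)
    print(best, round(design_objective(best), 3))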

Vision-based tactile sensor design using physically based rendering
Nature Machine Intelligence

Arpit Agarwal¹, Achu Wilson¹, Timothy Man¹, Edward Adelson³, Ioannis Gkioulekas¹, and Wenzhen Yuan²
Affiliations: 1 - Carnegie Mellon University, 2 - UIUC, 3 - MIT

Abstract

High-resolution tactile sensors are very helpful to robots for fine-grained perception and manipulation tasks, but designing those sensors is challenging. This is because the designs are based on the compact integration of multiple optical elements, and it is difficult to understand the correlation between the element arrangements and the sensor accuracy by trial and error. In this work, we introduce the first-ever digital design of vision-based tactile sensors using a physically accurate light simulator and machine intelligence. The framework modularizes the design process, parameterizes the sensor components, and contains an evaluation metric to quantify a sensor's performance. Using this metric, we quantify the effects of sensor shape, illumination setting, and sensing surface material on tactile sensor performance. For the first time, to our knowledge, the proposed optical simulation framework can replicate the tactile image of a real sensor prototype. Using our approach, we substantially improve the design of a fingertip GelSight sensor; the improved design performs approximately 5 times better than the previous state-of-the-art, human-expert design at real-world robotic detection of embossed text. Our simulation approach can be used with any vision-based tactile sensor to produce a physically accurate tactile image. Overall, our approach maps human intelligence to machine intelligence for the automatic design of sensorized soft robots and opens the door for tactile-driven dexterous manipulation.
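
To make the role of the evaluation metric concrete, here is a hedged sketch of the kind of factor sweep the abstract describes over sensor shape, illumination setting, and surface material; the factor values and the scoring function are placeholders, not the paper's simulator or metric.

    import itertools

    shapes = ["flat", "curved", "fingertip"]
    illuminations = ["rgb_ring", "directional_trio"]
    materials = ["matte", "semi_specular"]

    def sensor_score(shape, illumination, material):
        """Placeholder score; in the paper this would come from the light
        simulator and the evaluation metric, not a lookup table."""
        s = {"flat": 0.6, "curved": 0.7, "fingertip": 0.8}[shape]
        s *= {"rgb_ring": 1.0, "directional_trio": 0.95}[illumination]
        s *= {"matte": 1.0, "semi_specular": 0.9}[material]
        return s

    # Enumerate every combination of the three design factors and report a score.
    for combo in itertools.product(shapes, illuminations, materials):
        print(combo, round(sensor_score(*combo), 3))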

Scalable, Simulation-Guided Compliant Tactile Finger Design
IEEE International Conference on Soft Robots (RoboSoft) 2024
Abstract

Compliant grippers enable robots to work with humans in unstructured environments. In general, these grippers benefit from tactile sensing, which estimates the state of surrounding objects for precise manipulation. However, co-designing compliant structures with high-resolution tactile sensing is a challenging task. We propose a simulation framework for the end-to-end forward design of GelSight Fin Ray sensors. Our simulation framework consists of mechanical simulation using the finite element method (FEM) and optical simulation including physically based rendering (PBR). To simulate the fluorescent paint used in these GelSight Fin Rays, we propose an efficient method that can be directly integrated into PBR. Using the simulation framework, we investigate design choices available in compliant grippers, namely gel pad shapes, illumination conditions, Fin Ray gripper sizes, and Fin Ray stiffness. This infrastructure enables faster design and prototyping of new Fin Ray sensors with sensing areas ranging from 48 mm x 18 mm to 70 mm x 35 mm. Over the chosen parameters, we optimize different Fin Ray designs and show their utility in grasping day-to-day objects.
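
A minimal sketch of the forward-design loop implied by the abstract: an FEM solve of the contact deformation feeds a physically based rendering step, repeated over design choices. Both stages and all parameter names below are placeholders standing in for the actual simulators.

    def simulate_deformation_fem(fin_ray_stiffness, gel_pad_shape, contact_force):
        """Placeholder FEM stage: returns a (toy) deformed-surface description."""
        return {"shape": gel_pad_shape, "max_deflection_mm": contact_force / fin_ray_stiffness}

    def render_tactile_image_pbr(deformed_surface, illumination):
        """Placeholder PBR stage (the real pipeline also models fluorescent paint)."""
        return {"illumination": illumination, "surface": deformed_surface}

    designs = [
        {"stiffness": 5.0, "gel_pad_shape": "flat", "illumination": "uv_blue"},
        {"stiffness": 2.5, "gel_pad_shape": "curved", "illumination": "uv_blue"},
    ]
    for d in designs:
        surface = simulate_deformation_fem(d["stiffness"], d["gel_pad_shape"], contact_force=10.0)
        image = render_tactile_image_pbr(surface, d["illumination"])
        print(d["gel_pad_shape"], "->", image["surface"]["max_deflection_mm"], "mm deflection")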

Robotic Defect Inspection with Visual and Tactile Perception for Large-scale Components
IEEE International Conference on Intelligent Robots and Systems (IROS) 2023
Abstract

In manufacturing processes, surface inspection is a key requirement for quality assessment and damage localization. As a result, automated surface anomaly detection has become a promising area of research in various industrial inspection systems. A particular challenge in industries with large-scale components, such as aircraft and heavy machinery, is inspecting large parts with very small defect dimensions; moreover, these parts often have curved shapes. To address this challenge, we present a 2-stage multi-modal inspection pipeline with visual and tactile sensing. Our approach combines the best of both modalities by identifying and localizing defects from a global view (vision) and then scanning the localized areas with tactile sensing to identify remaining defects. To benchmark our approach, we propose a novel real-world dataset with multiple metallic defect types per image, collected in production environments on real aerospace manufacturing parts, as well as online robot experiments in two environments. Our approach identifies 85% of defects using Stage I and 100% of defects after Stage II.
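
The two-stage logic reduces to: detect candidate regions in a global camera view, then scan only those regions with the tactile sensor. A minimal sketch under that assumption follows; the detector, scanner, and region format are illustrative placeholders.

    def detect_defects_vision(global_image):
        """Stage I (vision): return coarse candidate regions as (x, y, w, h)."""
        return [(120, 40, 32, 32), (400, 210, 32, 32)]

    def scan_region_tactile(region):
        """Stage II (tactile): high-resolution scan over one localized region."""
        return {"region": region, "defect_confirmed": True}

    def inspect_part(global_image):
        regions = detect_defects_vision(global_image)      # global view
        return [scan_region_tactile(r) for r in regions]   # local tactile pass

    print(inspect_part(global_image=None))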

Simulation-driven vision-based tactile sensor design
IEEE International Conference on Computational Photography (ICCP) 2022
Abstract

We leverage physics-based rendering techniques to simulate the light transport process inside curved vision-based tactile sensor designs. We use physically grounded models of light and material in our simulation, along with lightweight and fast calibration methods for fitting the analytical light and material models to our prototype. Given a calibrated simulation framework, we propose a tactile sensor shape optimization pipeline. Toward this goal, we propose a low-dimensional tactile sensor shape parameterization and automatically generate the full sensor prototype and an indenter surface virtually, which allows us to validate sensor performance across the sensor surface. Our main technical results include a) accurately matching RGB images between simulation and a physical prototype, b) generating improved tactile sensor shapes, and c) characterizing the design parameter space using appropriate light piping metrics. Our physically accurate simulation framework offers the ability to generate accurate RGB images for arbitrary vision-based tactile sensors. The parameter space exploration gives us high-level guidelines on the design of tactile sensors for specific applications. Lastly, our system allows us to characterize various tactile sensor designs in terms of their 3D shape reconstruction ability on different parts of the sensor surface.
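
As one purely illustrative example of a low-dimensional shape parameterization, a curved sensing surface can be generated from a couple of scalar parameters and swept to characterize the design space; the parameterization and the summary statistic below are assumptions, not the paper's.

    import numpy as np

    def sensor_surface(radius_mm, curvature, n=64):
        """Height field z(x, y) of a curved gel surface over a square patch."""
        x = np.linspace(-radius_mm, radius_mm, n)
        xx, yy = np.meshgrid(x, x)
        rr = np.clip(np.sqrt(xx**2 + yy**2), 0.0, radius_mm)
        return curvature * (radius_mm**2 - rr**2)

    # Sweep the two parameters; the printed peak height stands in for the
    # light-piping metrics used to characterize designs in the paper.
    for radius in (8.0, 12.0):
        for curvature in (0.01, 0.03):
            z = sensor_surface(radius, curvature)
            print(radius, curvature, round(float(z.max()), 3))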

Simulation of Vision-based Tactile Sensors using Physics based Rendering
IEEE International Conference on Robotics and Automation (ICRA) 2021
Abstract

Tactile sensing has seen rapid adoption with the advent of vision-based tactile sensors. Vision-based tactile sensors are compact and inexpensive and provide high-resolution data for precise in-hand manipulation and human-robot interaction. However, simulating tactile sensors is still a challenge. In this paper, we build the first fully general optical tactile simulation system for a GelSight sensor using physics-based rendering techniques. We propose physically accurate light models and present an in-depth analysis of the individual components of our simulation pipeline. Our system outperforms previous simulation techniques both qualitatively and quantitatively on image similarity metrics.
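
A minimal sketch of the kind of image-similarity comparison mentioned above, computing MSE and PSNR between a simulated and a real tactile image; the exact metrics and data used in the paper may differ.

    import numpy as np

    def mse(sim_img, real_img):
        """Mean squared error between two float images in [0, 1]."""
        return float(np.mean((sim_img - real_img) ** 2))

    def psnr(sim_img, real_img, max_val=1.0):
        """Peak signal-to-noise ratio in decibels."""
        err = mse(sim_img, real_img)
        return float("inf") if err == 0 else 10.0 * np.log10(max_val**2 / err)

    # Illustrative usage with synthetic images standing in for real/simulated captures.
    real = np.random.default_rng(0).uniform(0, 1, size=(480, 640, 3))
    sim = np.clip(real + np.random.default_rng(1).normal(0, 0.02, real.shape), 0, 1)
    print(f"MSE={mse(sim, real):.5f}  PSNR={psnr(sim, real):.2f} dB")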

Improving Grasp Stability with Rotation Measurement from Tactile Sensing
IEEE International Conference on Intelligent Robots and Systems (IROS) 2021
Abstract

Rotational displacement about the grasping point is a common grasp failure when an object is grasped at a location away from its center of gravity. Tactile sensors with soft surfaces, such as GelSight sensors, can detect the rotation patterns on the contacting surfaces when the object rotates. In this work, we propose a model-based algorithm that detects those rotational patterns and measures rotational displacement using the GelSight sensor. We also integrate the rotation detection feedback into a closed-loop regrasping framework, which detects rotational grasp failure at an early stage and drives the robot to a stable grasp pose. We validate our proposed rotation detection algorithm and grasp-regrasp system on a self-collected dataset and in online experiments to show how our approach accurately detects rotation and increases grasp stability.
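
A minimal sketch of the closed-loop idea: estimate the in-plane rotation from tracked marker displacements on the gel and trigger a regrasp when it exceeds a threshold. The rigid 2-D fit, threshold, and marker format are illustrative assumptions, not the paper's model-based algorithm.

    import numpy as np

    def estimate_rotation_deg(markers_before, markers_after):
        """Least-squares rotation angle between two sets of 2-D marker positions."""
        a = markers_before - markers_before.mean(axis=0)
        b = markers_after - markers_after.mean(axis=0)
        angle = np.arctan2(np.sum(a[:, 0] * b[:, 1] - a[:, 1] * b[:, 0]),
                           np.sum(a[:, 0] * b[:, 0] + a[:, 1] * b[:, 1]))
        return float(np.degrees(angle))

    def regrasp_if_rotating(markers_before, markers_after, threshold_deg=2.0):
        theta = estimate_rotation_deg(markers_before, markers_after)
        return "regrasp" if abs(theta) > threshold_deg else "hold"

    # Illustrative usage: markers rotated by 5 degrees should trigger a regrasp.
    pts = np.random.default_rng(0).uniform(-1, 1, size=(30, 2))
    rot = np.deg2rad(5.0)
    R = np.array([[np.cos(rot), -np.sin(rot)], [np.sin(rot), np.cos(rot)]])
    print(regrasp_if_rotating(pts, pts @ R.T))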

Model learning for look-ahead exploration in continuous control
AAAI Conference on Artificial Intelligence (AAAI), 2019
Abstract

We propose an exploration method that incorporates look-ahead search over basic learnt skills and their dynamics, and use it for reinforcement learning (RL) of manipulation policies. Our skills are multi-goal policies learned in isolation in simpler environments using existing multi-goal RL formulations, analogous to options or macro-actions. Coarse skill dynamics, i.e., the state transition caused by a (complete) skill execution, are learnt and are unrolled forward during look-ahead search. Policy search benefits from temporal abstraction during exploration, though the policy itself operates over low-level primitive actions, and thus the resulting policies do not suffer from the suboptimality and inflexibility caused by coarse skill chaining. We show that the proposed exploration strategy results in effective learning of complex manipulation policies faster than current state-of-the-art RL methods, and converges to better policies than methods that use options or parameterized skills as building blocks of the policy itself, as opposed to guiding exploration.
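
A minimal sketch of look-ahead search over coarse skill dynamics in the spirit of the abstract: each skill has a one-step state-transition model, and short skill sequences are unrolled to pick the one whose predicted end state is closest to the goal. The skills and their dynamics here are hand-coded stand-ins for learned models.

    import itertools
    import numpy as np

    # Toy coarse dynamics: each skill maps a 2-D state to its predicted end state.
    skill_models = {
        "reach":  lambda s: s + np.array([0.5, 0.0]),
        "push_x": lambda s: s + np.array([0.2, 0.0]),
        "push_y": lambda s: s + np.array([0.0, 0.2]),
    }

    def lookahead_best_sequence(state, goal, horizon=3):
        best_seq, best_dist = None, np.inf
        for seq in itertools.product(skill_models, repeat=horizon):
            s = state.copy()
            for name in seq:
                s = skill_models[name](s)          # unroll coarse skill dynamics
            dist = np.linalg.norm(s - goal)
            if dist < best_dist:
                best_seq, best_dist = seq, dist
        return best_seq, best_dist

    # The selected skill sequence would then guide exploration for the low-level policy.
    print(lookahead_best_sequence(np.zeros(2), np.array([0.9, 0.4])))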

Grasp Stability Prediction with Sim-to-Real Transfer from Tactile Sensing
IEEE International Conference on Intelligent Robots and Systems (IROS) 2022
Abstract

Robot simulation has been an essential tool for data-driven manipulation tasks. However, most existing simulation frameworks lack either efficient and accurate models of physical interactions with tactile sensors or realistic tactile simulation. This makes the sim-to-real transfer for tactile-based manipulation tasks still challenging. In this work, we integrate simulation of robot dynamics and vision-based tactile sensors by modeling the physics of contact. This contact model uses simulated contact forces at the robot's end-effector to inform the generation of realistic tactile outputs. To eliminate the sim-to-real transfer gap, we calibrate our physics simulator of robot dynamics, contact model, and tactile optical simulator with real-world data, and then we demonstrate the effectiveness of our system on a zero-shot sim-to-real grasp stability prediction task where we achieve an average accuracy of 90.7% on various objects. Experiments reveal the potential of applying our simulation framework to more complicated manipulation tasks. We open-source our simulation framework at https://github.com/CMURoboTouch/Taxim/tree/taxim-robot
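
A minimal sketch of the data flow described above: a simulated contact force at the end-effector drives a tactile-image generator, and a classifier predicts grasp stability from that image. All three components below are toy placeholders; the released Taxim-based framework implements the real ones.

    import numpy as np

    def tactile_image_from_contact(force_n, contact_center=(32, 32), size=64):
        """Toy tactile 'image': a Gaussian bump whose amplitude scales with force."""
        y, x = np.mgrid[0:size, 0:size]
        r2 = (x - contact_center[0])**2 + (y - contact_center[1])**2
        return force_n * np.exp(-r2 / (2 * 8.0**2))

    def predict_grasp_stable(tactile_img, threshold=0.03):
        """Stand-in classifier: stable if enough contact signal is present."""
        return float(tactile_img.mean()) > threshold

    img = tactile_image_from_contact(force_n=5.0)
    print("stable" if predict_grasp_stable(img) else "unstable")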

Deep Reinforcement Learning with Skill Library: Exploring with Temporal Abstractions and Coarse Approximate Dynamics Models
Technical Report CMU-RI-TR-18-31, July 2018
Abstract

Reinforcement learning is a computational approach to learning from interaction. However, learning from scratch with reinforcement learning requires an exorbitant number of interactions with the environment, even for simple tasks. One way to alleviate this problem is to reuse previously learned skills, as humans do. This thesis provides frameworks and algorithms to build and reuse a Skill Library. First, we extend the Parameterized Action Space formulation using our Skill Library to the multi-goal setting and show improvements in learning using hindsight at a coarse level. Second, we use our Skill Library to explore at a coarser level to learn the optimal policy for continuous control. We demonstrate the benefits, in terms of speed and accuracy, of the proposed approaches on a set of complex real-world robotic manipulation tasks in which some state-of-the-art methods completely fail.

Reinforcement Learning of Active Vision for Manipulating Objects under Occlusions
Conference on Robot Learning (CoRL), 2019
Abstract

We consider artificial agents that learn to jointly control their gripper and camera in order to reinforcement learn manipulation policies in the presence of occlusions from distractor objects. Distractors often occlude the object of interest and cause it to disappear from the field of view. We propose hand/eye controllers that learn to move the camera to keep the object within the field of view and visible, in coordination with manipulating it to achieve the desired goal, e.g., pushing it to a target location. We incorporate structural biases of object-centric attention within our actor-critic architectures, which our experiments suggest are key to good performance. Our results further highlight the importance of curriculum with regard to environment difficulty. The resulting active vision/manipulation policies outperform static camera setups for a variety of cluttered environments.
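
A minimal sketch of the hand/eye idea: the agent outputs a gripper action and a camera action, and the camera moves to keep the object of interest in the field of view. The visibility test and the proportional camera controller below are illustrative placeholders, not the learned controllers from the paper.

    import numpy as np

    def object_in_view(object_px, image_size=(84, 84), margin=10):
        """True if the object's pixel location stays away from the image border."""
        x, y = object_px
        return margin <= x <= image_size[0] - margin and margin <= y <= image_size[1] - margin

    def camera_action_keep_in_view(object_px, image_size=(84, 84)):
        """Pan toward the object when it drifts toward the image border."""
        cx, cy = image_size[0] / 2, image_size[1] / 2
        dx, dy = object_px[0] - cx, object_px[1] - cy
        return np.clip(np.array([dx, dy]) / cx, -1.0, 1.0)

    def joint_action(gripper_action, object_px):
        cam = (np.zeros(2) if object_in_view(object_px)
               else camera_action_keep_in_view(object_px))
        return {"gripper": gripper_action, "camera": cam}

    print(joint_action(np.array([0.1, -0.2, 0.0]), object_px=(80, 5)))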

Other Projects

Position-Free Monte Carlo for Layered Material Rendering
Physics-Based Rendering, CMU, Spring 2021
Exploration with Expert Policy Advice
Technical Report, CMU 2018

Software

IES Light Profile Viewer