Push-to-See: Learning Non-Prehensile Manipulation to Enhance Instance Segmentation via Deep Q-Learning
Serhan, Baris; Pandya, Harit; Kucukyilmaz, Ayse; Neumann, Gerhard
Authors
Harit Pandya
Dr Ayse Kucukyilmaz (ayse.kucukyilmaz@nottingham.ac.uk)
Associate Professor
Gerhard Neumann
Abstract
Efficient robotic manipulation of objects for sorting and searching often relies upon how well the objects are perceived and on the available grasp poses. The challenge arises when the objects are irregular, have similar visual features (e.g., textureless objects), and the scene is densely cluttered. In such cases, non-prehensile manipulation (e.g., pushing) can facilitate grasping or searching by improving object perception and singulating the objects from the clutter via physical interaction. The current robotics literature on interactive segmentation focuses on isolated cases, where the central aim is to search for or singulate a single target object, or to segment sparsely cluttered scenes, mainly by matching visual features across successive scenes before and after the robotic interaction. In contrast, in this paper we introduce the first interactive segmentation model in the literature that can autonomously enhance the instance segmentation of such challenging scenes as a whole by optimising a Q-value function that predicts appropriate pushing actions for singulation. We achieved this by training a deep reinforcement learning model with reward signals generated by a Mask R-CNN trained solely on depth images. We evaluated our model in experiments by comparing its segmentation quality with a heuristic baseline, as well as the state-of-the-art Visual Pushing and Grasping (VPG) model [1]. Our model significantly outperformed both baselines in all benchmark scenarios. Furthermore, decreasing the segmentation error inherently enabled the autonomous singulation of the scene as a whole. Our evaluation experiments also serve as a benchmark for interactive segmentation research.
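The abstract outlines the core loop: a Q-value function ranks candidate pushes, and each push is rewarded by the improvement it brings to Mask R-CNN instance segmentation of the depth scene. As a rough illustration of that idea only, the minimal PyTorch sketch below shows one way such a reward and a per-pixel, per-direction Q-network could be wired together. The network architecture, the 16 assumed push directions, and the `segmentation_reward` form are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (not the authors' code): a DQN-style pushing policy whose
# reward is the change in instance-segmentation quality between scenes.
# PushQNetwork, segmentation_reward and the 16 push directions are assumptions.
import torch
import torch.nn as nn

NUM_ROTATIONS = 16  # assumed discretisation of push directions


class PushQNetwork(nn.Module):
    """Maps a depth heightmap to per-pixel Q-values for each push direction."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        # One Q-value map per push direction, same spatial size as the input.
        self.head = nn.Conv2d(32, NUM_ROTATIONS, 1)

    def forward(self, depth_heightmap: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(depth_heightmap))


def segmentation_reward(score_before: float, score_after: float) -> float:
    """Reward a push by how much the scene-level segmentation quality
    (e.g., a Mask R-CNN based score) improved; assumed form, for illustration."""
    return score_after - score_before


# Greedy action selection: the argmax over (direction, pixel) defines
# where in the heightmap to push and along which direction.
q_net = PushQNetwork()
heightmap = torch.rand(1, 1, 64, 64)  # placeholder depth heightmap
with torch.no_grad():
    q_values = q_net(heightmap)                    # shape (1, 16, 64, 64)
    best = torch.argmax(q_values.view(-1)).item()
    direction, pixel = divmod(best, 64 * 64)
    row, col = divmod(pixel, 64)
print(f"push at pixel ({row}, {col}) along direction {direction}")
```

In a training loop of this kind, `segmentation_reward` would be computed from the segmentation scores before and after executing the selected push, and used as the DQN reward signal; the sketch only shows the forward pass and action selection.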
Citation
Serhan, B., Pandya, H., Kucukyilmaz, A., & Neumann, G. (2022, May). Push-to-See: Learning Non-Prehensile Manipulation to Enhance Instance Segmentation via Deep Q-Learning. Presented at 2022 IEEE International Conference on Robotics and Automation (ICRA 2022), Philadelphia, USA
Field | Value |
---|---|
Presentation Conference Type | Edited Proceedings |
Conference Name | 2022 IEEE International Conference on Robotics and Automation (ICRA 2022) |
Start Date | May 23, 2022 |
End Date | May 27, 2022 |
Acceptance Date | Jan 31, 2022 |
Online Publication Date | May 27, 2022 |
Publication Date | Jul 12, 2022 |
Deposit Date | Mar 3, 2022 |
Publicly Available Date | May 27, 2022 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Pages | 1513-1519 |
Book Title | 2022 IEEE International Conference on Robotics and Automation (ICRA 2022) |
ISBN | 9781728196824 |
DOI | https://doi.org/10.1109/ICRA46639.2022.9811645 |
Public URL | https://nottingham-repository.worktribe.com/output/7534992 |
Publisher URL | https://ieeexplore.ieee.org/document/9811645 |
Related Public URLs | https://www.icra2022.org/ |
Additional Information | © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. |
Files
ICRA22 1009 VI Fi
(14.2 Mb)
Video
2022-Serhan-ICRA22-PushToSee
(6.3 Mb)
PDF
You might also like
A Taxonomy of Domestic Robot Failure Outcomes: Understanding the impact of failure on trustworthiness of domestic robots
(2024)
Presentation / Conference Contribution
LABERT: A Combination of Local Aggregation and Self-Supervised Speech Representation Learning for Detecting Informative Hidden Units in Low-Resource ASR Systems
(2023)
Presentation / Conference Contribution
TAS for Cats: An Artist-led Exploration of Trustworthy Autonomous Systems for Companion Animals
(2023)
Presentation / Conference Contribution
Somabotics Toolkit for Rapid Prototyping Human-Robot Interaction Experiences using Wearable Haptics
(2023)
Presentation / Conference Contribution