Please use this identifier to cite or link to this item: https://doi.org/10.21256/zhaw-20419
Publication type: Conference paper
Type of review: Peer review (publication)
Title: Combining reinforcement learning with supervised deep learning for neural active scene understanding
Authors: Roost, Dano
Meier, Ralph
Toffetti Carughi, Giovanni
Stadelmann, Thilo
et al.: No
DOI: 10.21256/zhaw-20419
Conference details: Active Vision and Perception in Human(-Robot) Collaboration Workshop at IEEE RO-MAN 2020 (AVHRC’20), online, 31 August - 4 September 2020
Issue Date: 31-Aug-2020
Publisher / Ed. Institution: University of Essex
Language: English
Subjects: Active Vision; Deep Learning; Reinforcement Learning; Neural Scene Understanding; Robotic Grasping; Computer Vision
Subject (DDC): 004: Computer science
Abstract: While vision in living beings is an active process where image acquisition and classification are intertwined to gradually refine perception, much of today’s computer vision is built on the inferior paradigm of episodic classification of i.i.d. samples. We aim at improved scene understanding for robots by taking the sequential nature of seeing over time into account. We present a supervised multi-task approach to answer questions about different aspects of a scene such as the relationship between objects, their quantity or their relative positions to the camera. For each question, we train a different output head which operates on input from one shared recurrent convolutional neural network that accumulates information over time steps. In parallel, we train an additional output head using reinforcement learning (RL) that uses the reduction in cumulative loss from the supervised heads as reward signal. It thereby learns to gradually improve the prediction confidence of e.g. partially occluded objects by moving the camera to a more favourable angle with respect to these objects. We present preliminary results on simulated RGB-D image sequences that show superior performance of our RL-based approach in answering questions more quickly and more accurately than using static or random camera movement.
Further description: Awarded with the Dr. Waldemar Jucker award 2020 of the GST
URI: https://digitalcollection.zhaw.ch/handle/11475/20419
Fulltext version: Accepted version
License (according to publishing contract): Licence according to publishing contract
Department: School of Engineering
Organisational Unit: Institute of Applied Information Technology (InIT)
Appears in Collections: Publikationen School of Engineering
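
The reward signal described in the abstract (the reduction in cumulative loss from the supervised heads) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that "cumulative loss" means the sum of the per-head losses at a given time step, and the head names (`relation`, `count`, `position`) and function names are hypothetical.

```python
from typing import Dict, List


def cumulative_loss(head_losses: Dict[str, float]) -> float:
    """Sum the losses of all supervised heads at one time step (assumed definition)."""
    return sum(head_losses.values())


def rewards_from_losses(losses_per_step: List[Dict[str, float]]) -> List[float]:
    """Reward for each camera move = drop in cumulative loss from one step to the next.

    A positive reward means the move made the supervised heads more accurate;
    a negative reward means the move made them worse.
    """
    totals = [cumulative_loss(step) for step in losses_per_step]
    return [prev - curr for prev, curr in zip(totals, totals[1:])]


# Hypothetical per-head losses over three time steps of one episode.
episode = [
    {"relation": 2.0, "count": 1.0, "position": 1.0},
    {"relation": 1.5, "count": 0.5, "position": 1.0},
    {"relation": 2.0, "count": 0.5, "position": 1.0},
]
rewards = rewards_from_losses(episode)
```

Under these assumptions, the first camera move is rewarded because the summed loss falls, and the second is penalised because it rises again.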

Files in This Item:
File: 2020_Roost_Combining_reinforcement_learning_with_supervised_deep_learning.pdf
Description: Accepted Version
Size: 1.52 MB
Format: Adobe PDF

