Robots have to face the real world, where trying something out might take seconds, hours, or even days. Unfortunately, current state-of-the-art learning algorithms (e.g., deep learning) rely on the availability of very large data sets. In this talk, we explore approaches that tackle the challenge of learning by trial and error on physical robots within a few minutes, a challenge we call "micro-data reinforcement learning". We will discuss methods that use a simulator to generate behavior repertoires, so that complex robots can quickly recover from unforeseen circumstances (e.g., damage or unfamiliar terrain) while completing their tasks and taking the environment into account. In particular, we will see how a damaged physical hexapod robot can recover most of its locomotion abilities in an environment with obstacles, without any human intervention, using this type of algorithm.

On a similar note, we will discuss how different representations of the policy function can enable faster and more reliable behavior learning. In more detail, we will see how encoding the policy as a trajectory can speed up learning in robotic manipulation tasks, while also giving us theoretical guarantees about the resulting behavior.

Next, we will discuss how model-based reinforcement learning (RL) algorithms can be adapted for use on real physical robots. In particular, we will discuss (1) methods that leverage multi-core CPUs to achieve fast computation times, and (2) how model-based RL methods can be "scaled" to high-dimensional robots. More concretely, we will showcase algorithms that are able to find high-performing walking policies for a damaged physical hexapod robot (48D state and 18D action space) in less than 1 minute of interaction time.

Next, we will present methods that aim to incorporate learning methods inside traditional control architectures.
In particular, we will present work on incorporating data-driven methods into QP-based controllers, and showcase how such hybrid approaches achieve robust tracking and performance on real-world robots and applications. Finally, we will discuss ongoing work towards autonomous skill discovery and learning in robotics applications.
About the speaker
Dr. Konstantinos Chatzilygeroudis received the Integrated Master degree (Engineering Diploma) in computer science and engineering from the University of Patras, Patras, Greece, in 2014, and the Ph.D. degree in robotics and machine learning from Inria Nancy - Grand Est and the University of Lorraine, Nancy, France, in 2018. From 2018 to 2020 he was a Postdoctoral Fellow with the LASA team at the Swiss Federal Institute of Technology in Lausanne (EPFL), Switzerland. He is a recipient of an H.F.R.I. Grant for Post-doctoral Fellows (2022-2024) and the Principal Investigator of the project "Novel Optimization Methods for Autonomous Skill Learning in Robotics", implemented within the Department of Mathematics, University of Patras, Greece. He has taught, and continues to teach, several undergraduate and postgraduate courses on Artificial Intelligence, Computer Science and Robotics at the University of Patras, and has also co-supervised several undergraduate and master theses.

He is currently serving as an Associate Co-Chair of the IEEE Technical Committee on Model-based Optimization for Robotics. He has served for several years as an Associate Editor for the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), and was part of the organizing committee of the Conference on Robot Learning (CoRL) 2021, as the chair responsible for the virtual part of the conference. His work has been published in top-tier journals and conferences in the fields of artificial intelligence, machine learning and robotics, and he has received a Best Paper Award at GECCO 2022. He has also collaborated actively with industrial partners: he was the Leader of the R&D Computer Vision Team at Metargus, a pre-seed-funded startup based in Patras, Greece, and the Lead Robotics Engineer at Ragdoll Dynamics, based in London, UK.
His research interests lie in the area of artificial intelligence, with a focus on reinforcement learning, fast robot adaptation, evolutionary computation, and autonomous skill discovery.