Reinforcement learning for End-to-End autonomous driving

From ISLAB/CAISR
Title Reinforcement learning for End-to-End autonomous driving
Summary By using image data as input, and the vehicle steering and acceleration as output control signals, investigate how machine learning can control a vehicle.
Keywords Machine Learning, Deep Learning, Automated driving
TimeFrame December 2016 - June 2017
References [1] http://vdrift.net
[2] http://torcs.sourceforge.net
[3] http://gym.openai.com
[4] http://vizdoom.cs.put.edu.pl
[5] http://www.arcadelearningenvironment.org

Prerequisites
Author
Supervisor Cristofer Englund, Alexey Voronov, Nicholas Wickström
Level Master
Status Open


Background As the development of autonomous vehicles becomes more widespread, demand grows for knowledge of enabling technologies such as machine learning. This project will develop knowledge in machine learning for End-to-End autonomous driving powered by reinforcement learning. Databases of real-traffic video captured by on-board sensors are currently available (e.g. from Udacity and comma.ai). This video data, however, is fixed: if a controller chooses an action that differs from the recorded one, the recording cannot show what would have happened next, so the action is hard to evaluate. Instead of using video data to train a system, we would like to investigate the use of a driving simulator, e.g. a racing simulator such as VDrift or TORCS [1, 2]. The captured video stream would be replaced by a rendered 3D image of the virtual environment (i.e. a game screenshot). By using the image data as input and the vehicle steering and acceleration as output control signals, it is possible to build a machine learning system capable of learning to control the vehicle so that it stays on the road, keeps within the speed limits, and possibly also follows traffic rules.
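To make the goal of staying on the road and keeping within the speed limit concrete, a reinforcement learning setup needs a reward signal. The following is only a minimal sketch of one possible per-step reward. The function name, its parameters, the constants and the assumption that the simulator reports lateral offset and speed are all hypothetical and are not taken from VDrift or TORCS.

```python
def driving_reward(lateral_offset_m, speed_mps, speed_limit_mps,
                   half_road_width_m=4.0, overspeed_penalty=1.0):
    """Hypothetical per-step reward for lane keeping under a speed limit."""
    # Leaving the road gives a fixed negative reward (value chosen arbitrarily).
    if abs(lateral_offset_m) > half_road_width_m:
        return -10.0
    # 1.0 at the centre of the road, falling linearly to 0.0 at the edge.
    centering = 1.0 - abs(lateral_offset_m) / half_road_width_m
    # Speed is rewarded only up to the limit, normalised to [0, 1].
    progress = min(speed_mps, speed_limit_mps) / speed_limit_mps
    # Speed above the limit is penalised in proportion to the relative excess.
    excess = max(0.0, speed_mps - speed_limit_mps) / speed_limit_mps
    return centering * progress - overspeed_penalty * excess
```

With these assumed defaults, driving in the centre of the road exactly at the limit gives the maximum reward, while driving at twice the limit cancels it out. Other shapes (e.g. rewarding distance travelled along the track) are equally plausible and would be part of the investigation.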

• What are the key features of Reinforcement Learning frameworks that allow rapid prototyping and repeatable comparisons of different learning algorithms? You will investigate the design and possibilities of Reinforcement Learning frameworks like OpenAI Gym, ViZDoom and The Arcade Learning Environment [3, 4, 5], and implement either a new framework or a new environment for an existing framework based on a vehicle simulator like VDrift, TORCS or the VICTA LAB simulator.

• Based on a literature survey, which of the existing machine learning techniques are suitable for vehicle control? Picking a few promising candidates, how do they perform relative to each other in a detailed quantitative comparison? In this part you will explore the potential of Reinforcement Learning and/or other types of Machine Learning algorithms (semi-supervised learning, inverse reinforcement learning, apprenticeship learning etc.) for vehicle control.

• Can an unsupervised learning algorithm come close to a hand-tuned specialized algorithm? The TORCS Racing Board accepts submissions of virtual “drivers” or robots, and training such a robot with several algorithms can be a way to answer this question.

• How well can a control policy learned in a virtual environment be used in a similar but physical environment? Explore the possibility of using a Neural Network trained in a virtual environment to control e.g. a miniature vehicle.
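As an illustration of the first question above, the sketch below shows roughly what a simulator-backed environment with a reset/step interface could look like, modelled loosely on the convention popularised by OpenAI Gym [3]. It is a toy under stated assumptions: the class `ToyDrivingEnv`, the helper `run_episode`, the baseline `steer_to_centre`, the dynamics, the reward and the tiny synthetic "screenshot" are all hypothetical stand-ins. A real environment would obtain rendered frames and vehicle state from VDrift or TORCS, and would follow the exact interface of whichever framework is chosen.

```python
import random


class ToyDrivingEnv:
    """Hypothetical stand-in for a simulator-backed driving environment."""

    WIDTH, HEIGHT = 8, 4   # size of the synthetic "screenshot"
    HALF_ROAD = 3.0        # half of the road width, arbitrary units

    def __init__(self, seed=0, max_steps=50):
        # A private, seeded generator makes episodes repeatable.
        self.rng = random.Random(seed)
        self.max_steps = max_steps

    def reset(self):
        self.offset = self.rng.uniform(-1.0, 1.0)  # lateral position on the road
        self.speed = 0.0
        self.steps = 0
        return self._render()

    def step(self, action):
        steering, acceleration = action
        # Clip both control signals to [-1, 1].
        steering = max(-1.0, min(1.0, steering))
        acceleration = max(-1.0, min(1.0, acceleration))
        # Made-up dynamics with a little noise on the lateral position.
        self.speed = max(0.0, self.speed + 0.1 * acceleration)
        self.offset += 0.5 * steering * self.speed + self.rng.gauss(0.0, 0.02)
        self.steps += 1
        off_road = abs(self.offset) > self.HALF_ROAD
        done = off_road or self.steps >= self.max_steps
        reward = -1.0 if off_road else self.speed
        return self._render(), reward, done, {"offset": self.offset}

    def _render(self):
        # One bright pixel per row marks the car's lateral position.
        frac = (self.offset + self.HALF_ROAD) / (2 * self.HALF_ROAD)
        col = max(0, min(self.WIDTH - 1, round(frac * (self.WIDTH - 1))))
        return [[255 if x == col else 0 for x in range(self.WIDTH)]
                for _ in range(self.HEIGHT)]


def run_episode(env, policy):
    """Run one episode and return the total reward collected."""
    obs, total, done = env.reset(), 0.0, False
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
    return total


def steer_to_centre(obs):
    """Hand-written baseline: steer toward the image centre, accelerate gently."""
    col = obs[0].index(255)
    centre = (len(obs[0]) - 1) / 2
    return ((centre - col) / centre, 0.5)
```

Because the policy only sees the image and only returns steering and acceleration, a learned policy and a hand-tuned one can be passed to `run_episode` interchangeably, and fixing the seed gives every candidate the same starting conditions, which is one way to support repeatable comparisons.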