Autonomous driving: machines are learning to make sound decisions
Sensors that capture a vehicle’s surroundings are already making driving safer. The overriding goal of autonomous driving is therefore to advance artificial intelligence to the point where it can make correct decisions on its own.
Driving a car is a cognitively challenging activity, especially in urban environments. Its difficulty increases with the amount of information drivers have to absorb, process, and actively apply. For technical systems such as self-driving cars to perform this demanding task, they need to emulate human abilities: detecting their surroundings and developing solutions and strategies on their own. To enable this, artificial cognitive systems are being implemented that draw on various artificial intelligence (AI) approaches, such as machine learning, neural networks, and deep learning. The biggest challenge, particularly with regard to safety, is to capture all the required information in a vehicle’s surroundings in real time, process it as quickly as possible, interpret it correctly, and then respond accordingly. It is therefore safe to assume that we will initially see semiautonomous vehicles operating in less complex traffic situations, e.g. on multi-lane highways.
Perceiving and understanding events and taking appropriate actions
One of the prerequisites for achieving autonomous driving is “machine perception” of a vehicle’s surroundings. This task is performed by cameras and sensors that scan the environment in real time. Artificial intelligence then interprets the data to identify an object at the edge of the road as a parked car, a cyclist, or a pedestrian, for instance, and make appropriate driving decisions based on this knowledge.
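To illustrate this chain from perception to driving decision, the sketch below maps classified roadside objects to simple driving responses. The class labels, distance thresholds, and decision rules are purely hypothetical and are not Fraunhofer IGD’s actual software; a real system would reason over far richer scene information.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str         # e.g. "parked_car", "cyclist", "pedestrian"
    distance_m: float   # estimated distance from the ego vehicle
    confidence: float   # classifier confidence in [0, 1]

def plan_response(detections: list[Detection]) -> str:
    """Very simplified decision rule: react to the closest relevant object."""
    relevant = [d for d in detections if d.confidence > 0.5]
    if not relevant:
        return "maintain_speed"
    nearest = min(relevant, key=lambda d: d.distance_m)
    if nearest.label == "pedestrian" and nearest.distance_m < 20:
        return "brake"
    if nearest.label == "cyclist":
        return "slow_down_and_keep_distance"
    return "maintain_speed"

print(plan_response([Detection("cyclist", 15.0, 0.9)]))
```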
A method called deep learning is used to deduce patterns and objects from the enormous data volumes generated by the sensors. It deploys artificial neural networks with numerous intermediate layers between the input and output layers, creating and interlinking extensive internal structures. Adding more layers makes it possible to model increasingly complex situations. Deep learning algorithms also make it possible to include feedback and correction loops, which assign different weights to the connections between units and enable the system to steadily improve its decision-making abilities on its own, without human intervention.
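As a minimal sketch of such a multi-layer network with a feedback-and-correction loop, the following example uses PyTorch. The article does not say which framework or architecture the researchers actually use, so the layer sizes, classes, and training data here are purely illustrative.

```python
import torch
import torch.nn as nn

# Small classifier with several hidden layers between input and output;
# adding layers lets the network model increasingly complex situations.
model = nn.Sequential(
    nn.Linear(64, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 3),   # e.g. classes: parked car, cyclist, pedestrian
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Dummy training batch standing in for sensor-derived features and labels.
features = torch.randn(32, 64)
labels = torch.randint(0, 3, (32,))

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)  # compare prediction with ground truth
    loss.backward()                          # feedback: propagate the error
    optimizer.step()                         # correction: adjust connection weights
```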
Preliminary training with model-based data
The algorithm learns from each correct and incorrect decision—and therefore needs to be trained in advance. To develop a highly efficient solution for training artificial intelligence to perform these tasks, Fraunhofer IGD is taking a two-pronged approach. First, its researchers are applying models they have developed themselves, using either synthetically generated training data obtained directly from a CAD system or model-based data from a simulator.
Second, the team is capturing “objects” in an extended mode that records not just each object’s position in the image but also its location in the environment, in other words its distance and viewing angle. This in turn makes it possible to estimate the trajectories of other vehicles, for example. Information of this kind is essential for planning a self-driving vehicle’s path.
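As a rough illustration of how a sequence of observed positions can be turned into an estimated trajectory, the sketch below uses a deliberately simple constant-velocity model; the estimation method actually used by the researchers is not described in the article.

```python
import numpy as np

def estimate_trajectory(positions: np.ndarray, dt: float, steps: int) -> np.ndarray:
    """Extrapolate future (x, y) positions from past observations.

    positions: array of shape (N, 2) with the object's last N observed
               positions in the ego vehicle's coordinate frame.
    Assumes constant velocity, a deliberate simplification.
    """
    velocity = (positions[-1] - positions[0]) / (dt * (len(positions) - 1))
    future = [positions[-1] + velocity * dt * (k + 1) for k in range(steps)]
    return np.array(future)

# An oncoming vehicle observed at 0.1 s intervals, predicted 1 s ahead.
observed = np.array([[30.0, 2.0], [28.5, 2.0], [27.0, 2.1], [25.5, 2.1]])
print(estimate_trajectory(observed, dt=0.1, steps=10))
```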
Model-based data for rapid training
Employing model-based data as input has clear advantages over training with real data captured by cameras while driving on roads. The algorithms that generate model-based data can be quickly and inexpensively adjusted to produce a wide range of training scenarios. Although this approach is still in development, it is already emerging as an efficient way to enable autonomous trajectory planning.
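The following sketch hints at why model-based data is so cheap to vary: scene parameters can simply be randomized in code and the corresponding training samples rendered from them. All parameter names and value ranges below are invented for illustration and do not reflect the institute’s actual simulator.

```python
import random

def generate_scenario(seed: int) -> dict:
    """Randomize scene parameters to produce a wide range of training cases."""
    rng = random.Random(seed)
    return {
        "object_class": rng.choice(["parked_car", "cyclist", "pedestrian"]),
        "distance_m": rng.uniform(5.0, 80.0),
        "viewing_angle_deg": rng.uniform(-60.0, 60.0),
        "lighting": rng.choice(["day", "dusk", "night", "backlit"]),
        "weather": rng.choice(["clear", "rain", "fog"]),
    }

# Cheap to regenerate or rebalance compared with collecting real drives.
dataset = [generate_scenario(seed) for seed in range(10_000)]
```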
Novel sensor technology
All the solutions are being tested and honed with an innovative sensor system resembling a neuromorphic hardware platform (a highly integrated system with circuits that mimic neuro-biological architectures). The scientists at Fraunhofer IGD are using data from event-based vision sensors made by Prophesee (Metavision). These “event cameras” are a biologically inspired alternative to conventional video sensors, which are widely used but pose major challenges in terms of energy requirements, latency, dynamic range, and frame rate, especially for mobile platforms with limited resources.
The advantages of event-based vision
In contrast to conventional image-based systems, each pixel of an event-based sensor works independently of the others. A pixel only fires when it detects a change in the scene, such as movement; each such change is an “event.” As a result, these sensors need on average as little as one-thousandth of the data of conventional image-based systems to generate a 3D image of a vehicle’s surroundings.
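A rough way to picture the principle is to derive events from the difference between two frames, as in the sketch below. Real event-based sensors do this per pixel in analog circuitry rather than by comparing frames, and the threshold used here is an arbitrary assumption.

```python
import numpy as np

def frames_to_events(prev: np.ndarray, curr: np.ndarray, threshold: float = 0.15):
    """Emit events only for pixels whose log intensity changed noticeably.

    Returns a list of (row, col, polarity) tuples; static pixels stay silent,
    which is why event streams are far sparser than full frames.
    """
    diff = np.log1p(curr.astype(float)) - np.log1p(prev.astype(float))
    rows, cols = np.nonzero(np.abs(diff) > threshold)
    return [(r, c, 1 if diff[r, c] > 0 else -1) for r, c in zip(rows, cols)]

prev = np.random.randint(0, 256, (480, 640))
curr = prev.copy()
curr[100:110, 200:210] += 60              # a small moving object brightens one patch
print(len(frames_to_events(prev, curr)))  # ~100 events from a 307,200-pixel frame
```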
In addition to reducing data processing requirements, event-based sensors also improve the dynamic range, i.e. the ratio of the largest to the smallest detectable light intensity. They can distinguish very bright and very dark “objects” even in extreme lighting conditions with contrasts as great as 120 dB, such as a person in dark clothing illuminated from behind by headlights. Event-based sensors also excel through low energy overhead and high temporal resolution. Thanks to their pixel independence and architecture, they attain an unprecedented level of energy efficiency, consuming only 3 nW per event and 26 mW per sensor. What’s more, their temporal resolution corresponds to more than 10,000 individual images (frames) per second.
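For reference, assuming the common image-sensor convention of expressing dynamic range as 20·log10 of the intensity ratio, 120 dB corresponds to a factor of one million between the brightest and darkest signals the sensor can still distinguish:

```python
import math

# Dynamic range in dB for an intensity ratio, using the common
# image-sensor convention DR = 20 * log10(I_max / I_min).
def dynamic_range_db(i_max: float, i_min: float) -> float:
    return 20 * math.log10(i_max / i_min)

print(dynamic_range_db(1_000_000, 1))  # 120.0 dB, i.e. a million-to-one contrast
```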