
A Look at Autonomous Vehicle Perception

Perception of the surroundings is a core building block for autonomous driving. It provides answers to important questions such as: “Where is it safe to drive?”, “What are other road users doing?” and “Where are my blind spots?”.

In practice, perception is implemented by collecting data from various sensors, such as lidars, radars, cameras and ultrasonic sensors, and processing these observations in the autonomous vehicle’s compute unit. The results of these computations then allow the AI to drive the vehicle safely and efficiently.

Let’s dig a bit deeper into the three questions mentioned above, to look at how these perception methods work on a technical level.

Where Is It Safe to Drive?

In order to drive safely, it’s natural to start by considering where the vehicle could drive in principle. This problem is called free space analysis, as we are trying to find areas of flat ground that are free of obstacles. The result of this step is a segmentation of the space around the vehicle into regions where the vehicle can drive and regions where it cannot.
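To make this concrete, here is a minimal sketch of one common approach: an occupancy-grid style check over a lidar point cloud. It is purely illustrative rather than our actual implementation, and the cell size and height threshold are assumptions chosen for readability.

```python
# Illustrative free-space analysis over a lidar point cloud (not a production implementation).
# Assumption: "points" is an N x 3 array of lidar returns in the vehicle frame,
# with z measured upward from the ground plane.
import numpy as np

def free_space_grid(points, cell_size=0.5, grid_range=30.0, obstacle_height=0.3):
    """Mark grid cells around the vehicle as free (True) or occupied (False)."""
    n_cells = int(2 * grid_range / cell_size)
    free = np.ones((n_cells, n_cells), dtype=bool)

    # Keep only points that rise clearly above the ground plane.
    obstacles = points[points[:, 2] > obstacle_height]

    # Convert x/y coordinates into grid indices centred on the vehicle.
    ix = ((obstacles[:, 0] + grid_range) / cell_size).astype(int)
    iy = ((obstacles[:, 1] + grid_range) / cell_size).astype(int)
    inside = (ix >= 0) & (ix < n_cells) & (iy >= 0) & (iy < n_cells)

    # Any cell containing an obstacle point is not drivable.
    free[ix[inside], iy[inside]] = False
    return free
```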

It is then logical to consider, out of all the locations where the vehicle could drive, where it is expected to drive. To answer this question, we can employ a method called lane detection to estimate where the road lanes are, and which ones the vehicle can use according to the local traffic rules. For example, in the UK and Japan, vehicles keep to the left. To solve the lane detection problem, we can use computer vision algorithms, as well as high-resolution maps of the environment.
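As a rough illustration of the computer vision side, the sketch below finds candidate lane markings in a camera image with a classical edge-detection plus Hough-transform pipeline. It is a simplified example; production systems typically combine learned models with high-resolution maps, and the thresholds used here are illustrative.

```python
# A classical computer-vision sketch of lane-marking detection (illustrative only).
import cv2
import numpy as np

def detect_lane_lines(image_bgr):
    """Return candidate lane-marking line segments as (x1, y1, x2, y2) tuples."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)

    # Edge detection followed by a probabilistic Hough transform picks up
    # the long, roughly straight painted markings.
    edges = cv2.Canny(blurred, 50, 150)
    lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                            minLineLength=40, maxLineGap=20)
    return [] if lines is None else [tuple(l[0]) for l in lines]
```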

It is also important for the AI to consider the condition of the road before driving on it. For example, water or ice on the surface of the road can significantly reduce friction, limiting the vehicle’s ability to steer and brake, which could lead to an accident if not accounted for. Driving over potholes or rocks can also be uncomfortable for the passengers. There are many ways to estimate the road condition, such as computer vision algorithms and specialized measurement devices.
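To see why this matters, a back-of-the-envelope calculation of the ideal braking distance, v²/(2μg), shows how quickly the stopping distance grows as the friction coefficient μ drops. The friction values below are typical textbook figures, not measurements.

```python
# Back-of-the-envelope illustration of why road condition matters:
# the ideal braking distance grows as the friction coefficient drops.
def braking_distance(speed_mps, friction_coefficient, g=9.81):
    """Ideal stopping distance in metres: v^2 / (2 * mu * g)."""
    return speed_mps ** 2 / (2 * friction_coefficient * g)

# 50 km/h is roughly 13.9 m/s.
print(braking_distance(13.9, 0.7))   # dry asphalt: ~14 m
print(braking_distance(13.9, 0.1))   # ice:         ~98 m
```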

What Are the Other Road Users Doing?

Predicting the future behavior of other traffic participants is a key capability for autonomous vehicles. This capability is built from three main components: object detection, object tracking, and movement prediction.

In order to predict what other road users are doing, the first step is to identify where they are. This step is called object detection, and it is solved by combining observations from both cameras and lidar sensors, using various computer vision and sensor fusion algorithms, such as deep neural networks. As a result, we get estimates of where nearby cars, pedestrians and other road users are, relative to the autonomous vehicle.
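One small, illustrative piece of such sensor fusion is projecting lidar points into the camera image, so that image-based detections can be matched with 3D measurements. The sketch below assumes a calibrated pinhole camera; the intrinsic matrix K and the lidar-to-camera transform (R, t) would come from calibration and are not specified here.

```python
# Minimal sensor-fusion sketch: project lidar points into the camera image plane.
import numpy as np

def project_lidar_to_image(points_lidar, K, R, t):
    """Project N x 3 lidar points into pixel coordinates (u, v).

    Returns the pixels for points in front of the camera, plus the mask of
    which input points were kept.
    """
    # Transform points from the lidar frame into the camera frame.
    points_cam = points_lidar @ R.T + t

    # Keep only points in front of the camera.
    in_front = points_cam[:, 2] > 0.1
    points_cam = points_cam[in_front]

    # Pinhole projection: divide by depth, then apply the intrinsics.
    pixels = (K @ (points_cam / points_cam[:, 2:3]).T).T
    return pixels[:, :2], in_front
```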

The next step is to estimate how the other road users are currently moving (object tracking), and predict how they will likely move in the future (movement prediction). This can be done using a sequence of observations of their locations, and a model that predicts how a car or a pedestrian will most likely move in the future. We can take advantage of our knowledge of the traffic participant type in this prediction, as pedestrians can change their direction of movement much faster than cars, and cars tend to move at a much higher velocity than pedestrians.
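A classic building block for this is a Kalman filter with a constant-velocity motion model. The sketch below tracks a single object’s 2D position and extrapolates it a short time into the future; real trackers additionally handle data association, object classes and map context, and the noise parameters here are illustrative.

```python
# Minimal object tracking and prediction: constant-velocity Kalman filter (illustrative).
import numpy as np

DT = 0.1  # time between observations in seconds (assumed)

# State: [x, y, vx, vy]; constant-velocity transition, position-only measurement.
F = np.array([[1, 0, DT, 0],
              [0, 1, 0, DT],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)
Q = np.eye(4) * 0.01   # process noise: how much the motion can deviate from the model
R = np.eye(2) * 0.25   # measurement noise: position error of the detector

def kalman_step(x, P, z):
    """One predict/update cycle given a new position measurement z."""
    # Predict forward one time step.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with the measurement.
    y = z - H @ x
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P

def predict_future(x, seconds):
    """Extrapolate the current state estimate into the future."""
    steps = int(seconds / DT)
    return np.linalg.matrix_power(F, steps) @ x
```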

Where Are My Blind Spots?

Understanding the limitations of one’s knowledge is important in traffic. Other vehicles, buildings and the surrounding landscape often limit the autonomous vehicle’s visibility. For example, if a large vehicle is parked next to a zebra crossing, it is important for the autonomous vehicle to understand that a pedestrian may be hidden from its view, and thus it needs to slow down in advance.
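As a rough illustration of this “slow down in advance” reasoning, assuming a simple constant-deceleration stopping model, the vehicle should not drive faster than it can stop within the distance it can actually see. The deceleration and safety margin below are illustrative assumptions.

```python
# Occlusion-aware speed selection sketch: if a pedestrian could emerge from behind
# the parked vehicle, do not drive faster than you can stop within the visible gap.
import math

def max_safe_speed(visible_distance_m, decel_mps2=4.0, margin_m=2.0):
    """Highest speed (m/s) from which the vehicle can stop inside the visible distance."""
    usable = max(visible_distance_m - margin_m, 0.0)
    return math.sqrt(2 * decel_mps2 * usable)

print(max_safe_speed(20.0))  # 12 m/s (about 43 km/h) when 20 m of road is visible
```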

In order to drive carefully, the autonomous vehicle needs to be able to quantify its uncertainty about the state of the environment. To illustrate what this means in practice, let’s consider object detection. A naïve system might only estimate the single most likely location for each traffic participant. In contrast, a system that quantifies its uncertainty can also express how certain it is of each predicted location, for example by providing a confidence interval for it.

This capability is especially important in challenging driving conditions. When the autonomous vehicle encounters heavy rain or snowfall, predicting the exact locations of nearby traffic participants becomes more difficult, and the estimates become noisier. In these situations, it is important that the AI can evaluate how much it can trust different predictions in order to keep driving safely.
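A minimal sketch of what such an uncertainty-aware output could look like: the detector reports a mean position together with a standard deviation, and downstream code reads off an approximate 95% confidence interval. The numbers used here are illustrative.

```python
# Uncertainty-aware position output: mean plus standard deviation per coordinate.
import numpy as np

def position_confidence_interval(mean_xy, std_xy, z_score=1.96):
    """Return (lower, upper) bounds of an approximate 95% interval per coordinate."""
    mean_xy, std_xy = np.asarray(mean_xy), np.asarray(std_xy)
    return mean_xy - z_score * std_xy, mean_xy + z_score * std_xy

# In clear weather the estimate may be sharp; in heavy snowfall the interval widens.
print(position_confidence_interval([12.0, 3.5], [0.2, 0.2]))
print(position_confidence_interval([12.0, 3.5], [1.5, 1.5]))
```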

This principle can also be applied to models other than perception. For example, when it comes to vehicle control, it is usually assumed that all four wheels of the vehicle are working properly. This is of course true almost always, except when a tire gets punctured. If the AI doesn’t take into account that the model it uses for controlling the vehicle might occasionally be wrong, it might try to keep driving forward even in such abnormal situations. On the other hand, if the AI constantly estimates whether the models it uses are working correctly, it can tell when those models can be trusted, and when it should instead stop and request maintenance to take over.
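One simple, illustrative way to do this kind of self-monitoring is to compare the model’s predictions with what was actually measured, and stop trusting the model when the prediction error stays large. The threshold and window length below are assumptions.

```python
# Model self-monitoring sketch: trust the model only while its predictions match reality.
import numpy as np

def model_is_trustworthy(predicted, observed, threshold=0.5, window=20):
    """Trust the model only if the recent prediction residuals stay small."""
    residuals = np.abs(np.asarray(predicted) - np.asarray(observed))
    recent = residuals[-window:]
    return float(np.mean(recent)) < threshold

# Example: predicted vs. measured lateral acceleration (m/s^2). A persistent mismatch,
# e.g. after a tire puncture, makes the check fail and prompts the AI to stop.
predicted = [1.0, 1.1, 1.0, 1.2, 1.1]
observed  = [1.0, 1.8, 2.1, 2.4, 2.6]
print(model_is_trustworthy(predicted, observed))  # False: model no longer matches reality
```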

What Next?

We at Sensible 4 are developing AI solutions to make any vehicle autonomous. Our robust algorithms are already working in all weather conditions, for example in Norway.

Our main product, Dawn, will be launched in 2022. It’s the first truly all-weather autonomous driving software on the market.

If you are interested in our company, our software and products, or the related research, please feel free to contact us:

Written by Dr. Antti Kangasrääsiö, Head of Research at Sensible 4