Cornell team develops new inexpensive stereo-camera method to detect objects for self-driving cars; pseudo-LiDAR
Cornell researchers have shown that two inexpensive cameras mounted on either side of a vehicle’s windshield can detect objects with nearly LiDAR’s accuracy and at a fraction of the cost.
The LiDAR sensors currently used to detect 3D objects in the paths of autonomous cars are bulky, ugly, expensive, energy-inefficient—and highly accurate. These sensors are affixed to cars’ roofs, where they increase wind drag, a particular disadvantage for electric cars. They can add around $10,000 to a car’s cost. But despite their drawbacks, most experts have considered LiDAR sensors the only plausible way for self-driving vehicles to safely perceive pedestrians, cars and other hazards on the road.
The Cornell team found that analyzing the captured images from a bird’s-eye view rather than the more traditional frontal view more than tripled their accuracy, making stereo cameras a viable and low-cost alternative to LiDAR.
Proposed two-step pipeline for image-based 3D object detection. Given stereo or monocular images, the researchers first predict the depth map, then transform it into a 3D point cloud in the LiDAR coordinate system. They refer to this representation as pseudo-LiDAR and process it exactly like LiDAR, so any LiDAR-based 3D object detection algorithm can be applied. Wang et al.
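The second step of the pipeline, turning a depth map into a point cloud, can be illustrated with a minimal sketch. It assumes a standard pinhole camera model. The function name and the intrinsics parameters (fx, fy, cx, cy) are hypothetical and not taken from the paper's code. The points come out in the camera's own frame; moving them into the LiDAR coordinate system would also need a calibration transform, which is not shown here.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def depth_to_points(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project every pixel (u, v) with depth z into a camera-frame point.

    Assumes a pinhole camera: fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point. All names are illustrative.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # skip pixels without a valid depth estimate
                continue
            x = (u - cx) * z / fx  # lateral offset from the optical axis
            y = (v - cy) * z / fy  # vertical offset from the optical axis
            points.append((x, y, z))
    return points
```

Each valid pixel becomes one 3D point, so a dense depth map yields a dense point cloud that a LiDAR-style detector could then consume.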
A paper on their work—“Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving”—will be presented at the 2019 Conference on Computer Vision and Pattern Recognition, 15-21 June in Long Beach, California.
One of the essential problems in self-driving cars is to identify objects around them—obviously that’s crucial for a car to navigate its environment. The common belief is that you couldn’t make self-driving cars without LiDARs. We’ve shown, at least in principle, that it’s possible.—Kilian Weinberger, associate professor of computer science and senior author of the paper
LiDAR sensors use lasers to create 3D point maps of their surroundings, measuring objects’ distance by timing how long a laser pulse takes to travel out and back at the speed of light. Stereo cameras, which rely on two perspectives to establish depth, as human eyes do, seemed promising. But their accuracy in object detection has been woefully low, and the conventional wisdom was that they were too imprecise.
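The two ranging principles described above can be written as short formulas. The sketch below is a textbook illustration, not the researchers' code: the function names are hypothetical, the stereo formula assumes an idealized rectified camera pair, and the numbers in the usage note are made up for the example.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def lidar_range_m(round_trip_s: float) -> float:
    """Time-of-flight range: the pulse travels out and back, so halve the path."""
    return SPEED_OF_LIGHT_M_S * round_trip_s / 2.0


def stereo_depth_m(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth from stereo disparity for an idealized rectified camera pair.

    disparity_px: horizontal shift of the same point between the two images
    focal_px:     focal length in pixels
    baseline_m:   distance between the two cameras in meters
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px
```

With an illustrative focal length of 700 pixels and a 0.5 m baseline, a disparity of 35 pixels corresponds to a depth of 10 m. Because depth is inversely proportional to disparity, distant objects produce small disparities, and a small disparity error turns into a large depth error.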
First author Yan Wang, a doctoral student in computer science, and collaborators took a closer look at the data from stereo cameras. They found that the cameras’ depth information was nearly as precise as LiDAR’s. The gap in accuracy emerged, they found, when the stereo cameras’ data was being analyzed.
For most self-driving cars, the data captured by cameras or sensors is analyzed using convolutional neural networks, a kind of machine learning that identifies objects in images by applying filters that recognize patterns associated with them. These convolutional neural networks have been shown to be very good at identifying objects in standard color photographs, but they can distort the 3D information if it’s represented from the front. So when Wang and colleagues switched the representation from a frontal perspective to a point cloud observed from a bird’s-eye view, the accuracy more than tripled.
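One way to picture the bird's-eye representation is to bin a point cloud into a top-down grid. The sketch below is illustrative only and is not the detector used in the paper: the function name, the axis convention (x lateral, y vertical, z forward) and the grid parameters are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def birds_eye_grid(
    points: Sequence[Point],
    x_range: Tuple[float, float],
    z_range: Tuple[float, float],
    cell: float,
) -> List[List[int]]:
    """Count points per top-down cell, ignoring height (y).

    Rows index forward distance (z); columns index lateral position (x).
    """
    cols = math.ceil((x_range[1] - x_range[0]) / cell)
    rows = math.ceil((z_range[1] - z_range[0]) / cell)
    grid = [[0] * cols for _ in range(rows)]
    for x, _y, z in points:
        inside = x_range[0] <= x < x_range[1] and z_range[0] <= z < z_range[1]
        if not inside:
            continue  # drop points outside the area of interest
        col = min(int((x - x_range[0]) / cell), cols - 1)
        row = min(int((z - z_range[0]) / cell), rows - 1)
        grid[row][col] += 1
    return grid
```

In this top-down grid, two objects that sit at different distances along the same line of sight land in different rows, whereas in a frontal depth image they would occupy neighboring pixels.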
When you have camera images, it’s so, so, so tempting to look at the frontal view, because that’s what the camera sees. But there also lies the problem, because if you see objects from the front then the way they’re processed actually deforms them, and you blur objects into the background and deform their shapes.—Yan Wang
Ultimately, Weinberger said, stereo cameras could potentially be used as the primary way of identifying objects in lower-cost cars, or as a backup method in higher-end cars that are also equipped with LiDAR.
The self-driving car industry has been reluctant to move away from LiDAR, even with the high costs, given its excellent range accuracy—which is essential for safety around the car. The dramatic improvement of range detection and accuracy, with the bird’s-eye representation of camera data, has the potential to revolutionize the industry.—Mark Campbell, the John A. Mellowes ’60 Professor and S.C. Thomas Sze Director of the Sibley School of Mechanical and Aerospace Engineering and a co-author
The research was partly supported by grants from the National Science Foundation, the Office of Naval Research and the Bill and Melinda Gates Foundation.
Yan Wang, Wei-Lun Chao, Divyansh Garg, Bharath Hariharan, Mark Campbell, and Kilian Q. Weinberger (2019) “Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving”