General
What is the Mapbox Vision SDK?
The Mapbox Vision SDK is a tool developers use to build a better driving experience. The SDK processes images captured by connected cameras on mobile phones, dash cameras, or embedded navigation systems. The SDK uses this data to understand the driver’s environment and deliver driver assistance and AR navigation features.
What can I do with the Vision SDK?
The Vision SDK lets developers see the view from the driver’s seat. The SDK pairs augmented reality and artificial intelligence into one lightweight, multi-platform solution you can use to engineer a better driving experience. Build custom AR navigation experiences, classify and display regulatory and warning signs, trigger driver alerts for nearby vehicles, cyclists, and pedestrians, and more.
How does the Vision SDK tie into the rest of Mapbox’s products and services?
Mapbox’s live location platform incorporates dozens of different data sources to power our maps. Map data originates from sensors as far away as satellites and as close up as street level imagery. Conventionally, collected imagery requires extensive processing before a map can be created or updated. The innovation of the Vision SDK is its ability to process live data with distributed sensors, keeping up with our rapidly changing world. Developers will be able to use this new ability to create richer, more immersive experiences with Mapbox maps, navigation, and search.
Using Vision
In which regions is the Vision SDK supported?
Semantic segmentation, object detection, and following distance detection will work on virtually any road. The core functionality of the augmented reality navigation with turn-by-turn directions is supported globally. AR navigation with live traffic is supported in over 50 countries, covering all of North America, most of Europe, Japan, South Korea, and several other markets. Sign classification is currently optimized for North America, with some limited support in other regions. Sign classification for additional regions is under development.
Can the Vision SDK read all road signs?
The latest version of the Vision SDK recognizes over 200 of the most common road signs, including speed limits (5 to 120 mph or kph), regulatory signs (merges, turn restrictions, no passing, etc.), warning signs (traffic signal ahead, bicycle crossing, narrow road, etc.), and many others. The Vision SDK does not read individual letters or words on signs, but rather learns to recognize each sign type holistically. As a result, it generally cannot interpret guide signs (e.g. “Mariposa St. Next Exit”). We’re exploring Optical Character Recognition (OCR) for a future release.
What are the requirements for calibration?
AR navigation and Safety mode require calibration, which takes 20-30 seconds of normal driving. (Your device cannot calibrate unless it is mounted.) Because the Vision SDK is designed to work with an arbitrary mounting position, it needs this short period of calibration when it’s initialized so that it can accurately gauge the locations of other objects in the driving scene. Once calibration is complete, the Vision SDK will automatically adjust to vibrations and changes in orientation while driving.
What is the best way for users to mount their devices when using the Vision SDK?
The Vision SDK works best when your device is mounted either to the windshield or the dashboard of your vehicle with a good view of the road. We’ve tested a lot of mounts; here are a few of our favorites:
Some things to consider when choosing and setting up a mount:
- Generally, shorter mounts will vibrate less. Mounting to your windshield or to the dashboard itself are both options.
- The Vision SDK will work best when the phone is near or behind your rear view mirror, but please note your local jurisdiction’s limits on where mounts may be placed.
- Make sure the camera view is unobstructed (you will be able to tell with any of the video screens open).
Can I use the Vision SDK with an external camera?
Yes. Beginning with public beta, developers will be able to connect Vision-enabled devices to remote cameras. Image frames from external cameras can be transmitted over WiFi or via a direct connection.
Will the Vision SDK drain my battery?
The Vision SDK consumes CPU, GPU and other resources to process road imagery on-the-fly. Just as with any other navigation or video application, we recommend having your phone plugged in if you are going to use it for extended periods of time.
Can I rely on the Vision SDK to make driving decisions?
No. The Vision SDK is designed to provide context to aid driving, but does not replace any part of the driving task. During beta, feature detection is still being tested, and may not detect all hazards.
Can I use the Vision SDK to make my car drive itself?
No. The Vision SDK can be used to issue safety alerts and provide augmented reality navigation instructions and other features, but does not make any driving decisions.
Will my device get hot if I run the Vision SDK for a long time?
Phones and other IoT devices will get warmer over time because the onboard AI uses a significant share of the device’s processing resources. However, we have not run into any heat issues with moderate-to-heavy use.
Will the Vision SDK work in countries that drive on the left?
Yes.
Does the Vision SDK work at night?
The Vision SDK works best under good lighting conditions. However, it does function at night, depending on how well the road is illuminated. In cities with ample street lighting, for example, the Vision SDK still performs quite well.
Does the Vision SDK work in the rain and/or snow?
Yes. Just like human eyes, however, the Vision SDK works better the better it can see. Certain features, such as lane detection, will not work when the road is covered with snow.
Tech
What is “classification”?
In computer vision, classification is the process by which an algorithm identifies the presence of a feature in an image. For example, the Vision SDK classifies whether there are certain road signs in a given image.
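As a rough illustration of the idea (not the Vision SDK’s API), classification can be thought of as turning per-class scores for a frame into a list of the sign types that are present, with no information about where they appear. The function name, the sign labels, the scores, and the 0.5 threshold below are all assumptions made for this sketch.

```python
def classify(scores, threshold=0.5):
    """Return the labels whose score meets the threshold.

    The result says which sign types are present, not where they are.
    """
    return sorted(label for label, score in scores.items() if score >= threshold)


# Hypothetical per-class scores for a single frame.
frame_scores = {"speed_limit_65": 0.93, "stop": 0.04, "no_passing": 0.61}
present = classify(frame_scores)
print(present)
```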
What is “detection”?
In computer vision, detection is similar to classification, except that instead of only identifying whether a given feature is present, a detection algorithm also identifies where in the image the feature occurs. For example, the Vision SDK detects vehicles in each image and indicates where it sees them with bounding boxes. The Vision SDK supports the following detection classes: cars (or trucks), bicycles/motorcycles, pedestrians, traffic lights, and traffic signs.
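One way to picture a detection result is as a list of records, each holding a class, a confidence score, and a bounding box. The sketch below is illustrative only: the type names, field names, and the confidence threshold are assumptions and do not reflect the Vision SDK’s actual types.

```python
from dataclasses import dataclass
from enum import Enum


class DetectionClass(Enum):
    # The five detection classes listed above; these names are illustrative.
    CAR = "car"                      # cars or trucks
    BICYCLE = "bicycle"              # bicycles or motorcycles
    PEDESTRIAN = "pedestrian"
    TRAFFIC_LIGHT = "traffic_light"
    TRAFFIC_SIGN = "traffic_sign"


@dataclass(frozen=True)
class Detection:
    detection_class: DetectionClass
    confidence: float  # assumed to be a score in [0, 1]
    x_min: float       # bounding box in pixel coordinates
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        """Area of the bounding box in square pixels."""
        return max(0.0, self.x_max - self.x_min) * max(0.0, self.y_max - self.y_min)


def filter_detections(detections, wanted, min_confidence=0.5):
    """Keep detections of the wanted classes that meet the confidence threshold."""
    return [
        d for d in detections
        if d.detection_class in wanted and d.confidence >= min_confidence
    ]


# Hypothetical detections for one frame; the count varies from frame to frame.
detections = [
    Detection(DetectionClass.CAR, 0.91, 120, 200, 380, 420),
    Detection(DetectionClass.PEDESTRIAN, 0.42, 500, 210, 540, 330),
    Detection(DetectionClass.PEDESTRIAN, 0.88, 610, 205, 655, 340),
    Detection(DetectionClass.TRAFFIC_SIGN, 0.77, 700, 80, 740, 120),
]
vulnerable = filter_detections(
    detections, {DetectionClass.PEDESTRIAN, DetectionClass.BICYCLE}
)
```

An application that triggers alerts for pedestrians and cyclists might filter detections this way before deciding whether to warn the driver.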
What is “segmentation”?
In computer vision, segmentation is the process by which each pixel in an image is assigned to a category, or “class”. For example, the Vision SDK analyzes each frame of road imagery and paints each pixel a color corresponding to its underlying class. The Vision SDK supports the following segmentation classes: buildings, cars (or trucks), curbs, roads, non-drivable flat surfaces (such as sidewalks), single lane markings, double lane markings, other road markings (such as crosswalks), bicycles/motorcycles, pedestrians, sky, and unknown.
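The "painting" step can be sketched as a lookup from a per-pixel class mask to a color palette. This is a minimal sketch under stated assumptions: the class identifiers, their order, and the RGB colors are invented for illustration and are not the values the Vision SDK uses.

```python
# The twelve segmentation classes listed above, in an assumed order.
SEGMENTATION_CLASSES = [
    "building", "car", "curb", "road", "flat_non_drivable",
    "single_lane_marking", "double_lane_marking", "other_road_marking",
    "bicycle", "pedestrian", "sky", "unknown",
]

# Arbitrary RGB colors, one per class, chosen only for this example.
PALETTE = [
    (70, 70, 70), (0, 0, 142), (244, 35, 232), (128, 64, 128),
    (152, 251, 152), (255, 255, 255), (255, 255, 0), (250, 170, 30),
    (119, 11, 32), (220, 20, 60), (70, 130, 180), (0, 0, 0),
]


def colorize(mask):
    """Map every class index in the mask to its palette color."""
    return [[PALETTE[class_id] for class_id in row] for row in mask]


def class_pixel_counts(mask):
    """Count how many pixels were assigned to each class."""
    counts = {name: 0 for name in SEGMENTATION_CLASSES}
    for row in mask:
        for class_id in row:
            counts[SEGMENTATION_CLASSES[class_id]] += 1
    return counts


# A tiny 2x3 "frame": sky and a building on top, road and a car below.
mask = [
    [10, 10, 0],
    [3, 3, 1],
]
colored = colorize(mask)
counts = class_pixel_counts(mask)
```

Every pixel receives exactly one class, so the counts always add up to the number of pixels in the frame.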
What is the difference between detection and segmentation?
Detection identifies discrete objects (e.g., individual vehicles) and draws a bounding box around each one that is found. The number of detections changes from one image to the next, depending on what appears. Segmentation, on the other hand, goes pixel by pixel and assigns each pixel to a category. For a given segmentation model, the same number of pixels is classified and colored in every image. Features from segmentation can be any shape describable by a 2D pixel grid, while features from object detection are indicated with boxes defined by the four pixels that make up their corners.
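The contrast can be made concrete by deriving a box from a segmented region. In the sketch below, which uses hypothetical helper names and a made-up mask, an L-shaped region of class 1 covers three pixels, while the tightest box around it covers four: the box is described by its corners only, whereas the mask keeps the region’s exact shape.

```python
def bounding_box_of_class(mask, class_id):
    """Return (x_min, y_min, x_max, y_max) around all pixels of class_id, or None."""
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value == class_id:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def box_corners(box):
    """List the four corner pixels of a box, clockwise from the top-left."""
    x_min, y_min, x_max, y_max = box
    return [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]


# A 4x4 mask with an L-shaped region of class 1 on a background of class 0.
mask = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

box = bounding_box_of_class(mask, 1)
corners = box_corners(box)

# Pixels labelled class 1 versus pixels covered by the box (inclusive bounds).
class_pixels = sum(row.count(1) for row in mask)
box_pixels = (box[2] - box[0] + 1) * (box[3] - box[1] + 1)
```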
Where does calibration happen?
Calibration is handled in the VisionCore module. VisionCore uses camera, IMU, and GPS data to calibrate itself so that Vision features perform at their best.