This document summarizes the recommended methods for evaluating a points-of-interest (POI) dataset (e.g. SafeGraph Places data) and the results you should expect to obtain. To see the quantitative results of this evaluation, please contact us; we are happy to provide results on these metrics or give you access to data so you can confirm them yourself.
This evaluation addresses the three major quality categories for POI data: precision, recall, and completeness. The methodology and detailed results for each metric can be found in the corresponding sections.
Precision (latitude and longitude)
Good results should be 0 - 10 meters away from the truth set (Google Maps).
Precision (polygons)
Good results should show > 70% of tested polygons are true to the building footprint as represented by the truth set (Google Maps).
Precision (attributes)
Good results should show > 99.9% of POI attributes for top brands are accurate as compared to the truth set (online store locators).
Recall
Good results should show POI counts for brands/chains are within 0 - 2% of the tested truth set (store locators).
Completeness
Good results should show high fill rates for important attributes:
- Category: > 90%
- Phone Number: > 70%
- Open Hours: > 50%
These should be even higher for major brands (chains).
Precision: Latitude and Longitude Accuracy
Are SafeGraph Places actually located where they purport to be?
Methodology
Every POI in the SafeGraph Places dataset includes columns for its interpolated latitude and longitude values. A coordinate accuracy measurement compares the SafeGraph coordinate values to an accepted coordinate truth set (Google Maps).
To measure the distance between SafeGraph and Google POI coordinates, we recommend using the Google Places API to make Find Place requests for all POIs in the SafeGraph dataset. More specifically, we provide the address of each SafeGraph POI and compare the returned Google coordinates to the associated SafeGraph POI coordinates. The distance between coordinates is measured in meters.
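The distance step can be sketched as follows. This is a minimal illustration, not the method SafeGraph uses internally: it assumes the Google coordinates have already been retrieved, and it uses the haversine great-circle formula, which the methodology above does not prescribe. The function names are hypothetical.

```python
import math
from statistics import median

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (assumed spherical Earth)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def median_distance_m(pairs):
    """Median distance over ((poi_lat, poi_lon), (truth_lat, truth_lon)) pairs."""
    return median(haversine_m(a[0], a[1], b[0], b[1]) for a, b in pairs)
```

For distances of a few meters, a planar approximation would give nearly the same answer; haversine is used here only because it needs no projection.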
Results
In aggregate, we find that the median distance between SafeGraph and Google Maps coordinates for all SafeGraph POIs is very small (usually 0-5m). The distribution of POI distance from Google Maps is presented below:
In contrast, we've found that other POI data providers show centroid precision ranging from 18 to 65 meters in median distance from Google Maps, with the median distances averaging about 40 meters across providers.
Precision: Polygon Accuracy
Do SafeGraph Places Polygons represent the exact shape of buildings?
Methodology
The SafeGraph Places dataset includes two fields that describe POI geometry:
- polygon_wkt: a polygon that represents the shape of the POI, formatted as Well-Known Text (WKT).
- polygon_class: a field that describes whether the polygon describes the POI itself (owned_polygon) or if the polygon is shared by more than one POI (shared_polygon).
To measure the accuracy of polygons, filter to polygons that represent a single POI by only including owned_polygon values for the polygon_class. Select a random subset (e.g. 1,000) of POIs in the dataset for human verification. For each selected polygon, a tester can overlay the polygon on top of Google Maps and score in a binary manner whether the polygon accurately represents the shape of a building. A polygon can be determined as accurate when:
- The polygon represents the associated POI in the dataset. Conversely, a polygon is inaccurate if it is the correct shape of a building but is associated with the wrong POI.
- The polygon accurately covers the building footprint of interest in both shape and size.
- If a POI is part of a larger structure (such as a strip mall), the polygon accurately represents the shape and size of the individual store.
- The polygon is within 2 meters of the Google Maps imagery. A discrepancy of this size can be accounted for by differing pitches of satellite imagery.
When inaccurate, the polygons can be classified into the following categories of inaccuracy:
- Centroid: the tested data is not a building polygon but rather an approximated circular polygon derived from the POI centroid with a radius applied.
- Shape: the polygon is the wrong shape compared to the POI.
- Size: the polygon is either smaller or larger than the POI.
- Wrong Place: the polygon does not represent the correct POI, even if it is the correct shape and size of a building.
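The sampling and scoring steps above can be sketched as follows. This is a rough illustration under stated assumptions: POIs are represented as dictionaries keyed by column name, the polygon_class value is the literal string owned_polygon as written above, and the score labels ("accurate", "centroid", "shape", "size", "wrong_place") are hypothetical names for the tester's verdicts.

```python
import random
from collections import Counter


def sample_owned_polygons(pois, sample_size=1000, seed=0):
    """Keep POIs that own their polygon, then draw a random subset for human review."""
    owned = [p for p in pois if p.get("polygon_class") == "owned_polygon"]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(owned, min(sample_size, len(owned)))


def summarize_scores(scores):
    """Return (share of polygons scored accurate, counts per inaccuracy category).

    `scores` is a list of tester labels: "accurate" or a category name (assumed labels).
    """
    if not scores:
        raise ValueError("no scores to summarize")
    counts = Counter(scores)
    accuracy = counts["accurate"] / len(scores)
    errors = Counter({k: v for k, v in counts.items() if k != "accurate"})
    return accuracy, errors
```

The accuracy share returned by summarize_scores is the figure to compare against the > 70% guideline.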
Examples of correct and incorrect polygons are shown below:
Accurate Examples

Inaccurate example (Wrong Place): accurate shape and size for the selected building (within 2 m), but wrong POI (the address is for the other building).
Precision: Attribute Accuracy (Address, Phone Number, Open Hours)
Are POIs associated with accurate business information (address, phone number, open hours, etc.)?
Methodology
Each SafeGraph place includes the following business information:
location_name
street_address
city
state
zip_code
Most (see completeness results) SafeGraph Places also include:
naics_code
phone_number
open_hours
To estimate the accuracy of this business metadata, you can create a randomized subset of POIs that include all attributes of interest:
- e.g., select 50 random brands from the dataset whose store count is greater than 1,000 stores nationally, then select 10 random stores for each of those brands from the stores where all attributes are populated.
Human verifiers can then compare this randomized subset of branded POIs to the data published on each brand's corporate website. For example, the Lowe’s brand can be tested against the truth set provided at https://www.lowes.com/store/.
The NAICS code for the 50 random brands selected can be verified by human judgment.
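The brand sampling described above can be sketched as follows. This is only an illustration: it assumes POIs are dictionaries keyed by column name, that each POI carries a single brand identifier in safegraph_brand_ids, and that an empty or missing value means the attribute is not populated. The function name and the REQUIRED tuple are hypothetical.

```python
import random
from collections import defaultdict

# Attributes that must all be populated for a store to be eligible (assumed list).
REQUIRED = ("location_name", "street_address", "city", "state", "zip_code",
            "naics_code", "phone_number", "open_hours")


def sample_branded_pois(pois, n_brands=50, n_stores=10, min_stores=1000, seed=0):
    """Pick random large brands, then random fully populated stores per brand."""
    by_brand = defaultdict(list)
    for p in pois:
        brand = p.get("safegraph_brand_ids")
        if brand:
            by_brand[brand].append(p)
    # Only brands with more than `min_stores` stores are eligible.
    eligible = sorted(b for b, rows in by_brand.items() if len(rows) > min_stores)
    rng = random.Random(seed)
    brands = rng.sample(eligible, min(n_brands, len(eligible)))
    sample = {}
    for b in brands:
        complete = [p for p in by_brand[b] if all(p.get(f) for f in REQUIRED)]
        sample[b] = rng.sample(complete, min(n_stores, len(complete)))
    return sample
```

Each sampled store can then be handed to a human verifier together with the brand's store locator URL.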
Recall: Total POI count for selected brands
Does SafeGraph Places include all POI for selected brands?
Methodology
To assess the accuracy of branded POI counts, generate a randomized sample of 20 safegraph_brand_ids whose store counts are greater than 1,000 stores nationally and measure the total count of POIs for each brand. For each brand, the SafeGraph Places count can be compared to the count of stores listed on the brand’s store locator site. Note that determining the number of stores listed on a brand’s store locator website may require building a custom website scraping solution.
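Once both counts are available, the comparison is simple arithmetic. The sketch below assumes the difference is expressed as an absolute percentage of the store locator count, which is one reasonable reading of the "within 0 - 2%" guideline. The function names are hypothetical.

```python
def count_difference_pct(safegraph_count, locator_count):
    """Absolute difference between the two counts, as a percentage of the locator count."""
    if locator_count <= 0:
        raise ValueError("locator_count must be positive")
    return abs(safegraph_count - locator_count) * 100 / locator_count


def recall_report(counts, threshold_pct=2.0):
    """Map each brand to (percent difference, whether it is within the threshold).

    `counts` maps a brand to a (safegraph_count, locator_count) pair.
    """
    report = {}
    for brand, (sg_count, truth_count) in counts.items():
        diff = count_difference_pct(sg_count, truth_count)
        report[brand] = (diff, diff <= threshold_pct)
    return report
```

For example, a brand with 1,980 POIs in the dataset and 2,000 stores on its locator differs by 1%, which falls inside the 2% guideline.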
Completeness: Attribute Counts
What coverage does SafeGraph Places offer, and what are the fill rates for POI attributes?
Methodology
For example, you may want to examine the completeness of data coverage for high-value attributes like:
naics_code
phone_number
open_hours
Fill rate is defined as the percentage of non-null values for the attribute of interest in the dataset, which can be computed with a simple query.
Please see Places Summary Statistics for a complete list of attribute counts and fill rates for the latest SafeGraph Places release. We recommend examining fill rates both overall and for high-value major retail chains (brands).
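As a rough sketch of the fill rate calculation, assuming rows are dictionaries keyed by column name and that both missing values and empty strings count as null (an assumption; your export may encode nulls differently):

```python
def fill_rate(rows, attribute):
    """Percentage of rows where `attribute` is populated (None and "" count as null)."""
    rows = list(rows)
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(attribute) not in (None, ""))
    return 100 * filled / len(rows)
```

To compare overall and brand-level fill rates, call the function once on all rows and once on the rows for the brands of interest.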