June-2019 Release Notes (v2019-05-30)
over 5 years ago by Ryan Fox Squire
*Warning: These release notes may lead to feelings of overwhelming excitement followed by light-headedness. You'll swoon for June.
Enhancements - Core Places and Brands
- Last month SG Places had 4,683,512 places. This month SG Places has 4,781,046 places (net +97,534 places) 📈
- Increased POI coverage and NAICS code accuracy for a variety of categories including:
- Full Service Restaurants (NAICS 722511, +107,850 places) 🍴
- Elementary and Secondary Schools (NAICS 611110, +91,138 places) 🏫
- Child Day Care Services (NAICS 624410, +65,202 places) 🚸
- Physical Fitness Facilities (NAICS 713940, +25,807 places) 🏃
- Malls (NAICS 531120, +5,902 places) 🐀
- Museums (NAICS 712110, +4,761 places) 🎨
- Colleges, Universities, and Professional Schools (NAICS 611310, +560 places) 🏫
- An inquisitive mind will notice that these numbers sum to more than the total net new places (97.5k, see first bullet point). That is because not all of these are new safegraph_place_ids. They may be existing safegraph_place_ids that previously did not have a NAICS code or had an incorrect NAICS code.
- We've added net 160 new brands 🎊 including:
- Western Union (westernunion.com, SG_BRAND_9ee39f394d21a7f4848ab78a78da00c3) with 6,222 places.
- First Convenience Bank (FCB) (https://www.1stnb.com/, SG_BRAND_e7ee3cc7b8911396010954272237018a) with 314 places.
- Agent Provocateur (agentprovocateur.com, SG_BRAND_30a1d306114e84e9) with 11 places.
- Gambino's Pizza (gambinospizza.com, SG_BRAND_146f1ed78b324031) with 45 places.
- and 156 more!! 📈
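The note about category gains exceeding net new places can be checked with simple arithmetic. The sketch below uses only the figures quoted in this section. The variable names are illustrative and not part of any SafeGraph tooling.

```python
# Per-category place gains quoted in the list above, keyed by NAICS code.
category_gains = {
    "722511": 107_850,  # Full Service Restaurants
    "611110": 91_138,   # Elementary and Secondary Schools
    "624410": 65_202,   # Child Day Care Services
    "713940": 25_807,   # Physical Fitness Facilities
    "531120": 5_902,    # Malls
    "712110": 4_761,    # Museums
    "611310": 560,      # Colleges, Universities, and Professional Schools
}

# Net new places between releases, from the first bullet.
net_new_places = 4_781_046 - 4_683_512

total_category_gains = sum(category_gains.values())

# The difference is what the note attributes to existing
# safegraph_place_ids gaining or correcting a NAICS code.
gap = total_category_gains - net_new_places

print(f"sum of category gains: {total_category_gains:,}")
print(f"net new places:        {net_new_places:,}")
print(f"gap:                   {gap:,}")
```

The category gains for these seven categories alone come to roughly three times the net change in places, which is why the gains cannot all be new safegraph_place_ids.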
Bugs and Known Issues - Core Places and Brands
- Bad SGPID Churn -- Bad sgpid churn is the undesired failure to maintain consistent safegraph_place_ids (sgpids) between releases (see the discussion in the March 2019 release). We internally track and estimate our performance in this domain and share these numbers in our release notes for maximum transparency. In this release:
- We dropped 152,389 sgpids (32,176 branded and 120,213 non-branded).
- We added 249,923 sgpids (61,097 branded and 188,826 non-branded).
- Some percent of these are true openings and closings; the remainder are bad sgpid churn. We are working on better metrics for distinguishing the two cases.
- NB: These numbers are significantly improved compared with the May release.
- Category Fill Rate -- We monitor category fill rate with 3 metrics: (1) category fill rate across the entire dataset, (2) category fill rate for branded POI, (3) category fill rate in the brand_info file (brand-level categories). We want all of these numbers to be 100%.
- (1) All POI category fill rate. Last month 89%. This month 91%. 👍 👍 We promised to do better and we did! But we aren't satisfied yet!
- (2) Branded POI category fill rate. Last month 100%. This month 100% 💯
- (3) Brand-level category fill rate (brand_info file). Last month 100%. This month 100%. 💯
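The sgpid churn figures listed under Bad SGPID Churn can be reconciled with the place totals reported at the top of these notes. This is a minimal sketch using only the numbers quoted here. The dictionary layout and names are assumptions made for illustration.

```python
# Counts of safegraph_place_ids (sgpids) dropped and added in this release.
dropped = {"branded": 32_176, "non_branded": 120_213}
added = {"branded": 61_097, "non_branded": 188_826}

dropped_total = sum(dropped.values())
added_total = sum(added.values())

# Net change implied by the churn figures.
net_change = added_total - dropped_total

# Net change implied by the reported place totals (this month minus last month).
reported_net = 4_781_046 - 4_683_512

print(f"dropped: {dropped_total:,}, added: {added_total:,}")
print(f"net from churn: {net_change:,}, net from totals: {reported_net:,}")
print("consistent:", net_change == reported_net)
```

Both routes give the same net figure of 97,534 places. The churn numbers alone do not say how many of the adds and drops are true openings and closings.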
Enhancements - Geometry
- Improved and additional cartography and polygons. New or improved polygon geometries for over 33,000 buildings, including over 11,000 strip malls (a continued effort from the May release; see the May release notes for examples of better strip mall maps).
- Improved internal geocoding systems. We don't expect our customers to be able to tell in the short run, but we've been making some heavy investments 💪 in our internal geocoding systems to ensure that SafeGraph continues to have the most accurate POI centroids on the market.
Bugs and Known Issues - Geometry
- Centroid-Radius Polygons -- As discussed in the March 2019 release notes, we internally track centroid-radius polygons vs precise polygons and strive for 100% precise polygons. Here is how we are tracking on that metric over recent releases: Centroid-Radius Polygon Tracking. Reader Survey: We have been exploring ways to make it explicit in our Geometries product whether or not a polygon is synthetic. If this is something you'd like, please let us know!
- We fixed a bug 🐛 with the column polygon_class in which we were incorrectly labeling some polygons as OWNED when they are actually SHARED. Because of this fix, it appears as though we have a surge in polygon_class = SHARED, when in reality the data in polygon_wkt hasn't dramatically changed, just the value of polygon_class. Over the last 3 releases, the number of POI with polygon_class = SHARED went from 1.839 M in April to 1.732 M in May and now 2.001 M in June (even though the actual polygon_wkt data has been getting better and better every month due to our ongoing cartography efforts). Thanks to our excellent partners who alerted us to this issue by politely informing us that some safegraph_place_ids marked OWNED had identical polygon_wkts.
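The symptom our partners reported, places marked OWNED that have identical polygon_wkt values, can be spotted with a simple grouping pass. The sketch below is an illustration only, not SafeGraph's internal check. The column names come from these notes, while the sample rows, the ID format, the function name, and the use of exact string equality on polygon_wkt are all assumptions.

```python
from collections import defaultdict

# Hypothetical sample rows; real rows would come from the Geometry files.
rows = [
    {"safegraph_place_id": "sg:a", "polygon_class": "OWNED",
     "polygon_wkt": "POLYGON ((0 0, 0 1, 1 1, 0 0))"},
    {"safegraph_place_id": "sg:b", "polygon_class": "OWNED",
     "polygon_wkt": "POLYGON ((0 0, 0 1, 1 1, 0 0))"},
    {"safegraph_place_id": "sg:c", "polygon_class": "SHARED",
     "polygon_wkt": "POLYGON ((2 2, 2 3, 3 3, 2 2))"},
    {"safegraph_place_id": "sg:d", "polygon_class": "SHARED",
     "polygon_wkt": "POLYGON ((2 2, 2 3, 3 3, 2 2))"},
    {"safegraph_place_id": "sg:e", "polygon_class": "OWNED",
     "polygon_wkt": "POLYGON ((5 5, 5 6, 6 6, 5 5))"},
]


def find_suspect_owned(rows):
    """Return sorted ids of OWNED places whose polygon_wkt also belongs to another place."""
    # Collect every place id seen for each polygon string.
    ids_by_wkt = defaultdict(set)
    for row in rows:
        ids_by_wkt[row["polygon_wkt"]].add(row["safegraph_place_id"])

    # An OWNED polygon used by more than one place is a candidate for SHARED.
    suspects = [
        row["safegraph_place_id"]
        for row in rows
        if row["polygon_class"] == "OWNED"
        and len(ids_by_wkt[row["polygon_wkt"]]) > 1
    ]
    return sorted(suspects)


suspects = find_suspect_owned(rows)
print(suspects)
```

In this made-up sample, sg:a and sg:b are flagged because they share a polygon yet are both labeled OWNED, while sg:e is left alone because its polygon is unique.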
Also check out these new ways to get SafeGraph data:
* Are you an Esri or ArcGIS user? Check out [SafeGraph Places in the Esri Marketplace](https://marketplace.arcgis.com/listing.html?id=3425348e4bee4059af2b353e52df43c2).
* Need some extra data on other SafeGraph products? Check out the [SafeGraph Data Bar](https://shop.safegraph.com/).
* Or just drop us a line! Good Data is Happy Data!