November-2020 Release Notes
Welcome to the November 2020 release. We're thankful to serve up a squeaky clean delivery of Places data 🦃 (2020-10-28/1603882929 shipped 2020-11-05).
Highlights
- +76 new brands 🎊
- Record low levels of `safegraph_place_id` churn 💯
Enhancements - Core Places and Brands
- Last month SG Places had 5,933,243 points-of-interest. This month SG Places has 6,160,574 points-of-interest (net +227,331 places). These are +224,741 US places and +2,590 CA places.
- We've added +76 brands including +25 Gasoline Stations with Convenience Stores ⛽️ and +10 Commercial Banking 🏦
  New Brands Include...
  - Worldmark by Wyndham ((wyndhamhotels.com/hotel-deals/wbw), SG_BRAND_bdcca340a8c58a10), parent brand: (Wyndham Hotels and Resorts, SG_BRAND_0aba43027dab3087acff1cb8f463fa22) with 134 US and 10 CA places.
  - Arvest Bank ((arvest.com), SG_BRAND_1c6d6c909e0cfe06231244705c6defdb) with 267 US and 0 CA places.
  - Finance of America Mortgage ((foamortgage.com), SG_BRAND_cf94978cc72d2695) with 214 US and 0 CA places.
  - Valvoline Express Care ((expresscare.com), SG_BRAND_dc39cf07bbbb0f88), parent brand: (Valvoline, SG_BRAND_39e16e9497a95ca0) with 256 US and 19 CA places.
  - GoLo ((gologas.com), SG_BRAND_caf1c8f9222e9c6d) with 86 US and 0 CA places.
  - and 71 more!
Bug Fixes and Known Issues - Core Places and Brands
- We discovered a few brand count fluctuations as a result of updated sourcing and other metadata bugs. These corrections significantly changed the total number of POIs for each affected brand, but the new counts are correct. For transparency, here are some of these corrections as examples, in no particular order:
  - Habitat (`SG_BRAND_78de657ea66ed5e1ca8e746054ec250c`). Net POI count change: US: -702 CA: 0. Bug: Created child brand for "Habitat for Humanity ReStore" (SG_BRAND_b4f10df5ee241c57).
  - Giant Convenience Stores (`SG_BRAND_2f3a0cec435bfb6b`). Net POI count change: US: -249 CA: 0. Bug: This brand has been deprecated. Its locations have been attached to brand "Speedway" (SG_BRAND_93b3d23529ebed78cdf4bfe2e58f1979) since 2019, for which we have all POIs.
  - Highs Dairy Stores (`SG_BRAND_4504a971d37aaa23`). Net POI count change: US: -49 CA: 0. Bug: This brand has been deprecated because it duplicates "High's" (SG_BRAND_dcd331cebf10f17f).
  - PlainsCapital Bank (`SG_BRAND_2d3b83c8d86a9869f133e566dd13f9b6`). Net POI count change: US: -24 CA: 0. Bug: Previously included ATMs.
  - Advance Auto Parts (`SG_BRAND_f6690ed6fac1b97d75d2ea16f2eb0e6d`). Net POI count change: US: +890 CA: 0. Bug: Previously incorrectly marked locations as closed.
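If you want to check per-brand count changes like these in your own copy of the data, one approach is to count POIs per brand in two releases and subtract. The following is a minimal sketch, not SafeGraph's method. It assumes each release is loaded as a list of dict rows and that brand membership sits in a `safegraph_brand_ids` field holding a comma-separated list of IDs. That field name, the helper names, and the sample rows are illustrative assumptions.

```python
from collections import Counter

def brand_poi_counts(rows, brand_field="safegraph_brand_ids"):
    """Count POIs per brand ID; a POI with several brand IDs counts once for each."""
    counts = Counter()
    for row in rows:
        for brand_id in (row.get(brand_field) or "").split(","):
            brand_id = brand_id.strip()
            if brand_id:
                counts[brand_id] += 1
    return counts

def brand_count_changes(previous_rows, current_rows):
    """Return {brand_id: current - previous} for brands whose POI count changed."""
    prev = brand_poi_counts(previous_rows)
    curr = brand_poi_counts(current_rows)
    return {
        brand_id: curr[brand_id] - prev[brand_id]
        for brand_id in set(prev) | set(curr)
        if curr[brand_id] != prev[brand_id]
    }

# Hypothetical toy releases (placeholder IDs, not real brands).
previous = [
    {"safegraph_place_id": "sg:1", "safegraph_brand_ids": "SG_BRAND_a"},
    {"safegraph_place_id": "sg:2", "safegraph_brand_ids": "SG_BRAND_a"},
    {"safegraph_place_id": "sg:3", "safegraph_brand_ids": ""},
]
current = [
    {"safegraph_place_id": "sg:1", "safegraph_brand_ids": "SG_BRAND_a"},
    {"safegraph_place_id": "sg:2", "safegraph_brand_ids": "SG_BRAND_b"},
    {"safegraph_place_id": "sg:3", "safegraph_brand_ids": ""},
]
changes = brand_count_changes(previous, current)
print(changes)
```

A large swing for a single brand is a prompt to look for a re-parented, deprecated, or newly split brand ID, as in the examples above.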
Enhancements - Categories
- By popular demand, we increased our coverage for gas station and convenience store POIs, and we ironically added a spooky dose of cemeteries just in time for Halloween 🎃. The top 3 count increases by category are as follows:
  - Cemeteries and Crematories (812220). Net POI count change: US: +123,229 CA: -1. ⚰️
  - Gasoline Stations with Convenience Stores (447110). Net POI count change: US: +23,291 CA: +1,742. ⛽️
  - Offices of Real Estate Agents and Brokers (531210). Net POI count change: US: +5,511 CA: -10. 🏠
- Category Fill Rate -- We monitor category fill rate with 3 metrics: (1) category fill rate across the entire dataset, (2) category fill rate for branded POI, (3) category fill rate in the brand_info file (brand-level categories). We want all of these numbers to be 100%.
  - (1) All POI category fill rate. Last month 99.2%. This month 99.2%.
  - (2) Branded POI category fill rate. Last month 100%. This month 100% 💯
  - (3) Brand-level category fill rate (brand_info file). Last month 100%. This month 100% 💯
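A fill rate of this kind is the share of rows with a non-empty category value. The sketch below shows one way to compute metrics (1) and (2) on your own copy of the data. It assumes rows are loaded as dicts, that the category lives in `naics_code`, and that a non-empty `brands` field marks a branded POI. The `brands` field name, the `fill_rate` helper, and the sample rows are assumptions for illustration.

```python
def fill_rate(rows, column):
    """Percent of rows whose `column` is neither missing nor blank."""
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if str(row.get(column) or "").strip())
    return 100.0 * filled / len(rows)

# Hypothetical sample rows.
pois = [
    {"naics_code": "447110", "brands": "GoLo"},
    {"naics_code": "812220", "brands": ""},
    {"naics_code": "", "brands": ""},
    {"naics_code": "522110", "brands": "Arvest Bank"},
]
branded = [row for row in pois if row["brands"]]

all_poi_rate = fill_rate(pois, "naics_code")         # metric (1)
branded_poi_rate = fill_rate(branded, "naics_code")  # metric (2)
print(f"All POI: {all_poi_rate:.1f}%  Branded POI: {branded_poi_rate:.1f}%")
```

Metric (3) would follow the same pattern, applying `fill_rate` to rows loaded from the brand_info file instead of the POI rows.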
Drops ⬇️
- We constantly ingest data from new sources, and many `safegraph_place_ids` (sgpids) are intentionally dropped, but we are unable to track each and every dropped sgpid. In this release:
  - We dropped 16,665 sgpids (7,501 branded and 9,164 non-branded).
    - 3k dropped due to POI source fluctuations
    - 1.5k dropped as a result of bug fixes for branded POIs 🐛
    - 6k dropped as a result of deduplication 👯‍♂️
    - 3.5k dropped due to permanent closures ❌ (dropped but not lost -- you will still see these POIs if you get the open/close columns).
- The remaining drops are undesired failures to maintain a consistent sgpid between releases, known as bad sgpid churn (see discussion in March 2019 release). We are continuing to work on better metrics to distinguish good vs. bad churn.
- Last month, we cofounded the Placekey initiative and added `placekey` as a unique and persistent identifier for all POIs in the SafeGraph dataset. See here for more on how Placekey is unlocking access to geospatial data across industries.
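The headline drop count in this section can be approximated on your side as a set difference on `safegraph_place_id` between two releases, then split by whether the dropped row was branded. This is a minimal sketch under stated assumptions: rows are loaded as dicts, a non-empty `brands` field marks a branded POI, and the helper names and sample rows are hypothetical.

```python
def dropped_sgpids(previous_rows, current_rows, id_field="safegraph_place_id"):
    """Return the previous-release rows whose ID is absent from the current release."""
    current_ids = {row[id_field] for row in current_rows}
    return [row for row in previous_rows if row[id_field] not in current_ids]

def split_branded(rows, brand_field="brands"):
    """Split rows into (branded, non_branded) by whether the brand field is non-empty."""
    branded = [row for row in rows if row.get(brand_field)]
    non_branded = [row for row in rows if not row.get(brand_field)]
    return branded, non_branded

# Hypothetical toy releases.
previous = [
    {"safegraph_place_id": "sg:1", "brands": "GoLo"},
    {"safegraph_place_id": "sg:2", "brands": "Arvest Bank"},
    {"safegraph_place_id": "sg:3", "brands": ""},
    {"safegraph_place_id": "sg:4", "brands": ""},
]
current = [
    {"safegraph_place_id": "sg:1", "brands": "GoLo"},
    {"safegraph_place_id": "sg:3", "brands": ""},
]

dropped = dropped_sgpids(previous, current)
branded_drops, non_branded_drops = split_branded(dropped)
print(len(dropped), len(branded_drops), len(non_branded_drops))
```

The set difference only shows which IDs disappeared. It does not attribute a cause such as deduplication, closure, or bad churn.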
Enhancements - Geometry
- While OWNED polygons are preferred, it does not mean that SHARED polygons are inherently bad. It only means that the exact shape of each POI within the polygon is not discernible, but the general location can be identified by the centroid (`latitude` & `longitude`). 🎯
- When `enclosed` = FALSE, it indicates that there are reasonable means to derive a unique polygon for the POI (even when `parent_safegraph_place_id` is not null), and we strive for 100% of branded, non-enclosed POIs to have `polygon_class` = "OWNED_POLYGON".
- Last month, the percent OWNED polygons for branded, non-enclosed POIs was 80.0%
- This month, the percent OWNED polygons for branded, non-enclosed POIs is 79.5% 📉
  - Here is how we're tracking on this metric across releases: OWNED vs SHARED Polygons in SafeGraph Places Release History.
  - See the September-2020 release notes for details about the `enclosed` column and tweaks to this metric.
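The OWNED-polygon metric described in this section is a filtered share: keep branded POIs with `enclosed` = FALSE, then take the fraction whose `polygon_class` is "OWNED_POLYGON". The sketch below assumes rows are dicts, `enclosed` has already been parsed to a Python bool, and a non-empty `brands` field marks a branded POI. The `brands` field name, the "SHARED_POLYGON" label in the sample data, and the function name are assumptions.

```python
def pct_owned_branded_non_enclosed(rows):
    """Percent of branded, non-enclosed POIs whose polygon_class is OWNED_POLYGON."""
    eligible = [
        row for row in rows
        if row.get("brands") and not row.get("enclosed")
    ]
    if not eligible:
        return None  # metric is undefined when no POI qualifies
    owned = sum(1 for row in eligible if row.get("polygon_class") == "OWNED_POLYGON")
    return 100.0 * owned / len(eligible)

# Hypothetical sample rows.
pois = [
    {"brands": "GoLo", "enclosed": False, "polygon_class": "OWNED_POLYGON"},
    {"brands": "GoLo", "enclosed": False, "polygon_class": "OWNED_POLYGON"},
    {"brands": "Valvoline", "enclosed": False, "polygon_class": "OWNED_POLYGON"},
    {"brands": "Valvoline", "enclosed": False, "polygon_class": "SHARED_POLYGON"},
    {"brands": "Arvest Bank", "enclosed": True, "polygon_class": "SHARED_POLYGON"},  # excluded: enclosed
    {"brands": "", "enclosed": False, "polygon_class": "SHARED_POLYGON"},            # excluded: unbranded
]
owned_pct = pct_owned_branded_non_enclosed(pois)
print(owned_pct)
```

If `enclosed` arrives as text from a CSV, convert it to a bool first, since a non-empty string such as "FALSE" is truthy in Python.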
Bug Fixes and Known Issues - Geometry
- Centroid-Radius Polygons -- As discussed in the March 2019 release notes, we internally track centroid-radius polygons vs precise polygons and strive for 100% precise polygons. You can measure this yourself using the `is_synthetic` column.
  - This release, we saw a decrease to 94.2% precise polygons (95.8% last month).
  - This is primarily due to the +123k cemetery POIs which do not have precise polygons. For some categories, it does not make sense to source a precise polygon. See here for a running list of `naics_codes` that are purposely left as synthetic.
  - Here is how we are tracking on this metric across releases: Centroid-Radius Polygon Tracking.
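To measure the precise-polygon share yourself, and to see which categories account for the synthetic polygons, you can tally `is_synthetic` overall and per `naics_code`. The sketch below assumes rows are dicts, `is_synthetic` is already a Python bool, and a true value marks a centroid-radius polygon. The function names and sample rows are hypothetical.

```python
from collections import defaultdict

def pct_precise(rows):
    """Percent of rows whose is_synthetic flag is False (a precise polygon)."""
    if not rows:
        return None
    precise = sum(1 for row in rows if not row["is_synthetic"])
    return 100.0 * precise / len(rows)

def synthetic_counts_by_naics(rows):
    """Count synthetic (centroid-radius) polygons per naics_code."""
    counts = defaultdict(int)
    for row in rows:
        if row["is_synthetic"]:
            counts[row["naics_code"]] += 1
    return dict(counts)

# Hypothetical sample rows.
pois = [
    {"naics_code": "812220", "is_synthetic": True},
    {"naics_code": "812220", "is_synthetic": True},
    {"naics_code": "447110", "is_synthetic": False},
    {"naics_code": "447110", "is_synthetic": False},
    {"naics_code": "531210", "is_synthetic": False},
]
precise_pct = pct_precise(pois)
by_naics = synthetic_counts_by_naics(pois)
print(precise_pct, by_naics)
```

Sorting the per-category counts in descending order shows quickly whether one category, such as the cemeteries added this month, explains a drop in the overall share.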
Enhancements - Patterns
- In last month's delivery, SG Patterns had 4,095,560 points-of-interest (US only). This month, SG Patterns has 4,186,911 points-of-interest (US only) (net +91,351). 📈
- Last month, SG Patterns had 850,573,530 visits from 35,756,143 visitors. This month, SG Patterns has 899,790,036 visits from 34,787,736 visitors (delta +49,216,506 visits, -968,407 visitors).
**In case you missed it,** check out [last month's release notes](https://docs.safegraph.com/changelog/october-2020-release-notes). 📝
**Calculating Diffs**
Curious to find the specific records that were either **added, deleted, or saw an attribute change** from one release to the next? Visit "Calculating Diffs" in our [Data Science Resources](https://docs.safegraph.com/docs/data-science-resources#section-calculating-diffs) to get started.
**Fill Rates**
See the [Summary Statistics](https://docs.safegraph.com/docs/places-summary-statistics) page for all Core and Geometry column fill rates as well as a breakdown of POI count by `naics_code`.
**Explore**
Browse SafeGraph Core & Geometry data at your own pace [in these webmaps.](https://storymaps.arcgis.com/stories/8e5e066486f94f0ea698e507d46987f7)
**Also check out these new ways to get SafeGraph data:**
* Need some extra data or other SafeGraph products? Check out the [SafeGraph Data Bar.](https://shop.safegraph.com/)
* Heavy AWS User? Check out our [listings in the AWS Data Exchange](https://aws.amazon.com/marketplace/search/results?filters=vendor_id&vendor_id=7d5ff8ca-105f-4856-9d99-5f2f1d83223c).
* Are you an Esri or ArcGIS user? Check out our FREE data [SafeGraph Places in the Esri Marketplace](https://marketplace.arcgis.com/listing.html?id=3425348e4bee4059af2b353e52df43c2) and enjoy [SafeGraph Places in Esri Basemaps](https://www.esri.com/arcgis-blog/products/arcgis-living-atlas/mapping/new-places-in-esri-vector-basemaps/).
* Snowflake user? Check out our page on the [Snowflake Data Exchange](https://www.snowflake.com/datasets/safegraph/) ❄️
* Or just drop us a line! Your data needs are our data delights!