November-2020 Release Notes

Welcome to the November 2020 release. We're thankful to serve up a squeaky clean delivery of Places data 🦃 (2020-10-28/1603882929 shipped 2020-11-05).

Highlights

  • +76 new brands 🎊
  • Record low levels of safegraph_place_id churn 💯

Enhancements - Core Places and Brands

  • Last month, SG Places had 5,933,243 points-of-interest. This month, SG Places has 6,160,574 points-of-interest (net +227,331 places): +224,741 US places and +2,590 CA places.

  • We've added +76 brands including +25 Gasoline Stations with Convenience Stores ⛽️ and +10 Commercial Banking 🏦
    New Brands Include...

    • Worldmark by Wyndham ((wyndhamhotels.com/hotel-deals/wbw), SG_BRAND_bdcca340a8c58a10), parent brand: (Wyndham Hotels and Resorts, SG_BRAND_0aba43027dab3087acff1cb8f463fa22) with 134 US and 10 CA places.
    • Arvest Bank ((arvest.com), SG_BRAND_1c6d6c909e0cfe06231244705c6defdb) with 267 US and 0 CA places.
    • Finance of America Mortgage ((foamortgage.com), SG_BRAND_cf94978cc72d2695) with 214 US and 0 CA places.
    • Valvoline Express Care ((expresscare.com), SG_BRAND_dc39cf07bbbb0f88), parent brand: (Valvoline, SG_BRAND_39e16e9497a95ca0) with 256 US and 19 CA places.
    • GoLo ((gologas.com), SG_BRAND_caf1c8f9222e9c6d) with 86 US and 0 CA places.
    • and 71 more!

Bug Fixes and Known Issues - Core Places and Brands

  • We discovered a few brand count fluctuations as a result of updated sourcing and other metadata bugs. These corrections resulted in significant changes in the total number of POIs for each affected brand, but the new count is correct. For transparency, we'd like to list some of these corrections as examples in no particular order:

    • Habitat (SG_BRAND_78de657ea66ed5e1ca8e746054ec250c). Net POI count change: US: -702 CA: 0. Bug: Created child brand for "Habitat for Humanity ReStore" (SG_BRAND_b4f10df5ee241c57).
    • Giant Convenience Stores (SG_BRAND_2f3a0cec435bfb6b). Net POI count change: US: -249 CA: 0. Bug: This brand has been deprecated. Locations have actually been attached to brand "Speedway" (SG_BRAND_93b3d23529ebed78cdf4bfe2e58f1979) since 2019 -- for which we have all POIs.
    • Highs Dairy Stores (SG_BRAND_4504a971d37aaa23). Net POI count change: US: -49 CA: 0. Bug: This brand has been deprecated as it is redundant/duplicate with "High's" (SG_BRAND_dcd331cebf10f17f).
    • PlainsCapital Bank (SG_BRAND_2d3b83c8d86a9869f133e566dd13f9b6). Net POI count change: US: -24 CA: 0. Bug: Previously included ATMs.
    • Advance Auto Parts (SG_BRAND_f6690ed6fac1b97d75d2ea16f2eb0e6d). Net POI count change: US: +890 CA: 0. Bug: Previously incorrectly marked locations as closed.

Enhancements - Categories

  • By popular demand, we increased our coverage for gas station and convenience store POIs, and we ironically added a spooky dose of cemeteries just in time for Halloween 🎃. The top 3 count increases by category are as follows:
    • Cemeteries and Crematories (812220). Net POI count change: US: +123,229 CA: -1. ⚰️
    • Gasoline Stations with Convenience Stores (447110). Net POI count change: US: +23,291 CA: +1,742. ⛽️
    • Offices of Real Estate Agents and Brokers (531210). Net POI count change: US: +5,511 CA: -10. 🏠

Category Fill Rate -- We monitor category fill rate with 3 metrics: (1) category fill rate across the entire dataset, (2) category fill rate for branded POIs, (3) category fill rate in the brand_info file (brand-level categories). We want all of these numbers to be 100%.

  • (1) All POI category fill rate. Last month 99.2%. This month 99.2%.
  • (2) Branded POI category fill rate. Last month 100%. This month 100% 💯
  • (3) Brand-level category fill rate (brand_info file). Last month 100%. This month 100% 💯
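
As a rough illustration of how the three fill-rate metrics above could be computed, here is a minimal Python sketch. It assumes POI rows carry a `naics_code` column and a `brands` column that is empty for non-branded POIs, and that brand_info rows also carry `naics_code`. The helper `fill_rate` and all sample rows are hypothetical and made up for illustration; this is not SafeGraph's internal method.

```python
def fill_rate(rows, column):
    """Percentage of rows whose `column` is present and non-empty."""
    rows = list(rows)
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    return 100.0 * filled / len(rows)


# Made-up rows for illustration only; not real SafeGraph records.
places = [
    {"safegraph_place_id": "sg:a", "brands": "Brand A", "naics_code": "522110"},
    {"safegraph_place_id": "sg:b", "brands": "", "naics_code": "812220"},
    {"safegraph_place_id": "sg:c", "brands": "", "naics_code": ""},
    {"safegraph_place_id": "sg:d", "brands": "Brand B", "naics_code": "447110"},
]
brand_info = [
    {"safegraph_brand_id": "SG_BRAND_example1", "naics_code": "522110"},
    {"safegraph_brand_id": "SG_BRAND_example2", "naics_code": "447110"},
]

# (1) all POIs, (2) branded POIs only, (3) brand-level rows
all_poi_rate = fill_rate(places, "naics_code")
branded_poi_rate = fill_rate([r for r in places if r["brands"]], "naics_code")
brand_level_rate = fill_rate(brand_info, "naics_code")

print(f"all POI: {all_poi_rate:.1f}%")
print(f"branded POI: {branded_poi_rate:.1f}%")
print(f"brand-level: {brand_level_rate:.1f}%")
```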

Drops ⬇️

  • We constantly ingest data from new sources, and many safegraph_place_ids (sgpids) are intentionally dropped, but we are unable to track each and every dropped sgpid. In this release:

    • We dropped 16,665 sgpids (7,501 branded and 9,164 non-branded).
    • 3k dropped due to POI source fluctuations
    • 1.5k dropped as a result of bug fixes for branded POIs 🐛
    • 6k dropped as a result of deduplication 👯‍♂️
    • 3.5k dropped due to permanent closures ❌ (dropped but not lost -- you will still see these POIs if you get the open/close columns).
  • The remaining drops are undesired failures to maintain a consistent sgpid between releases - known as bad sgpid churn (see discussion in March 2019 release). We are continuing to work on better metrics to distinguish good vs. bad churn.

  • Last month, we cofounded the Placekey initiative and added placekey as a unique and persistent identifier for all POIs in the SafeGraph dataset. See here for more on how Placekey is unlocking access to geospatial data across industries.
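
To make the drop breakdown above easier to sanity-check, the short Python sketch below adds up the itemized reasons and subtracts them from the total. The itemized figures are rounded ("3k", "1.5k", and so on) in these notes, so the residual is only an approximate upper bound on bad sgpid churn, not an official figure. Variable names are illustrative.

```python
# Figures taken from the Drops section of these release notes.
total_dropped = 16_665
branded_dropped = 7_501
non_branded_dropped = 9_164

# Itemized reasons are rounded in the notes, so treat them as approximate.
explained = {
    "POI source fluctuations": 3_000,
    "bug fixes for branded POIs": 1_500,
    "deduplication": 6_000,
    "permanent closures": 3_500,
}

explained_total = sum(explained.values())
approx_bad_churn = total_dropped - explained_total

# The branded / non-branded split should account for every dropped sgpid.
split_is_consistent = branded_dropped + non_branded_dropped == total_dropped

print(f"explained drops: ~{explained_total:,}")
print(f"remaining (approx. bad churn): ~{approx_bad_churn:,}")
print(f"branded + non-branded matches total: {split_is_consistent}")
```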

Enhancements - Geometry

  • While OWNED polygons are preferred, it does not mean that SHARED polygons are inherently bad. It only means that the exact shape of each POI within the polygon is not discernible, but the general location can be identified by the centroid (latitude & longitude). 🎯

  • When enclosed = FALSE, it indicates that there are reasonable means to derive a unique polygon for the POI (even when parent_safegraph_place_id is not null), and we strive for 100% of branded, non-enclosed POIs to have polygon_class = "OWNED_POLYGON".

  • Last month, the percentage of OWNED polygons for branded, non-enclosed POIs was 80.0%.

  • This month, the percentage of OWNED polygons for branded, non-enclosed POIs is 79.5%. 📉
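
The metric above could be reproduced along these lines. This is a minimal sketch, assuming each POI row exposes `brands` (empty when non-branded), a boolean `enclosed`, and `polygon_class`; the value `SHARED_POLYGON` and the helper `owned_share` are assumptions for illustration, and the sample rows are made up.

```python
def owned_share(rows):
    """Percent of branded, non-enclosed POIs whose polygon_class is OWNED_POLYGON."""
    eligible = [r for r in rows if r["brands"] and not r["enclosed"]]
    if not eligible:
        return 0.0
    owned = sum(1 for r in eligible if r["polygon_class"] == "OWNED_POLYGON")
    return 100.0 * owned / len(eligible)


# Made-up rows for illustration only; not real SafeGraph records.
pois = [
    {"brands": "Brand A", "enclosed": False, "polygon_class": "OWNED_POLYGON"},
    {"brands": "Brand A", "enclosed": False, "polygon_class": "SHARED_POLYGON"},
    {"brands": "Brand B", "enclosed": False, "polygon_class": "OWNED_POLYGON"},
    {"brands": "Brand B", "enclosed": False, "polygon_class": "OWNED_POLYGON"},
    {"brands": "Brand B", "enclosed": True, "polygon_class": "SHARED_POLYGON"},  # enclosed: excluded
    {"brands": "", "enclosed": False, "polygon_class": "OWNED_POLYGON"},  # non-branded: excluded
]

share = owned_share(pois)
print(f"OWNED share among branded, non-enclosed POIs: {share:.1f}%")
```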

Bug Fixes and Known Issues - Geometry

Enhancements - Patterns

  • In last month's delivery, SG Patterns had 4,095,560 points-of-interest (US only). This month, SG Patterns has 4,186,911 points-of-interest (US only) (net +91,351). 📈

  • Last month, SG Patterns had 850,573,530 visits from 35,756,143 visitors. This month, SG Patterns has 899,790,036 visits from 34,787,736 visitors (delta +49,216,506 visits, -968,407 visitors).


**In case you missed it,** check out [last month's release notes](https://docs.safegraph.com/changelog/october-2020-release-notes). 📝

**Calculating Diffs**
Curious to find the specific records that were either **added, deleted, or saw an attribute change** from one release to the next? Visit "Calculating Diffs" in our [Data Science Resources](https://docs.safegraph.com/docs/data-science-resources#section-calculating-diffs) to get started. 
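
For a sense of what such a diff involves, here is a minimal Python sketch that compares two releases keyed on `safegraph_place_id`. It is an illustration under simple assumptions (each release fits in memory as a list of dict rows, and any differing field counts as a change), not the method described on the linked page. The function `diff_releases` and the sample rows are hypothetical.

```python
def diff_releases(old_rows, new_rows, key="safegraph_place_id"):
    """Return (added, deleted, changed) lists of keys between two releases."""
    old_by_id = {row[key]: row for row in old_rows}
    new_by_id = {row[key]: row for row in new_rows}

    added = sorted(new_by_id.keys() - old_by_id.keys())
    deleted = sorted(old_by_id.keys() - new_by_id.keys())
    # A record is "changed" if it exists in both releases but any field differs.
    changed = sorted(
        k for k in old_by_id.keys() & new_by_id.keys()
        if old_by_id[k] != new_by_id[k]
    )
    return added, deleted, changed


# Made-up rows for illustration only; not real SafeGraph records.
october = [
    {"safegraph_place_id": "sg:a", "location_name": "Cafe One", "naics_code": "722515"},
    {"safegraph_place_id": "sg:b", "location_name": "Bank Two", "naics_code": "522110"},
    {"safegraph_place_id": "sg:c", "location_name": "Gas Three", "naics_code": ""},
]
november = [
    {"safegraph_place_id": "sg:a", "location_name": "Cafe One", "naics_code": "722515"},
    {"safegraph_place_id": "sg:c", "location_name": "Gas Three", "naics_code": "447110"},
    {"safegraph_place_id": "sg:d", "location_name": "Cemetery Four", "naics_code": "812220"},
]

added, deleted, changed = diff_releases(october, november)
print("added:", added)
print("deleted:", deleted)
print("changed:", changed)
```

For full-size deliveries, the same set logic would more likely be expressed as a join in a dataframe library or SQL rather than in-memory dicts.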

**Fill Rates**
See the [Summary Statistics](https://docs.safegraph.com/docs/places-summary-statistics) page for all Core and Geometry column fill rates as well as a breakdown of POI count by `naics_code`.

**Explore**
Browse SafeGraph Core & Geometry data at your own pace [in these webmaps.](https://storymaps.arcgis.com/stories/8e5e066486f94f0ea698e507d46987f7)

**Also check out these new ways to get SafeGraph data:**
  * Need some extra data or other SafeGraph products? Check out the [SafeGraph Data Bar.](https://shop.safegraph.com/) 
  * Heavy AWS user? Check out our [listings in the AWS Data Exchange](https://aws.amazon.com/marketplace/search/results?filters=vendor_id&vendor_id=7d5ff8ca-105f-4856-9d99-5f2f1d83223c).
  * Are you an Esri or ArcGIS user? Check out our FREE data [SafeGraph Places in the Esri Marketplace](https://marketplace.arcgis.com/listing.html?id=3425348e4bee4059af2b353e52df43c2) and enjoy [SafeGraph Places in Esri Basemaps](https://www.esri.com/arcgis-blog/products/arcgis-living-atlas/mapping/new-places-in-esri-vector-basemaps/). 
  * Snowflake user? Check out our page on the [Snowflake Data Exchange](https://www.snowflake.com/datasets/safegraph/). ❄️
  * Or just drop us a line! Your data needs are our data delights!