January-2021 Release Notes

8 months ago by [email protected]

We made it! With 2020 in the rearview mirror, we are thrilled to kick off the new year with a fresh serving of places data 💥. Welcome to the January 2021 release notes (2020-12-30/1609326198 shipped 2021-01-06).

Highlights

  • Placekey makes its Canadian debut 🔑 🇨🇦
  • Warehouses, Fulfillment, and Logistics, Oh My! 📦

Core Places and Brands

Enhancements - Core Places and Brands

  • placekey and parent_placekey columns are now populated for Canadian POIs! 🍁 🎉 See the October-2020 release notes to learn more about Placekey and why we've embraced it as our unique and persistent ID going forward.

  • Last month, SG Places had 6,874,427 points-of-interest (including closed POIs). This month, SG Places has 6,996,247 points-of-interest (net +121,820 places). The increase comprises +111,316 US places and +10,504 CA places.

  • We've added +13 brands, including +5 General Warehousing and Storage (493110) 📦 and +2 Couriers and Express Delivery Services (492110) 🚚
    New brands include:

    • DHL (SG_BRAND_53c6f3cca240d98b5e648d93115a2426) with 3,935 US and 435 CA places.
    • UPS Fulfillment Center (SG_BRAND_8def7a58bd9b334b) with 101 US and 7 CA places.
    • Walmart Distribution (SG_BRAND_89448bb0e8f0f599), parent brand: (Walmart inc, SG_BRAND_f49758461e68a211) with 161 US and 0 CA places.
    • Target Distribution (SG_BRAND_78b87cf57fd9d6a9), parent brand: (Target, SG_BRAND_42aefbae01d2dfd981f7da7d823d689e) with 46 US and 0 CA places.
    • XPO (SG_BRAND_aff8016fa2dcc87c) with 739 US and 31 CA places.
    • and 8 more!

Brand Openings and Closings

  • We rely on POI metadata to track store openings and closings, and we are especially interested in understanding open/close dates for branded POIs.

Bug Fixes and Known Issues - Core Places and Brands

  • We discovered a few brand count fluctuations caused by updated sourcing and other metadata bugs. The corrections significantly changed the total number of POIs for each affected brand, but the new counts are correct. For transparency, here are some of these corrections as examples, in no particular order:

    • Tri Counties Bank (SG_BRAND_2e511958d4d180bd). Net POI count change: US: -101 CA: 0. Bug: Previously included ATMs.
    • Tire Factory (SG_BRAND_bc406a64fab678239c6554ce7ec498c6). Net POI count change: US: -128 CA: 0. Bug: Previously included affiliates.
    • Beard Papa (SG_BRAND_34a4826e2c2094cd). Net POI count change: US: -29 CA: 0. Bug: "Beard Papas" (SG_BRAND_6361da4998eb4d40) already exists as a brand.
    • SuperAmerica (SG_BRAND_c662514f0034e025a5af8c9cd2c042e7). Net POI count change: US: -257 CA: 0. Bug: Acquired by "Speedway" (SG_BRAND_93b3d23529ebed78cdf4bfe2e58f1979). Dropped POIs are duplicates with existing Speedway POIs.
    • Sterling Auto (SG_BRAND_81f69714318b9a3b21f8c43081dd8e98). Net POI count change: US: -336 CA: 0. Bug: Duplicates with existing brand "Service King" (SG_BRAND_5bf6feaed2572937a3b05373dd790150).

Enhancements - Categories

  • By popular demand, we continue to focus on sourcing more industrial POIs 🏭. Below are some noteworthy category count changes:
    • General Warehousing and Storage (493110). Net POI count change: US: +915 CA: +27 📦
    • Couriers and Express Delivery Services (492110). Net POI count change: US: +4,843 CA: +460 🚚
    • Commercial Banking (522110). Net POI count change: US: -14,661 CA: -797. Thousands of non-branded POIs were previously classified as commercial banks, but most are actually credit unions 🏦.
    • Credit Unions (522130). Net POI count change: US: +13,470 CA: +775. Mostly due to the commercial banking correction described above.

Category Fill Rate: We monitor category fill rate with 3 metrics: (1) category fill rate across the entire dataset, (2) category fill rate for branded POIs, and (3) category fill rate in the brand_info file (brand-level categories). We want all of these numbers to be 100%.

  • (1) All POI category fill rate. Last month 99.2%. This month 99.2%.
  • (2) Branded POI category fill rate. Last month 100%. This month 100% 💯
  • (3) Brand-level category fill rate (brand_info file). Last month 100%. This month 100% 💯
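
As a rough illustration of the three metrics above, here is a minimal Python sketch. It assumes each record is a dict, that the category lives in a naics_code field, and that branded POIs can be recognized by a non-empty brands field; the brands field name and all sample values are hypothetical toy data, not taken from a real release.

```python
def fill_rate(rows, column="naics_code"):
    """Return the percentage of rows whose `column` is populated."""
    rows = list(rows)
    if not rows:
        return 0.0
    filled = sum(1 for row in rows if row.get(column) not in (None, ""))
    return 100.0 * filled / len(rows)


# Hypothetical toy records standing in for the POI file.
# The `brands` field name is an assumption made for this sketch.
pois = [
    {"placekey": "pk-1", "brands": "Brand A", "naics_code": "492110"},
    {"placekey": "pk-2", "brands": None, "naics_code": "522130"},
    {"placekey": "pk-3", "brands": None, "naics_code": None},
    {"placekey": "pk-4", "brands": "Brand B", "naics_code": "493110"},
]

# Hypothetical toy records standing in for the brand_info file.
brand_info = [
    {"brand_name": "Brand A", "naics_code": "492110"},
    {"brand_name": "Brand B", "naics_code": "493110"},
]

all_rate = fill_rate(pois)                                  # metric (1)
branded_rate = fill_rate(r for r in pois if r["brands"])    # metric (2)
brand_level_rate = fill_rate(brand_info)                    # metric (3)

print(f"all POIs: {all_rate:.1f}%")
print(f"branded POIs: {branded_rate:.1f}%")
print(f"brand_info: {brand_level_rate:.1f}%")
```

The same helper serves all three metrics; only the set of rows passed in changes.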

Drops ⬇️

  • We constantly ingest data from new sources, and many safegraph_place_ids (sgpids) are intentionally dropped, but we are unable to track each and every dropped sgpid. The following metrics track safegraph_place_id drop reasons across open and closed POIs.

    • We dropped 41,123 sgpids (17,802 branded and 23,321 non-branded).
      • ~19k dropped due to POI source upgrades
      • ~3k dropped due to standardizing messy street addresses
      • ~2k dropped as a result of bug fixes for branded POIs 🐛
      • ~4k dropped as a result of deduplication 👯‍♂️
  • The remaining drops are undesired failures to maintain a consistent sgpid between releases - known as bad sgpid churn (see discussion in March 2019 release). We are continuing to work on better metrics to distinguish good vs. bad churn.
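
To make the "remaining drops" figure concrete, the arithmetic can be sketched in a few lines of Python. The per-reason counts are the rounded "~" values listed above, so the remainder is only an approximation; the variable names are illustrative.

```python
# Totals reported in this release.
total_dropped = 41_123
branded_dropped = 17_802
non_branded_dropped = 23_321

# Rounded per-reason counts from the list above (approximate values).
explained = {
    "POI source upgrades": 19_000,
    "standardizing messy street addresses": 3_000,
    "bug fixes for branded POIs": 2_000,
    "deduplication": 4_000,
}

explained_total = sum(explained.values())
remaining = total_dropped - explained_total

print(f"branded + non-branded: {branded_dropped + non_branded_dropped:,}")
print(f"explained drops: ~{explained_total:,}")
print(f"remaining (bad churn): ~{remaining:,}")
```

With these rounded inputs, roughly 13k of the 41,123 dropped sgpids fall into the bad-churn bucket.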

Geometry

Enhancements - Geometry

  • While OWNED polygons are preferred, it does not mean that SHARED polygons are inherently bad. It only means that the exact shape of each POI within the polygon is not discernible, but the general location can be identified by the centroid (latitude & longitude). 🎯

  • enclosed = FALSE indicates that there are reasonable means to derive a unique polygon for the POI (even when parent_safegraph_place_id is not null), and we strive for 100% of branded, non-enclosed POIs to have polygon_class = "OWNED_POLYGON".
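
One way to check progress toward that goal on your own copy of the data is sketched below in Python. The enclosed and polygon_class fields come from the text above; the brands field, the "SHARED_POLYGON" value, and the sample rows are assumptions made for illustration.

```python
def owned_polygon_share(rows):
    """Percentage of branded, non-enclosed POIs with an OWNED_POLYGON.

    Returns None when no POI is both branded and non-enclosed.
    """
    targets = [r for r in rows if r["brands"] and not r["enclosed"]]
    if not targets:
        return None
    owned = sum(1 for r in targets if r["polygon_class"] == "OWNED_POLYGON")
    return 100.0 * owned / len(targets)


# Hypothetical toy rows; `brands` and "SHARED_POLYGON" are assumed names.
geometry_rows = [
    {"brands": "Brand A", "enclosed": False, "polygon_class": "OWNED_POLYGON"},
    {"brands": "Brand A", "enclosed": False, "polygon_class": "SHARED_POLYGON"},
    {"brands": "Brand B", "enclosed": True, "polygon_class": "SHARED_POLYGON"},
    {"brands": None, "enclosed": False, "polygon_class": "OWNED_POLYGON"},
]

share = owned_polygon_share(geometry_rows)
print(f"branded, non-enclosed POIs with OWNED_POLYGON: {share:.1f}%")
```

Enclosed POIs and non-branded POIs are excluded from the denominator, since the stated target only covers branded, non-enclosed POIs.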

Bug Fixes and Known Issues - Geometry

Patterns

Enhancements - Patterns

  • In last month's delivery, SG Patterns had 4,393,086 points-of-interest (US only). This month, SG Patterns has 4,412,421 points-of-interest (US only) (net +19,335). 📈

  • Last month, SG Patterns had 778,401,379 visits from 35,531,210 visitors. This month, SG Patterns has 818,360,020 visits from 35,436,537 visitors (delta +39,958,641 visits, -94,673 visitors).

  • Interested in foot traffic patterns for Canadian POIs? Reach out to your customer success manager for a sample! 🇨🇦

  • Want to see more generalized foot traffic flows? Check out SafeGraph Neighborhood Patterns 🚗 🌇

~~~~

In case you missed it, check out last month's release notes.

Calculating Diffs
Curious to find the specific records that were either added, deleted, or saw an attribute change from one release to the next? Visit "Calculating Diffs" in our Data Science Resources to get started.
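
For a sense of what such a diff involves, here is a minimal Python sketch. It is not the method from the "Calculating Diffs" resource; it simply assumes both releases are loaded as lists of dicts keyed by placekey, and the sample records and the diff_releases helper are hypothetical.

```python
def diff_releases(previous, current, key="placekey"):
    """Compare two releases and return (added, deleted, changed).

    `changed` maps each key present in both releases to the sorted list
    of columns whose values differ.
    """
    prev = {row[key]: row for row in previous}
    curr = {row[key]: row for row in current}

    added = sorted(curr.keys() - prev.keys())
    deleted = sorted(prev.keys() - curr.keys())

    changed = {}
    for k in sorted(prev.keys() & curr.keys()):
        columns = set(prev[k]) | set(curr[k])
        diffs = sorted(c for c in columns if prev[k].get(c) != curr[k].get(c))
        if diffs:
            changed[k] = diffs
    return added, deleted, changed


# Hypothetical toy releases.
previous_release = [
    {"placekey": "pk-1", "location_name": "First Bank", "naics_code": "522110"},
    {"placekey": "pk-2", "location_name": "Old Depot", "naics_code": "493110"},
]
current_release = [
    {"placekey": "pk-1", "location_name": "First Bank", "naics_code": "522130"},
    {"placekey": "pk-3", "location_name": "New Hub", "naics_code": "492110"},
]

added, deleted, changed = diff_releases(previous_release, current_release)
print("added:", added)
print("deleted:", deleted)
print("changed:", changed)
```

Here pk-3 is reported as added, pk-2 as deleted, and pk-1 as changed in naics_code. For full-size releases a dataframe or SQL join would likely be more practical than in-memory dicts.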

Fill Rates
See the Summary Statistics page for all Core and Geometry column fill rates as well as a breakdown of POI count by naics_code.

Explore
Browse SafeGraph Core & Geometry data at your own pace in these webmaps.

Also check out these new ways to get SafeGraph data: