Can't beat the heat? 🥵 We're serving up an ocean of cool new data! Welcome to the July 2021 release notes (2021-06-10/1623334361 shipped 2021-07-07).
- +1 Patterns column
- +2 Brand Info columns
- -3 Core Places columns
- "POINT" POIs make their Places debut
- Core Places & Brands
- Geometry - Enhancements
- Geometry - Bug Fixes & Known Issues
- After 3+ years of uniquely identifying SafeGraph Places, we have officially retired `parent_safegraph_place_id` into SafeGraph lore. We learned a lot from building `safegraph_place_id` and are excited to rely exclusively on `parent_placekey` as our unique and persistent ID moving forward 🔑. Learn more about how Placekey is unlocking access to spatial data here.
- We also retired `tracking_opened_since`, as it did not add value to our efforts to communicate open/close dates for places. If a POI has an `opened_on` value, it implies we've been tracking it since that date, and if a POI does not have an `opened_on` value, it implies we were not able to track the exact date it opened. See more about how we track new store openings and permanent store closures here.
- As we work to expand into new countries, it's useful to show which countries are covered for each brand. The Brand Info file now includes two additional columns listing the countries in which a brand has at least one open POI (`iso_country_codes_open`) and at least one closed POI (`iso_country_codes_closed`) 🌎 🌍. Please note that these new JSON columns have quotes escaped differently than other SafeGraph JSON columns, and you may notice some undesirable "" in the data depending on your tech stack. This will be corrected in the August release to align with how we have always escaped quotes for JSON columns.
- Last month, SG Places had 8,413,852 points-of-interest (including closed POIs). This month, SG Places has 8,638,522 points-of-interest (net +224,670 places). These are +174,961 `US` Places 🇺🇸, +14,027 `CA` Places 🇨🇦, and +35,682 `GB` Places 🇬🇧.
- We've added 94 brands (+68 with 🇺🇸 coverage, +44 with 🇨🇦 coverage, +15 with 🇬🇧 coverage) including:
- Allpoint ATM (`SG_BRAND_11f4c85f01baedd5040bb96211cebbf1`) with 38,517
`GB` Places, and 1,436
- Ziggi's Coffee (`SG_BRAND_d7dca165be7d62b2`) with 33
- Americold Logistics (`SG_BRAND_8744f79535699202`) with 194
`CA` Places, and 2
- +24 Commercial Banking brands (522110), 13 of which are `geometry_type` = "POINT" ATM brands
- +7 Limited-Service Restaurants brands (722511)
- View the full list here
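For the quote-escaping caveat on `iso_country_codes_open` and `iso_country_codes_closed` noted above, one way to guard a pipeline until the August correction is to parse those cells defensively. The following is a minimal sketch, not a SafeGraph utility: it assumes the stray `""` comes from CSV-style quote doubling and that each cell holds a JSON array of country codes. The `parse_country_codes` helper and the sample strings are hypothetical.

```python
import json
from typing import List


def parse_country_codes(cell: str) -> List[str]:
    """Parse a JSON array cell, tolerating CSV-style doubled quotes.

    Assumption: the undesirable "" is a doubled double quote left over
    from CSV escaping. This helper is illustrative, not a SafeGraph API.
    """
    text = cell.strip()
    if not text:
        return []
    try:
        # Well-formed JSON needs no repair.
        return json.loads(text)
    except json.JSONDecodeError:
        # Collapse doubled quotes, then drop one layer of wrapping quotes.
        text = text.replace('""', '"')
        if text.startswith('"[') and text.endswith(']"'):
            text = text[1:-1]
        return json.loads(text)


# Hypothetical sample cells: one clean, one with doubled quotes.
clean_cell = '["US","CA"]'
doubled_cell = '"[""US"",""CA""]"'
print(parse_country_codes(clean_cell))
print(parse_country_codes(doubled_cell))
```

Reading the file with a CSV parser that understands doubled quotes (Python's `csv` module does by default) may make the fallback branch unnecessary; the helper only matters when the raw text reaches your JSON parser unchanged.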
We rely on POI metadata to track store openings and closings, and we are especially interested in understanding open/close dates for branded POIs. It can take more than a month to infer open/close dates, so we report brand open/close metrics on a one-month delay.
In this release, we flagged 175 brands with at least one store closure in May 2021, and 197 brands with at least one store opening in May 2021.
- In the spirit of breaking traditions, we now provide unique types of places that are not defined by polygons in SafeGraph Geometry. We've added 182k "point-only" POIs to our Core Places offering. These include:
- 146k ATMs: `naics_code` = 522110 (Commercial Banking)
- 36k electric vehicle charging stations: `naics_code` = 447190 (Other Gasoline Stations) 🚗 ⚡️
- These premium rows are available upon request and are distinguished by having a "POINT" value in the new `geometry_type` column (positioned at the end of the Core Places schema). All traditional SafeGraph Places have "POLYGON" in this column.
- Stay tuned for additional "POINT" POIs in the works (kiosks and transit stops!) and reach out to your Customer Success Manager or contact sales to learn more.
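If your pipeline assumes every row has a polygon, the `geometry_type` column described above is the place to branch. Here is a small sketch of that split using only the standard library. The three sample rows and their `placekey` values are made up for illustration; only the column names `naics_code` and `geometry_type` and the "POINT"/"POLYGON" values come from these notes.

```python
import csv
import io

# Hypothetical three-row extract. The placekey values are placeholders,
# not real Placekeys.
SAMPLE_CSV = """placekey,naics_code,geometry_type
pk-001,522110,POINT
pk-002,447190,POINT
pk-003,722511,POLYGON
"""


def split_by_geometry_type(rows):
    """Separate "POINT" rows from traditional "POLYGON" rows."""
    points, polygons = [], []
    for row in rows:
        if row["geometry_type"] == "POINT":
            points.append(row)
        else:
            polygons.append(row)
    return points, polygons


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
points, polygons = split_by_geometry_type(rows)
print(len(points), "point-only POIs;", len(polygons), "polygon POIs")
```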
We ingest data from many sources, and due to source changes and processing changes, Placekeys churn over time. In this release, we dropped 150,762 Placekeys (31,535 branded and 119,227 non-branded). To keep track of the status, predecessors, and latest successor of each Placekey, hit the Lineage API for free!
Major reasons for drops:
- ~3k dropped as a result of improved deduplication
- ~121k dropped due to changes to the Where part as a result of polygon cleaning and source changes
- Read more about the structure of Placekeys here
- In June, we cleaned out more than 150k redundant, overlapping polygons to improve visualization and simplify visit attribution. 🧹 💯
- We also sourced new polygons for ultra-prominent POIs (Disney Resorts, major casinos in Las Vegas, international airports, etc.) and built logic to ensure that these POIs NEVER lose their polygons or `placekeys` 🎰. As an example, Walt Disney World Resort ([email protected]) now has a polygon more than 20X its original size and contains 234 child POIs (previously, 1 child POI).
- While OWNED polygons are preferred, it does not mean that SHARED polygons are inherently bad. It only means that the exact shape of each POI within the polygon is not discernible, but the general location can be identified by the centroid (
`enclosed` = FALSE, it indicates that there are reasonable means to derive a unique polygon for the POI (even when `parent_placekey` is not null), and we strive for 100% of branded, non-enclosed POIs to have `polygon_class` = "OWNED_POLYGON".
- Last month, the percentage of OWNED polygons for branded, non-enclosed POIs was 74.5%
- This month, the percentage of OWNED polygons for branded, non-enclosed POIs is 74.4%
- Here is how we're tracking on this metric across releases: OWNED vs SHARED Polygons in SafeGraph Places Release History.
- See the September-2020 release notes for details about the `enclosed` column and tweaks to this metric.
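To reproduce an OWNED-polygon percentage like the one above on your own extract, the calculation is a filtered ratio. The sketch below is a rough approximation, not SafeGraph's internal method. It treats a POI as "branded" when a `brands` field is non-empty and uses a "SHARED_POLYGON" value for the other class; both are assumptions not stated in these notes, and the sample rows are invented.

```python
def owned_polygon_share(rows):
    """Percent of branded, non-enclosed POIs with polygon_class OWNED_POLYGON.

    Assumptions: `brands` is an empty string for unbranded POIs and
    `enclosed` is already a boolean.
    """
    eligible = [r for r in rows if r["brands"] and not r["enclosed"]]
    if not eligible:
        return 0.0
    owned = sum(1 for r in eligible if r["polygon_class"] == "OWNED_POLYGON")
    return 100.0 * owned / len(eligible)


# Invented sample: four eligible rows (three OWNED) plus two excluded rows.
sample_rows = [
    {"brands": "Brand A", "enclosed": False, "polygon_class": "OWNED_POLYGON"},
    {"brands": "Brand B", "enclosed": False, "polygon_class": "OWNED_POLYGON"},
    {"brands": "Brand C", "enclosed": False, "polygon_class": "OWNED_POLYGON"},
    {"brands": "Brand D", "enclosed": False, "polygon_class": "SHARED_POLYGON"},
    {"brands": "Brand E", "enclosed": True, "polygon_class": "SHARED_POLYGON"},  # enclosed: excluded
    {"brands": "", "enclosed": False, "polygon_class": "OWNED_POLYGON"},  # unbranded: excluded
]
print(f"{owned_polygon_share(sample_rows):.1f}%")  # 75.0%
```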
We are aware of a single POI that did not behave as expected when sourcing a new polygon and establishing its permanence 😔. Disney's Animal Kingdom ([email protected]) took on a shape far too large for its grounds, and this will be corrected in the August 2021 release.
`building_height` column will be null until further notice. This column only had a ~25% fill rate, and given the challenges in improving that fill rate, we are re-evaluating the best way to source and improve this data going forward.
Centroid-Radius Polygons: As discussed in the March 2019 release notes, we internally track centroid-radius polygons vs. precise polygons and strive for 100% precise polygons. You can measure this yourself using the
Last release, the percent of precise polygons was 96.3%
This release, the percent of precise polygons is also 96.3%
- Here is how we are tracking this metric across releases: Centroid-Radius Polygon Tracking.
- See here for a short list of POI categories for which we do not require precise polygons.
- The July 2021 backfill is here! There is a special "July 2021 Backfill Release Notes" doc just for more in-depth news and guidance around the backfill.
- This backfill restates foot traffic activity from January 1st 2018 - present for Weekly Patterns, Monthly Patterns, and Neighborhood Patterns. 💥
- By popular demand, we have simplified the way we calculate `related_same_day_brand`. Going forward, the value shown will be a simple percentage: the number of visitors to both the brand and the applicable POI, divided by the number of visitors to the POI. The mapping will be limited to the top 20 brands. Previously, we showed a mapping from a brand name to an index meant to represent how strongly related the POI and the other brands are. Brands that are generally popular were “penalized” in calculating this index, but this was overly complicated. The format will remain the same.
- `related_same_month_brand` (Monthly Patterns) and `related_same_week_brand` (Weekly Patterns) will be modified with the same change as `related_same_day_brand`, but applied to a monthly or weekly time period as applicable. The format will remain the same.
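To make the simplified related-brand value concrete, here is a toy version of the calculation: overlapping visitors divided by visitors to the POI, limited to the top brands. It is an illustrative sketch, not the production pipeline. The device IDs, the brand names, the `related_brand_percentages` helper, and the choice to skip brands with zero overlap and leave values unrounded are all assumptions.

```python
def related_brand_percentages(poi_visitors, brand_visitors, top_n=20):
    """Map brand -> percent of the POI's visitors who also visited the brand.

    poi_visitors: set of device IDs seen at the POI in the period.
    brand_visitors: dict of brand name -> set of device IDs seen at that
    brand in the same period (day, week, or month).
    """
    if not poi_visitors:
        return {}
    shares = {}
    for brand, visitors in brand_visitors.items():
        overlap = len(poi_visitors & visitors)
        if overlap:
            shares[brand] = 100 * overlap / len(poi_visitors)
    # Keep the top_n brands by percentage, breaking ties by brand name.
    ranked = sorted(shares.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:top_n])


# Toy data: ten devices visited the POI.
poi_visitors = set(range(1, 11))
brand_visitors = {
    "Brand A": {1, 2, 3, 4, 5, 11},  # 5 of the POI's 10 visitors
    "Brand B": {1, 2, 12, 13},       # 2 of the POI's 10 visitors
    "Brand C": {20, 21},             # no overlap
}
related = related_brand_percentages(poi_visitors, brand_visitors)
print(related)  # {'Brand A': 50.0, 'Brand B': 20.0}
```

Unlike the retired index, a generally popular brand is not penalized here: its value is simply the share of this POI's visitors who also went there.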
- We are adding a new column called `visitor_home_aggregation` 🎉. This column will work just like `visitor_home_cbgs`, except it will show device origin based on a larger census geography than in the `visitor_home_cbgs` column. This higher-level aggregation lets users see home origins that might otherwise be missed because there was only one device from a given census block group but many from the larger geography.
- For the U.S. 🇺🇸, the larger geography will be census tracts. Census tracts have a population of 1,200-8,000 versus population of 600-3,000 for a census block group.
- For Canada 🇨🇦, the larger geography will be aggregate dissemination areas. Aggregate dissemination areas have a minimum population of 5,000 versus a minimum population of 400 for dissemination areas.
- `visitor_home_aggregation` has an 88.7% fill rate, similar to
- The average row has 16.8 census tracts / aggregate dissemination areas represented by visitors, compared to 20.1 census block groups / dissemination areas.
- Home panel summary: We have added a new column,
home_panel_summary.csv. This will allow users to normalize the
- Finally, we are making some changes to columns now that the Canada release 🍁 is fully incorporated into Weekly Patterns:
- Neighborhood Patterns columns will now include Canadian source cbgs (i.e., with a “CA:” prefix), similar to what happened for Weekly Patterns in the May Release. The rows in Neighborhood Patterns will still be U.S. only.
- For Weekly Patterns supplemental files,
`state` column is renamed to
`visit_panel_summary.csv`, consistent with elsewhere in SafeGraph data.
- Similarly, in Weekly Patterns supplemental files, `iso_country_code` is being moved from the furthest right in the file to the right of `region`. This will occur in
- See Column Ordering in our docs for the latest columns in the schema.
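The benefit of `visitor_home_aggregation` described above can be shown with a toy roll-up. A 12-digit U.S. census block group GEOID begins with the 11-digit GEOID of its census tract, so block groups map to tracts by prefix. The sketch below is illustrative only: the raw per-block-group counts, the `min_devices` reporting threshold, and both helper functions are hypothetical, and Canadian codes with the “CA:” prefix are not handled.

```python
from collections import Counter

# A 12-digit U.S. block group GEOID starts with its 11-digit tract GEOID.
TRACT_GEOID_LENGTH = 11


def roll_up_to_tracts(devices_by_cbg):
    """Sum device counts per census tract from block-group-level counts."""
    tracts = Counter()
    for cbg, devices in devices_by_cbg.items():
        tracts[cbg[:TRACT_GEOID_LENGTH]] += devices
    return dict(tracts)


def reportable(counts, min_devices=2):
    """Drop geographies below a hypothetical minimum device count."""
    return {geo: n for geo, n in counts.items() if n >= min_devices}


# Illustrative raw counts: three block groups in one tract contribute a
# single device each, and one block group in another tract contributes four.
devices_by_cbg = {
    "010010201001": 1,
    "010010201002": 1,
    "010010201003": 1,
    "010010202001": 4,
}

cbg_view = reportable(devices_by_cbg)
tract_view = reportable(roll_up_to_tracts(devices_by_cbg))
print(cbg_view)    # {'010010202001': 4}
print(tract_view)  # {'01001020100': 3, '01001020200': 4}
```

The three single-device block groups vanish from the block-group view but surface as one tract with three devices, which is the kind of origin the new column is meant to reveal. Note that this roll-up needs unthresholded counts, so the published `visitor_home_cbgs` values cannot be summed to reproduce the new column.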
- In last month's delivery, SG Patterns had 4,511,670 points-of-interest (US only). This month, SG Patterns has 4,534,432 points-of-interest (US only) (net +22,762). 📈
- Last month, SG Patterns had 1,031,959,744 visits from 35,337,908 visitors (US only). This month, SG Patterns has 1,030,120,272 visits from 39,848,724 visitors (US only) (delta -1,839,444 visits, +4,510,817 visitors).
- In our Neighborhood Patterns product, where you can see more generalized foot traffic flows, we have:
- 2,112,160,839 raw stops (-93,285,293 from last month)
- 454,541,272 raw devices (449,973 from last month)
Interested in global POI coverage? Reach out to your customer success manager to learn more about how we're thinking about growing coverage internationally. 🌎

**In case you missed it,** check out [last month's release notes](https://docs.safegraph.com/changelog/june-2021-release-notes). 📝

**Calculating Diffs**

Curious to find the specific records that were either **added, deleted, or saw an attribute change** from one release to the next? Visit "Calculating Diffs" in our [Data Science Resources](https://docs.safegraph.com/docs/data-science-resources#section-calculating-diffs) to get started.

**Fill Rates**

See the [Summary Statistics](https://docs.safegraph.com/docs/places-summary-statistics) page for all Core and Geometry column fill rates as well as a breakdown of POI count by `naics_code`.

**Explore**

Browse SafeGraph Core & Geometry data at your own pace [in these webmaps.](https://storymaps.arcgis.com/stories/8e5e066486f94f0ea698e507d46987f7)

**Also check out these new ways to get SafeGraph data:**

* Need data on the fly? [Try our Places API](https://shop.safegraph.com/api)!
* Need some extra data or other SafeGraph products? Check out the [SafeGraph Data Bar.](https://shop.safegraph.com/)
* Heavy AWS User? Check out our [listings in the AWS Data Exchange](https://aws.amazon.com/marketplace/search/results?filters=vendor_id&vendor_id=7d5ff8ca-105f-4856-9d99-5f2f1d83223c).
* Snowflake user? Check out our page on the [Snowflake Data Exchange](https://www.snowflake.com/datasets/safegraph/) ❄️
* Or just drop us a line! Your data needs are our data delights!