Using SafeGraph Patterns With Snowflake + Tableau
Legacy Product
This page references SafeGraph Patterns, Weekly Patterns, and/or Neighborhood Patterns, legacy products that will no longer be available at the start of 2023. If you are interested in foot traffic data, please contact us and we can refer you to a mobility data partner.
SafeGraph Core Places, Patterns, and Open Census Data data are available in Snowflake!
This page provides a simple demo to get you started with working with these datasets in Snowflake. We'll be using the Starbucks sample in Snowflake for this demo. These are free listings on Snowflake's Data Marketplace that represent foot traffic for U.S. Starbucks locations
Snowflake Tutorial: Exploding JSON Columns in SafeGraph data
SafeGraph Core and Patterns datasets contain JSON columns. JSON columns are rich in information, but they need to be exploded to play well in SQL. We will explore how to explode these JSON columns, so that each key-value pair becomes its own row. For more information on handling JSON data, see "How do I work with the Patterns columns that contain JSON?" .
One example of a JSON column is popularity_by_hour
. This column breaks down visitations by hour for a given POI in a given month. In the JSON, the 'value' is 'visits' but it has no 'key'. We therefore must use the index to indicate the 'hour'.
The below SQL snippet uses Starbucks Patterns: we will explode the popularity_by_hour
column for two POI, which we identify and filter by their placekeys (i.e. unique identifier).
Query 1
WITH patterns AS (
SELECT *
FROM STARBUCKS_PATTERNS_SAMPLE.PUBLIC.PATTERNS
WHERE placekey in ('222-224@5z5-3t2-3wk','zzw-223@5z4-rcm-4gk')
),
exploded AS (
SELECT
patterns.placekey
,patterns.location_name
,patterns.city
,expl.INDEX as hour
,expl.VALUE as visits
FROM
patterns,
TABLE(FLATTEN(input => PARSE_JSON(patterns.popularity_by_hour))) as expl
)
SELECT *
FROM exploded
The result set will be 48 rows, representing one row per given POI per given month per hour of the day (2 stores x 1 month x 24 hours).
Demo: Comparison of trade areas for Starbucks locations in DC -- enriched with US Census income data.
For individual store locations, trade area analyses examines the geographical area in which the location transacts business, or its "commercial territory". We can use SafeGraph Patterns data to measure trade area based on real world mobility data.
In contrast to typical methods for measuring trade area based on distance or travel time (isochrone), using Patterns data in this way shows what areas are actually being serviced by a specific store. Mobility data in SafeGraph is aggregated to the Census Block Group (CBG) level, so we are able to join in CBG-level data from the US Census (e.g. median household income).
In this example, we will pull SafeGraph data from Snowflake (using the Starbucks Sample data) and then connect Tableau. First, we will run two queries in Snowflake. Then, we will connect these two queries in Tableau (as views, instead of custom SQL) . Finally, we will join in a .hyper file for the Census Block Group geometries for visualization.
Query 2
CREATE view visits_by_hour AS (
WITH patterns AS (
SELECT *
FROM STARBUCKS_PATTERNS_SAMPLE.PUBLIC.PATTERNS
WHERE region = 'DC'
),
exploded AS (
SELECT
patterns.placekey
,patterns.location_name
,patterns.city
,expl.INDEX as hour
,AS_INTEGER(expl.VALUE) as visits
FROM
patterns,
TABLE(FLATTEN(input => PARSE_JSON(patterns.popularity_by_hour))) as expl
),
core_lat_lon as (
SELECT
STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI.PLACEKEY
,STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI.LATITUDE
,STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI.LONGITUDE
FROM STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI
),
SELECT
exploded.*
,core_lat_lon.LATITUDE
,core_lat_lon.LONGITUDE
FROM exploded
INNER JOIN core_lat_lon
ON exploded.placekey = core_lat_lon.placekey
);
Query 3
CREATE view cbg_visits_income AS (
WITH exploded_census_income AS (
WITH exploded AS (
SELECT
patterns.placekey
,patterns.location_name
,patterns.city
,patterns.region
,expl.KEY as visitor_home_cbg
,expl.VALUE as visitor_count
FROM
(SELECT * FROM STARBUCKS_PATTERNS_SAMPLE.PUBLIC.PATTERNS) as patterns,
TABLE(flatten(input => PARSE_JSON(patterns.VISITOR_HOME_CBGS))) as expl
WHERE region = 'DC'
),
SELECT exploded.*
,B19013e1 as hh_median_income
FROM exploded
JOIN CBG.PUBLIC.DATA_CBG_B19 as cbg
ON exploded.visitor_home_cbg = cbg.census_block_group
),
core_lat_lon AS (
SELECT
STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI.PLACEKEY
,STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI.LATITUDE
,STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI.LONGITUDE
FROM STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI
)
SELECT
exploded_census_income.*
,core_lat_lon.LATITUDE
,core_lat_lon.LONGITUDE
FROM exploded_census_income
INNER JOIN core_lat_lon
ON exploded_census_income.placekey = core_lat_lon.placekey
);
Visualizing in Tableau
After we create these views, we can connect to them in Tableau.
-
For the
visits_by_hour
view, we can connect directly with no further steps. -
For the
cbg_visits_income
view , once connected we must join geometry (the .hyper file) to view on a map. Note - You may need to use a Tableau join calc to cast the Snowflake CBG column as a string.
Your data sets are now ready to build a demo. Here is a sample workbook.
Updated about 2 years ago