Using SafeGraph Patterns With Snowflake + Tableau

SafeGraph Core Places, Patterns, and Open Census Data data are available in Snowflake!

This page provides a simple demo to get you started with working with these datasets in Snowflake. We'll be using the Starbucks sample in Snowflake for this demo. These are free listings on Snowflake's Data Marketplace that represent foot traffic for U.S. Starbucks locations

Snowflake Tutorial: Exploding JSON Columns in SafeGraph data

SafeGraph Core and Patterns datasets contain JSON columns. JSON columns are rich in information, but they need to be exploded to play well in SQL. We will explore how to explode these JSON columns, so that each key-value pair becomes its own row. For more information on handling JSON data, see "How do I work with the Patterns columns that contain JSON?" .

One example of a JSON column is popularity_by_hour . This column breaks down visitations by hour for a given POI in a given month. In the JSON, the 'value' is 'visits' but it has no 'key'. We therefore must use the index to indicate the 'hour'.

The below SQL snippet uses Starbucks Patterns: we will explode the popularity_by_hour column for two POI, which we identify and filter by their placekeys (i.e. unique identifier).

Query 1

WITH patterns AS (
   SELECT *
   FROM STARBUCKS_PATTERNS_SAMPLE.PUBLIC.PATTERNS
   WHERE placekey in ('[email protected]','[email protected]')
 ),
 exploded AS (
   SELECT
     patterns.placekey
     ,patterns.location_name
     ,patterns.city
     ,expl.INDEX as hour
       ,expl.VALUE as visits
   FROM
     patterns,
     TABLE(FLATTEN(input => PARSE_JSON(patterns.popularity_by_hour))) as expl
 )
 SELECT *
 FROM exploded

The result set will be 48 rows, representing one row per given POI per given month per hour of the day (2 stores x 1 month x 24 hours).

Demo: Comparison of trade areas for Starbucks locations in DC -- enriched with US Census income data.

For individual store locations, trade area analyses examines the geographical area in which the location transacts business, or its "commercial territory". We can use SafeGraph Patterns data to measure trade area based on real world mobility data.

In contrast to typical methods for measuring trade area based on distance or travel time (isochrone), using Patterns data in this way shows what areas are actually being serviced by a specific store. Mobility data in SafeGraph is aggregated to the Census Block Group (CBG) level, so we are able to join in CBG-level data from the US Census (e.g. median household income).

In this example, we will pull SafeGraph data from Snowflake (using the Starbucks Sample data) and then connect Tableau. First, we will run two queries in Snowflake. Then, we will connect these two queries in Tableau (as views, instead of custom SQL) . Finally, we will join in a .hyper file for the Census Block Group geometries for visualization.

Query 2

CREATE view visits_by_hour AS (
   WITH patterns AS (
     SELECT *
     FROM STARBUCKS_PATTERNS_SAMPLE.PUBLIC.PATTERNS
     WHERE region = 'DC'
   ),
   exploded AS (
     SELECT
       patterns.placekey
       ,patterns.location_name
       ,patterns.city
       ,expl.INDEX as hour
       ,AS_INTEGER(expl.VALUE) as visits
     FROM
       patterns,
       TABLE(FLATTEN(input => PARSE_JSON(patterns.popularity_by_hour))) as expl
   ),
   core_lat_lon as (
     SELECT
       STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI.PLACEKEY
       ,STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI.LATITUDE
       ,STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI.LONGITUDE
     FROM STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI
   ),
   SELECT 
     exploded.*
     ,core_lat_lon.LATITUDE
     ,core_lat_lon.LONGITUDE
   FROM exploded 
   INNER JOIN core_lat_lon 
   ON exploded.placekey = core_lat_lon.placekey
 );

Query 3

CREATE view cbg_visits_income AS (
  WITH exploded_census_income AS (
    WITH exploded AS (
      SELECT
        patterns.placekey
        ,patterns.location_name
        ,patterns.city
        ,patterns.region
        ,expl.KEY as visitor_home_cbg
        ,expl.VALUE as visitor_count
      FROM
        (SELECT * FROM STARBUCKS_PATTERNS_SAMPLE.PUBLIC.PATTERNS) as patterns,
        TABLE(flatten(input => PARSE_JSON(patterns.VISITOR_HOME_CBGS))) as expl
      WHERE region = 'DC'
    ),
    SELECT exploded.*
                 ,B19013e1 as hh_median_income
    FROM exploded
    JOIN CBG.PUBLIC.DATA_CBG_B19 as cbg
    ON exploded.visitor_home_cbg = cbg.census_block_group
  ),
  core_lat_lon AS (
    SELECT
      STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI.PLACEKEY
      ,STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI.LATITUDE
      ,STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI.LONGITUDE
    FROM STARBUCKS_CORE_PLACES.PUBLIC.CORE_POI
  )
  SELECT 
    exploded_census_income.*
    ,core_lat_lon.LATITUDE
    ,core_lat_lon.LONGITUDE
  FROM exploded_census_income 
  INNER JOIN core_lat_lon 
  ON exploded_census_income.placekey = core_lat_lon.placekey
);

Visualizing in Tableau

After we create these views, we can connect to them in Tableau.

  • For the visits_by_hour view, we can connect directly with no further steps.

  • For the cbg_visits_income view , once connected we must join geometry (the .hyper file) to view on a map. Note - You may need to use a Tableau join calc to cast the Snowflake CBG column as a string.

Your data sets are now ready to build a demo. Here is a sample workbook.


Did this page help you?