Core Places

SafeGraph’s Core Places data provides baseline information for every record in the product suite, including location name, address, lat/long, category, brand, and more. A point of interest (POI) is a specific physical location which someone may find interesting; restaurants, retail stores, and grocery stores are all examples. With POI data for the US, Canada, and Great Britain, you can gain insights about any location that a person can visit aside from private residences.

Core Places Schema

[core_poi.csv]

Base information available for ~9.9MM POIs in the US, Canada, and Great Britain. Option to include permanently closed POIs.

SafeGraph updates the Places dataset every month with the past month's openings and closings and maintains a persistent placekey across releases.

All SafeGraph Places datasets are formatted as delimited CSVs, and can be purchased independently or together. Please reference the SafeGraph Attribute Matrix for a breakdown of which columns are available per product, and reference Column Ordering for specific column orderings per product combination. Otherwise, stand-alone purchases will follow the column ordering listed in the schemas below:

| Column Name | Description | Type | Example |
| --- | --- | --- | --- |
| placekey | Unique and persistent ID tied to this POI. See Placekey for details on placekey design. | String | [email protected] |
| parent_placekey | If place is encompassed by a larger place (e.g. mall, airport), this lists the placekey of the parent place; otherwise null. See more on parent-child relationships in Spatial Hierarchy. | String | [email protected] |
| location_name | The name of the place of interest. | String | Salinas Valley Ford Lincoln |
| safegraph_brand_ids | Unique and consistent ID that represents this specific brand. | List | SG_BRAND_59dcabd7cd2395a2, SG_BRAND_8310c2e3461b8b5a |
| brands | If this POI is an instance of a larger brand that we have explicitly identified, this column will contain that brand name. See more details in brands. | List | Ford, Lincoln |
| store_id | The unique ID associated with the store as provided and maintained by the store/brand itself. Please contact sales to learn more about licensing this premium column. | String | 36558 |
| top_category | The label associated with the first 4 digits of the POI’s NAICS category. | String | Automobile Dealers |
| sub_category | The label associated with all 6 digits of the POI’s NAICS category. For POIs with a 4-digit NAICS category, this column is null. | String | New Car Dealers |
| naics_code | 4-digit or 6-digit NAICS code describing the business. | Integer | 441110 |
| latitude | Latitude coordinate of the place of interest. | Float | 36.714767 |
| longitude | Longitude coordinate of the place of interest. | Float | -121.662912 |
| street_address | Street address of the place of interest. | String | 1100 Auto Center Circle |
| city | The city of the point of interest. | String | Irvine |
| region | The state, province or county of the place of interest. See region for more details. | String | CA |
| postal_code | The postal code of the place of interest. | String | 92602 |
| iso_country_code | The 2 letter ISO 3166-1 alpha-2 country code. Expected values are US, CA, and GB. | String | US |
| phone_number | The phone number of this POI. | String | +14151234567 |
| open_hours | A JSON string with days as keys and opening & closing times (in the POI's local time) as values. See open_hours for more details. | String | { "Mon": [["8:00", "22:00"]], "Tue": [["8:00", "13:00"], ["18:00", "24:00"]], "Wed": [["0:00", "2:00"]], "Thu": [["0:00", "24:00"]], "Fri": [["23:00", "24:00"]], "Sat": [["0:00", "3:00"], ["15:00", "22:30"]], "Sun": [] } |
| category_tags | For POIs with a naics_code starting with 722, we provide an array of descriptive tags indicating higher-resolution category information. See category_tags for more details. | List | [Mexican Food,Casual Dining,Lunch,Dinner] |
| opened_on | The outside year and month this POI opened in yyyy-mm format. If null, then we do not have enough metadata to determine an open date. See the opened_on logic for more details. | String | 2019-10 |
| closed_on | The outside year and month this POI closed in yyyy-mm format. If null, then this POI is open. See the closed_on logic for more details. | String | 2020-03 |
| tracking_closed_since | Indicates the year and month we started tracking "closed_on" for this POI. See the closed_on logic for more details. | String | 2019-07 |
| geometry_type | The geometric shape associated with this POI. Possible values are: 1) POLYGON: POI has a polygon and geometry metadata available. 2) POINT: POI does not have a polygon and does not have Geometry metadata. See geometry_type for more details. | String | POINT |

store_id is a premium column. Please Contact Sales for more details.

Brand Info

[brand_info.csv]

A SafeGraph brand is defined as a logo or branded store which has multiple locations all under the same logo or store banner. For a deep dive on how we think about brands, see our November 2018 Release Notes.

The brand_info file is a separate csv that is complimentary with a core places purchase. See brands for more details.

| Column Name | Description | Type | Example |
| --- | --- | --- | --- |
| safegraph_brand_id | Unique and persistent ID that represents this specific brand. | String | SG_BRAND_59dcabd7cd2395a2 |
| brand_name | This is the brand_name corresponding to the safegraph_brand_id. | String | Ford Motor Company |
| parent_safegraph_brand_id | There are 2 possible values: 1) If this brand has a parent, this will list the ID of the parent brand. 2) If this brand has no parent, this will be null. | String | SG_BRAND_8310c2e3461b8b5a |
| naics_code | 4-digit or 6-digit NAICS code describing the business. | Integer | 441110 |
| top_category | The label associated with the first 4 digits of the POI’s NAICS category. | String | Automobile Dealers |
| subcategory | The label associated with all 6 digits of the POI’s NAICS category. For POIs with a 4-digit NAICS category, this column is null. | String | New Car Dealers |
| stock_symbol | The stock ticker (if the corporation is traded publicly). | String | F |
| stock_exchange | The stock exchange on which this corporation is listed (if the corporation is traded publicly). | String | NYSE |
| iso_country_codes_open | A list of all 2 letter ISO 3166-1 alpha-2 country codes for each country this brand has at least 1 open POI (closed_on is null). | String | ["US", "GB"] |
| iso_country_codes_closed | A list of all 2 letter ISO 3166-1 alpha-2 country codes for each country this brand has at least 1 closed POI (closed_on is not null). | String | ["US", "CA"] |

Key Concepts

Places Scope

Core Places provides baseline information for every record in the SafeGraph product suite. The current scope of a place is defined as any location humans can visit with the exception of single-family homes, multi-family homes, and apartments. This definition encompasses a broad swath of places, ranging from restaurants, grocery stores, and malls to parks, hospitals, museums, offices, and industrial parks.

SafeGraph Core Places and Geometry is currently offered in the US, Canada, and Great Britain. 🇺🇸 🇨🇦 🇬🇧

Note that Great Britain coverage is limited to mainland Great Britain (England, Scotland, and Wales) and does not include the whole of the United Kingdom (it excludes Northern Ireland and the Isles). For Game of Thrones fans, this is to say that we cover Westeros but not Essos nor the Iron Islands 👑.

placekey

Placekey is a unique and persistent identifier for any physical place in the world that intelligently partitions the ID into meaningful encodings. So how does Placekey work?

When both parts of a placekey come together, the final result reads as [email protected]. This is a unique way of shedding light on both the descriptive element of a place as well as its geospatial position in the physical world via a single identifier.

What: Address Encoding
The first three characters refer to the Address Encoding, creating a unique identifier for a given address. An address at “555 Main Street Suite 105” will have a different Address Encoding than “555 Main Street Suite 106.” However, "444 Second Street, Suite 4" will have the same address encoding as "444 2nd St. #4" to adjust for common address formats.

What: POI Encoding
The second set of three characters in the 'What Part' refers to the POI Encoding. If a specific place has a location name (like "Central Park") and is already included in the Placekey reference datasets, these characters will be present. The benefit of the POI Encoding is that it can point to a specific point of interest that may have existed at a certain address at a given point in time.

Where: H3 Encoding
The 'Where Part,' on the other hand, is made up of three unique character sequences, built upon Uber’s open source H3 grid system. This information in the 'Where Part' is based on the centroid of that place. In other words, we take the latitude and longitude of a specific place and then use a conversion function to determine a hexagon in the physical world, representing about 15,000 sq. meters, containing the centroid of that place. The 'Where Part' of the Placekey is, therefore, the full encoding of that hexagon.
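
As a rough illustration of the structure described above, the sketch below splits a Placekey string into its What and Where parts with plain string handling. The function name `split_placekey` and the sample value are hypothetical, and the sketch assumes the encodings inside each part are separated by hyphens. It does not validate characters, decode the H3 hexagon, or call the Placekey API.

```python
def split_placekey(placekey: str) -> dict:
    """Split a Placekey into its What and Where components.

    Assumes the layout "<address>-<poi>@<h3>-<h3>-<h3>", where the POI
    encoding (and even the whole What part) may be absent.
    """
    what, sep, where = placekey.partition("@")
    if not sep:
        raise ValueError(f"not a Placekey (missing '@'): {placekey!r}")
    what_parts = what.split("-") if what else []
    return {
        "address_encoding": what_parts[0] if len(what_parts) >= 1 else None,
        "poi_encoding": what_parts[1] if len(what_parts) >= 2 else None,
        "where": where,
        "h3_sequences": where.split("-"),
    }


# Made-up value used only to show the shape of the result.
parts = split_placekey("abc-def@ghi-jkl-mno")
print(parts["address_encoding"], parts["poi_encoding"], parts["h3_sequences"])
```

Keeping the Where part as its own field can be handy for grouping nearby places, since every place whose centroid falls in the same hexagon shares that suffix.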

Open access to your own datasets using the FREE Placekey API.

Note that Placekeys outside of the US, Canada, Great Britain, and the Netherlands are not yet supported via the Placekey API, and any Placekeys outside of these locations will have zz prepended to the POI encoding portion of the Placekey. These Placekeys will still serve as unique and persistent keys for these POIs.

Brands

SafeGraph curates over 7,700 distinct brands, and the list is growing. These are chains of commercial POIs that include all major brands in the United States, Canada, and Great Britain (McDonald's, AMC, Macy's, Chevrolet, Whole Foods Market, etc.).

Note that ~80% of POIs have no brand associated as they are single commercial locations (local restaurants, museums, etc.). SafeGraph is continually improving the fill rate of brands with each release; please contact us if you notice a brand missing.

Some POIs include multiple brands. For example, a car dealership may sell multiple car brands, or branded POIs may be co-located, such as some Taco Bell and KFC stores, or IMAX and AMC cinemas. In these cases the brands and safegraph_brand_ids are listed as arrays alphabetized by brand name; the order does not indicate importance.

Brands provide an easy way to isolate only major stores. If you know you are searching for a brand that we cover, we advise searching the brands column instead of the location_name column. Even better is to search the brand_info file and build your workflows around safegraph_brand_id.

Every place has a name but only POIs belonging to a chain will have a brand. In certain cases, name and brand will be the same but in other cases, these fields may be different. For example, if you’re searching for all branded McDonald’s (fast-food) stores, you would search for all POI entries where brands = 'mcdonalds'. However, if you search for all POIs where location_name = 'mcdonalds', you could incorrectly return non-branded stores that happen to share the same name.
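
The difference between searching by brand and searching by name can be sketched as follows. The rows and brand IDs below are made up for illustration; only the column names come from the schema. The sketch also assumes that list-valued columns arrive as comma-separated strings, which may not match your delivery format.

```python
import csv
import io

# Hypothetical rows; the SG_BRAND_example_* IDs are not real brand IDs.
SAMPLE_CSV = """placekey,location_name,safegraph_brand_ids,brands
pk-1,McDonald's,SG_BRAND_example_mcd,McDonald's
pk-2,McDonald's Hardware,,
pk-3,Taco Bell / KFC,"SG_BRAND_example_kfc,SG_BRAND_example_tb","KFC,Taco Bell"
"""


def rows_for_brand(rows, brand_id):
    """Return the rows whose safegraph_brand_ids list contains brand_id."""
    matches = []
    for row in rows:
        ids = [b.strip() for b in row["safegraph_brand_ids"].split(",") if b.strip()]
        if brand_id in ids:
            matches.append(row)
    return matches


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Brand-based search: only the POI explicitly tagged with the brand ID.
by_brand = rows_for_brand(rows, "SG_BRAND_example_mcd")

# Name-based search: also picks up the unrelated, non-branded store.
by_name = [r for r in rows if "mcdonald" in r["location_name"].lower()]

print([r["placekey"] for r in by_brand])
print([r["placekey"] for r in by_name])
```

Matching on the ID rather than the brand name also sidesteps spelling and punctuation differences in names.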

If you are having difficulty matching location or brand names to listings of POI that you have, we offer a matching service that will provide you with the placekeys of locations mapped to your existing POI data.

Categorization of POI

SafeGraph Places uses the North American Industry Classification System (NAICS) developed by the US Census Bureau, which consists of a numeric NAICS code up to 6 digits in length. Although this taxonomy was developed in the US, we have found it just as useful for categorizing POIs in other countries as well and will continue to use it until a better alternative presents itself.

The code itself is hierarchical; in other words, the first 2 digits describe a very general category, and additional digits describe more and more specific categories. For example:

  • 72 is the general category Accommodation and Food Services.
  • 722 is the more specific category Food Services and Drinking Places.
  • 7225 is the even more specific category Restaurants and Other Eating Places.
  • 722513 is the most specific category Limited-Service Restaurants (i.e. quick-serve or fast-food restaurants).
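
Because the code is hierarchical, the broader categories of a POI can be recovered by truncating its naics_code. The helper below is a minimal sketch with a hypothetical name; it mirrors only the 2-, 3-, 4-, and 6-digit levels listed above.

```python
def naics_prefixes(naics_code):
    """Return the 2-, 3-, 4- and 6-digit levels contained in a NAICS code."""
    code = str(naics_code)
    return [code[:n] for n in (2, 3, 4, 6) if len(code) >= n]


print(naics_prefixes(722513))  # ['72', '722', '7225', '722513']
print(naics_prefixes(5418))    # ['54', '541', '5418']
```

Grouping on one of these prefixes, for example the 3-digit level, is a simple way to aggregate POIs that carry a mix of 4-digit and 6-digit codes.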

We strive to assign a best fitting naics_code for all of our POIs, but occasionally, our category algorithm is unable to infer a high confidence naics_code based on POI name and other descriptive metadata. In these cases, naics_code is null. Over 99% of SafeGraph POIs have a naics_code - see our summary statistics page for the latest details.

See POI Types for a list of naics_code and associated safegraph_subcategory for common POI types.

In Great Britain, ~85% of POIs have a 4 digit naics_code. We are very confident in our ability to predict 4 digit naics_codes, and we chose to sacrifice the extra digits of precision in exchange for high veracity predictions, and also because the extra precision is often not meaningful. If 6 digit naics_codes are critical to your workflow, please let us know.

Determining when POI Open and Close

opened_on and closed_on dates are determined from metadata at the source level. If a new POI from an existing source repeatedly appears in our build pipeline, it is flagged as opened_on during the month in which it first appears. Similarly, if a POI from an existing source repeatedly disappears in our build pipeline, it is flagged as closed_on during the month in which it first disappears. These flags are added to the Places product permitting final QA checks and overall data hygiene.

Temporary closures are not captured in open/close tracking, and it became difficult to distinguish permanent closures from temporary closures at the onset of COVID-19. This resulted in a relatively low count of POIs with closed_on values between "2020-03" and "2020-06" as we erred towards the side of caution to not mistakenly mark temporarily closed businesses as permanently closed.

If a POI has not yet been sourced consistently enough to provide the metadata needed to determine closed_on dates, then it will have a null value in the tracking_closed_since column. In general, the SafeGraph Places product tracks opened_on and closed_on dates from as early as 2019-07 onward, and therefore, the majority of POIs that have a tracking_closed_since date will show a value of "2019-07."

Please note that closed_on values are over-indexed on "2020-01" as January 2020 was the first Places release featuring the open/close columns. At that time, only branded POIs (POIs with a safegraph_brand_id) contained enough metadata to determine a true store closure during that month. A "2020-01" closed_on value on a non-branded POI implies that the POI closed sometime before January 2020, but we do not have enough metadata history to determine the exact yyyy-mm. All other closed_on values are precise to within a margin of error of less than 60 days.
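
The rules above can be combined into a small status helper. This is a sketch under stated assumptions: the function name and the status labels are hypothetical, rows are plain dictionaries keyed by column name, and nulls arrive as None or empty strings.

```python
def closure_status(poi: dict) -> str:
    """Summarize a POI's open/closed state from the Core Places columns."""
    closed_on = poi.get("closed_on") or None
    tracking = poi.get("tracking_closed_since") or None
    branded = bool(poi.get("safegraph_brand_ids"))

    if closed_on is None:
        # A null closed_on means open; a null tracking_closed_since means
        # there is not yet enough metadata to detect a closure.
        return "open" if tracking else "open (closures not yet tracked)"
    if closed_on == "2020-01" and not branded:
        # First release with closed_on: the exact month is unknown.
        return "closed in or before 2020-01"
    return f"closed {closed_on}"


print(closure_status({"closed_on": None, "tracking_closed_since": "2019-07"}))
print(closure_status({"closed_on": "2020-01", "safegraph_brand_ids": ""}))
print(closure_status({"closed_on": "2020-03", "tracking_closed_since": "2019-07"}))
```

Treating non-branded "2020-01" closures as a separate bucket avoids an artificial spike in January 2020 when counting closures by month.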

The opened_on, closed_on and tracking_closed_since columns are specific to Core Places. These are not available in stand-alone Geometry or Patterns purchases. If Core is purchased in combination with Geometry and/or Patterns, the Geometry and Patterns specific fields will be null for any POIs with a closed_on date. Please reference Column Ordering for details on where these columns exist per product combination.

Column Name Detailed Descriptions

placekey

Placekey is a unique and persistent identifier for any physical place in the world that intelligently partitions the ID into meaningful encodings. See the Placekey key concept for a detailed description.

top_category, sub_category, naics_code

top_category and sub_category are the string labels associated with the first 4 digits and 6 digits of naics_code, respectively. See Categorization of POI for details.

latitude, longitude

  • In general, latitude and longitude are defined by our best knowledge of the POI location. They are not designed to specifically locate the front door of the business, but rather define the general center of the business.
  • Latitude and longitude still attempt to identify the individual business even if that business and others have the same polygon (e.g. strip mall).

street_address

  • We implement a number of steps to clean, validate and standardize street_address.
  • You should expect street_address to be title-cased, consistent, and friendly for human reading. Please send us your feedback if you see otherwise.
  • If you care about street addresses as much as we do, we also have more specific address columns to split out address components. These are optional and available upon request for future deliveries.
    • primary_number
    • street_predirection
    • street_name
    • street_postdirection
    • street_suffix

city

  • In the US, all centroids (latitudes/longitudes) are referenced against a geospatial file of city boundaries as defined by the US Census Bureau (browse the boundaries here). In edge cases, the preferred city name in the address line reflects a pre-annexed city name, and we try our best to preserve those city names where possible.
  • In Canada, city names are the output of normalized address strings from POI sources.
  • In Great Britain, city names are the output of normalized address strings from POI sources, but in edge cases, we allow POIs to have a null city name as long as region is populated. The region column in Great Britain refers to county boundaries, and counties are a decent alternative to cities for geographic filtering.

region

  • When iso_country_code == US, then this is the US state or territory.
  • When iso_country_code == CA, then this is the Canadian Province or territory.
  • When iso_country_code == GB, then this is the United Kingdom county.

postal_code

phone_number

This is a 10 digit phone number in the US and Canada or a 12 digit phone number in Great Britain. We filter out toll-free numbers (e.g. 1-800) and strive to have POI-specific numbers (not franchise-level or corporate-level numbers).

open_hours

The new format for open hours is a JSON string with days as keys and opening & closing times (in the POI's local time) as values.

  • Each JSON string is guaranteed to have all 7 days as keys
  • We indicate that a POI is closed for the day by giving it a value of "[]"
  • We indicate that a POI is open the entire day by using a format like:
    • "Thu": [["0:00", "24:00"]]
  • For POI that open and close multiple times throughout the day (e.g. a restaurant open in the morning and evening but not midday), we list multiple opening/closing pairs. For example:
    • "Sat": [["8:00", "13:00"], ["15:00", "22:30"]]
    • This indicates that a POI is open from 8 am to 1 pm and also from 3 pm to 10:30 pm on Saturday.
  • For POI that open and close on different days (e.g. a bar which opens on Tuesday at 6 pm and closes on Wednesday at 2 am), we use a format like:
    • "Tue": [["18:00", "24:00"]], "Wed": [["0:00", "2:00"]]
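
The format above can be read with a standard JSON parser. The sketch below checks whether a POI is open at a given local time; the function names are hypothetical, the sample string is the example from the schema table, and each interval is treated as including its opening time but not its closing time, which is an assumption rather than documented behavior.

```python
import json

# Example open_hours value from the Core Places schema.
EXAMPLE = (
    '{"Mon": [["8:00", "22:00"]], '
    '"Tue": [["8:00", "13:00"], ["18:00", "24:00"]], '
    '"Wed": [["0:00", "2:00"]], "Thu": [["0:00", "24:00"]], '
    '"Fri": [["23:00", "24:00"]], '
    '"Sat": [["0:00", "3:00"], ["15:00", "22:30"]], "Sun": []}'
)


def to_minutes(hhmm: str) -> int:
    """Convert "H:MM" or "HH:MM" (including "24:00") to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def is_open(open_hours: str, day: str, time: str) -> bool:
    """Return True if `time` falls inside any opening/closing pair for `day`."""
    schedule = json.loads(open_hours)
    now = to_minutes(time)
    return any(
        to_minutes(start) <= now < to_minutes(end)
        for start, end in schedule.get(day, [])
    )


print(is_open(EXAMPLE, "Tue", "19:30"))  # inside the evening interval
print(is_open(EXAMPLE, "Tue", "14:00"))  # midday gap
print(is_open(EXAMPLE, "Sun", "12:00"))  # closed all day
```

Since a late-night closing is split across two days, the Wednesday 1:00 am check succeeds through the Wednesday entry without any special overnight handling.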

category_tags

  • category_tags provide higher-granularity category information and are currently available for POI belonging to NAICS "Food Services and Drinking Places" (i.e. the first three digits of the naics_code are 722).
  • Category information is conveyed as a list of descriptive words about the POI, e.g. [Mexican Food, Dinner]
  • Here is the full list of possible tags. There is no constraint on how many descriptors (tags) a POI can have; SafeGraph strives to label all relevant tags for every POI.

geometry_type

This is the geometric shape associated with the POI, where possible values are "POLYGON" or "POINT". This is meant to distinguish traditional SafeGraph places which have Geometry ("POLYGON") from places that intentionally do not have Geometry ("POINT"). These places are mostly comprised of transit stops, major ATM brands, kiosks, and electric vehicle charging stations and will not have Patterns data attributed to them. We are working to expand geographic coverage in each of these categories over the next few months.

store_id

store_id is the unique ID associated with a store as provided and maintained by the store/brand itself. This is a premium column only applicable to places with a safegraph_brand_id. Most store_ids can be found directly on store locators, but in some cases, the store_id is embedded within the store locator URL for the specific store. Note that there is no single source of truth for store_ids, and some first party datasets may not define store_id in the same manner as SafeGraph does, but we strive to provide the most widely used concept of store_id.

Typically, store_ids are alphanumeric codes unique to each location. However, they are not always alphanumeric. For example, the store number for this store is 1615.

store_id is especially useful as a join key when working with transaction data. For example, "TJ256Y8" may be the only location-specific information within a transaction dataset. A Places dataset which also contains "TJ256Y8" as a store_id enables a join to contextualize transaction data (or other internal, store-level data) with SafeGraph places information.
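
A join of this kind might look like the sketch below. Apart from the "TJ256Y8" store_id mentioned above, every value is made up, and the transaction layout (a store_id and an amount) is a hypothetical stand-in for whatever your internal data contains.

```python
# Hypothetical Places rows; column names follow the Core Places schema.
places = [
    {"placekey": "pk-1", "store_id": "TJ256Y8",
     "location_name": "Example Store A", "city": "Irvine"},
    {"placekey": "pk-2", "store_id": "1615",
     "location_name": "Example Store B", "city": "Salinas"},
]

# Hypothetical transaction rows keyed only by store_id.
transactions = [
    {"store_id": "TJ256Y8", "amount": 12.50},
    {"store_id": "TJ256Y8", "amount": 7.25},
    {"store_id": "UNKNOWN1", "amount": 3.00},
]

# Index Places by store_id, then attach place attributes to each transaction.
places_by_store = {p["store_id"]: p for p in places}

enriched = []
for txn in transactions:
    place = places_by_store.get(txn["store_id"])
    if place is None:
        continue  # no matching POI; keep or log these separately as needed
    enriched.append({**txn, "placekey": place["placekey"], "city": place["city"]})

for row in enriched:
    print(row)
```

Because store_ids are maintained by each brand, two brands could plausibly reuse the same value; when a transaction dataset also identifies the brand, joining on the pair of safegraph_brand_id and store_id is likely the safer choice.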

Column Ordering

Reference this sheet for specific column orders when licensing various product combinations.

Known Data Issues or Artifacts

closed_on First Featured

Please note that closed_on values are over-indexed on "2020-01" as January 2020 was the first Places release featuring the open/close columns. At that time, only branded POIs (POIs with a safegraph_brand_id) contained enough metadata to determine a true store closure during that month. A "2020-01" closed_on value on a non-branded POI implies that the POI closed sometime before January 2020, but we do not have enough metadata history to determine the exact yyyy-mm. All other closed_on values are precise to within a margin of error of less than 60 days.

naics_code Consolidation

As of the August-2020 Release Notes, we consolidated a handful of 6 digit naics_codes into 4 digit naics_codes in the US and Canada in cases where the 6 digit naics_code is too obscure to distinguish from an adjacent 6 digit naics_code in the same family. In these cases, the "sub_category" column is null.

  • An example of the 6 digit naics_codes we consolidated is "Advertising Agencies" (541810) and "Other Services Related to Advertising" (541890). Instead of maintaining these as separate 6 digit naics_codes that are not meaningfully differentiated, we assigned all POIs fitting either description to the 4 digit naics_code "Advertising, Public Relations, and Related Services" (5418).

A grand total of twenty 6 digit naics_codes consolidated to nine 4-digit naics_codes. It’s important to note that just because a 4 digit naics_code exists in our model does not guarantee it will show up in our data. For example, it is possible that zero POIs are assigned to naics_code = 3169 (Other Leather and Allied Product Manufacturing). However, 3169 is a possible naics_code value if a POI fitting that description exists.

A complete mapping of the 6 digit to 4 digit changes can be found below:

| New naics_code (top_category) | Old naics_code (sub_category) |
| --- | --- |
| 3169 (Other Leather and Allied Product Manufacturing) | 316992 (Women's Handbag and Purse Manufacturing), 316998 (All Other Leather Good and Allied Product Manufacturing) |
| 3231 (Printing and Related Support Activities) | 323111 (Commercial Printing (except Screen and Books)), 323113 (Commercial Screen Printing), 323117 (Books Printing) |
| 3369 (Other Transportation Equipment Manufacturing) | 336991 (Motorcycle, Bicycle, and Parts Manufacturing), 336999 (All Other Transportation Equipment Manufacturing) |
| 3399 (Other Miscellaneous Manufacturing) | 339910 (Jewelry and Silverware Manufacturing) |
| 5324 (Commercial and Industrial Machinery and Equipment Rental and Leasing) | 532412 (Construction, Mining, and Forestry Machinery and Equipment Rental and Leasing), 532490 (Other Commercial and Industrial Machinery and Equipment Rental and Leasing) |
| 5416 (Management, Scientific, and Technical Consulting Services) | 541611 (Administrative Management and General Management Consulting Services), 541613 (Marketing Consulting Services), 541618 (Other Management Consulting Services), 541690 (Other Scientific and Technical Consulting Services) |
| 5418 (Advertising, Public Relations, and Related Services) | 541810 (Advertising Agencies), 541890 (Other Services Related to Advertising) |
| 6233 (Continuing Care Retirement Communities and Assisted Living Facilities for the Elderly) | 623311 (Continuing Care Retirement Communities), 623312 (Assisted Living Facilities for the Elderly) |
| 7111 (Performing Arts Companies) | 711110 (Theater Companies and Dinner Theaters), 711130 (Musical Groups and Artists), 711130 (Musical Groups and Artists) |

Additionally, a total of four 6 digit naics_codes migrated to different 6 digit naics_codes. See below for those changes:

| New naics_code (sub_category) | Old naics_code (sub_category) |
| --- | --- |
| 447110 (Gasoline Stations with Convenience Stores) | 447190 (Other Gasoline Stations) |
| 488190 (Other Support Activities for Air Transportation) | 488119 (Other Airport Operations) |
| 493110 (General Warehousing and Storage) | 493190 (Other Warehousing and Storage) |
| 611519 (Other Technical and Trade Schools) | 611512 (Flight Training) |
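
When comparing files from before and after this change, it may help to translate old codes to their current values. The sketch below transcribes the distinct codes from the two mappings above into lookup structures; the names `MIGRATED`, `CONSOLIDATED`, and `current_naics` are hypothetical, and the lookup is only as complete as the tables it was copied from.

```python
# 6-digit codes that moved to a different 6-digit code (old -> new).
MIGRATED = {
    447190: 447110,
    488119: 488190,
    493190: 493110,
    611512: 611519,
}

# 6-digit codes that were consolidated into their 4-digit parent.
CONSOLIDATED = {
    316992, 316998, 323111, 323113, 323117, 336991, 336999, 339910,
    532412, 532490, 541611, 541613, 541618, 541690, 541810, 541890,
    623311, 623312, 711110, 711130,
}


def current_naics(old_code: int) -> int:
    """Map a pre-consolidation naics_code to its current value."""
    if old_code in MIGRATED:
        return MIGRATED[old_code]
    if old_code in CONSOLIDATED:
        return old_code // 100  # keep only the first 4 digits
    return old_code


print(current_naics(541810))  # 5418
print(current_naics(447190))  # 447110
print(current_naics(722513))  # 722513 (unchanged)
```

Codes not listed in either mapping pass through unchanged, so the helper can be applied to an entire naics_code column.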

safegraph_place_id, parent_safegraph_place_id, tracking_opened_since columns dropped

As of the July-2021 Release Notes, safegraph_place_id and parent_safegraph_place_id were dropped, and placekey and parent_placekey are referenced moving forward. tracking_opened_since was also dropped because the column was redundant. If a POI has an opened_on value, it implies we've been tracking it since that date. If a POI does not have an opened_on value, it implies we were not able to track the exact date it opened.
