Match Service is a FREE offering from SafeGraph that lets you:
- deduplicate your own dataset
- match two unrelated POI datasets (just get placekeys for both sets and join)
- enrich your data with SafeGraph's data (we live for clean data)
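
As a rough illustration of the first two use cases, the sketch below joins two already-matched datasets on their placekey column and finds rows that share a placekey within one dataset. It assumes each dataset has been loaded as a list of dicts (for example with `csv.DictReader`) and that the column is named `placekey`. The function names, the `left_`/`right_` prefixes, and the `pk-a` style values are made up for the example and are not real Placekeys or part of the service.

```python
from collections import defaultdict


def join_on_placekey(left_rows, right_rows, key="placekey"):
    """Inner-join two lists of dict rows on a shared placekey column."""
    right_by_key = defaultdict(list)
    for row in right_rows:
        if row.get(key):  # skip unmatched rows with no placekey
            right_by_key[row[key]].append(row)

    joined = []
    for left in left_rows:
        for right in right_by_key.get(left.get(key), []):
            # Prefix the non-key columns so names from the two files cannot collide.
            merged = {f"left_{k}": v for k, v in left.items() if k != key}
            merged.update({f"right_{k}": v for k, v in right.items() if k != key})
            merged[key] = left[key]
            joined.append(merged)
    return joined


def duplicate_groups(rows, key="placekey"):
    """Return {placekey: [row indexes]} for placekeys that appear more than once."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        if row.get(key):
            groups[row[key]].append(index)
    return {k: idx for k, idx in groups.items() if len(idx) > 1}


# Placeholder values, not real Placekeys.
mine = [{"placekey": "pk-a", "name": "Cafe"}, {"placekey": "", "name": "Unmatched"}]
theirs = [{"placekey": "pk-a", "brand": "X"}]
print(join_on_placekey(mine, theirs))
```

Rows without a placekey are left out of both results, since an empty key says nothing about whether two records refer to the same place.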
How It Works
- Swing by our Data Bar and select Match from the menu.
- Upload your CSV. Please make sure your CSV has headers.
  Note: if you have a file format other than CSV or your file is larger than 3 MB, contact us and we'll help you out.
- Once you have uploaded your file, we will try to match your header names to ours auto-magically. If our magic is having an off day (hey, even Dumbledore wasn't perfect), you can drag and drop the headers to fix any mistakes.
- We will match your data to ours and show you a preview.
- You can download your matched data with placekeys appended for free. You can also purchase our additional POI attributes (including polygons, visitation info, or good ol' normalized addresses) and we will append those columns to your file.
Currently, the name and address features do most of the heavy lifting in our match algorithm. If your file is missing both of those features, the match rate will likely be low. We are working on improving this and appreciate any feedback you have.
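
Before uploading, it can help to check a file locally against the requirements above. The following is a minimal sketch, not part of the service: it treats "3 MB" as 3 × 1024 × 1024 bytes (an assumption), reads the first CSV row as the header, and takes the name and address header names as parameters because the header names used in your file are up to you.

```python
import csv
import os

# Assumption: the 3 MB upload limit is interpreted as 3 MiB.
MAX_UPLOAD_BYTES = 3 * 1024 * 1024


def preflight_check(path, name_header, address_header):
    """Return a list of human-readable problems found in a CSV before upload."""
    problems = []
    if os.path.getsize(path) > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 3 MB")

    with open(path, newline="", encoding="utf-8") as handle:
        header = next(csv.reader(handle), [])
    cleaned = [h.strip() for h in header]

    if not cleaned or any(h == "" for h in cleaned):
        problems.append("file is empty or has blank column names")
    if len(set(cleaned)) != len(cleaned):
        problems.append("header row has duplicate column names")
    # Name and address drive most of the matching, so flag files that lack both.
    if name_header not in cleaned and address_header not in cleaned:
        problems.append("neither a name nor an address column was found")
    return problems
```

An empty list means none of these checks failed. The check cannot tell a real header row from a first row of data, so it only catches files that are empty or have blank or repeated column names.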
Match File Column Ordering
- The matched file returns all customer-submitted columns in the same order as uploaded, but with "customer_" prepended to each column name. The customer-submitted fields are followed by a few SafeGraph metadata columns that describe the match results:
Column Name | Description | Type | Example |
---|---|---|---|
other_match_candidates | A list of other possible match candidates given input information. | List | sg:64d0ee4695af4ab4906fe82997ead9fx, sg:64d0ee4695af4ab4906fe82997ead9fz |
number_of_candidate_matches | The number of all possible matches given input information, including the primary match provided. | Integer | 3 |
warnings | The deficiency causing a match failure, followed by the number of occurrences. Possible values are: INSUFFICIENT_MATCHING_FIELDS, MISSING_STREET_ADDRESS, MISSING_NAME, MISSING_POSTAL_CODE, MISSING_CITY, MISSING_STATE. | JSON {String: Integer} | "MISSING_CITY": 12, "MISSING_STATE": 9 |
is_closed | Indicates if the matching POI is flagged as closed. If TRUE and core is selected, only core columns are populated. If TRUE and core is not selected, the "safegraph_place_id" column will read "closed_poi" and the rest of the columns are null. If FALSE, the POI is open and all columns are populated as expected. For more information on closed POIs, please reference the Places Manual. | Boolean | FALSE |
- The column order following placekey (all match files include placekey) will vary based on the products selected. For Core, Geometry, or Patterns stand-alone downloads, please reference the column ordering in the Places Schema. For multiple product combinations, please reference the Column Ordering section in the Places Manual.
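
To show how the column layout described above might be handled downstream, here is a small sketch that separates the "customer_" columns from the SafeGraph columns and flags rows worth a manual look. It assumes the matched file is read with `csv.DictReader` and that `is_closed` and `number_of_candidate_matches` are serialized as in the Example column (`FALSE`, `3`). The helper names, the sample rows, and the `pk-1` style values are invented for illustration.

```python
import csv
import io

CUSTOMER_PREFIX = "customer_"


def split_matched_row(row):
    """Split one matched-file row into (original customer fields, SafeGraph fields)."""
    customer = {
        k[len(CUSTOMER_PREFIX):]: v
        for k, v in row.items()
        if k.startswith(CUSTOMER_PREFIX)
    }
    safegraph = {k: v for k, v in row.items() if not k.startswith(CUSTOMER_PREFIX)}
    return customer, safegraph


def needs_review(safegraph):
    """True when the matched POI is closed or more than one candidate was found."""
    closed = (safegraph.get("is_closed") or "").strip().upper() == "TRUE"
    count_text = (safegraph.get("number_of_candidate_matches") or "").strip()
    ambiguous = count_text.isdigit() and int(count_text) > 1
    return closed or ambiguous


# Made-up sample; the column order of a real file depends on the products selected.
SAMPLE = """customer_name,customer_address,placekey,number_of_candidate_matches,is_closed
Cafe A,1 Main St,pk-1,1,FALSE
Cafe B,2 Main St,pk-2,3,FALSE
Cafe C,3 Main St,pk-3,1,TRUE
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
flagged = [
    split_matched_row(r)[0]["name"]
    for r in rows
    if needs_review(split_matched_row(r)[1])
]
print(flagged)  # ['Cafe B', 'Cafe C']
```

Only the leading "customer_" is stripped, so an uploaded column that was itself called `customer_id` comes back as `customer_id` rather than `id`.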
Match File Stats
- A summary statistics table [stats.csv] is also returned for each matched file and reports the following metrics:
  - total number of rows
  - total number of matches
  - match rate
  - number of matches with more than one match candidate
  - percentage of matches with more than one match candidate
  - number of duplicate entries in the original file
  - fill rate per column
  - warning counts
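
If you want to sanity-check these numbers against your own download, most of them can be approximated from the matched file itself. The sketch below is one possible reading of the metrics, not the service's actual definitions. It assumes a row counts as matched when its placekey is non-empty, that match rate is matches divided by total rows, that a duplicate is a row whose "customer_" fields all repeat an earlier row, and that fill rate is the share of non-empty values in a column.

```python
from collections import Counter

CUSTOMER_PREFIX = "customer_"


def _filled(value):
    """True for a non-empty, non-whitespace cell."""
    return bool((value or "").strip())


def summarize_matches(rows, key="placekey"):
    """Approximate the stats.csv metrics from a list of matched-file dict rows."""
    total = len(rows)
    matched = [r for r in rows if _filled(r.get(key))]
    multi = [
        r for r in matched
        if (r.get("number_of_candidate_matches") or "").strip().isdigit()
        and int(r["number_of_candidate_matches"]) > 1
    ]
    # Assumption: duplicates are rows with identical customer-submitted fields.
    originals = Counter(
        tuple(sorted((k, v) for k, v in r.items() if k.startswith(CUSTOMER_PREFIX)))
        for r in rows
    )
    duplicates = sum(n - 1 for n in originals.values() if n > 1)
    columns = list(rows[0].keys()) if rows else []
    return {
        "total_rows": total,
        "matches": len(matched),
        "match_rate": len(matched) / total if total else 0.0,
        "multi_candidate_matches": len(multi),
        "multi_candidate_share": len(multi) / len(matched) if matched else 0.0,
        "duplicate_entries": duplicates,
        "fill_rate": {
            c: sum(1 for r in rows if _filled(r.get(c))) / total for c in columns
        },
    }
```

Warning counts are left out because the exact serialization of the warnings column is not assumed here.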
Match File Names
Below is a breakdown of matched file names per product combination downloaded from the SafeGraph match service:
- SGPID only: sgpid.csv
- Core: core_poi-sgpid.csv
- Core + Geometry: core_poi-geometry-sgpid.csv
- Core + Geometry + Patterns: core_poi-geometry-patterns-sgpid.csv
- Geometry: geometry-sgpid.csv
- Patterns: patterns-sgpid.csv
- Core + Patterns: core_poi-patterns-sgpid.csv
- Geometry + Patterns: geometry-patterns-sgpid.csv
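
The names listed above follow a visible pattern: the selected products in the fixed order core_poi, geometry, patterns, then sgpid, joined with hyphens. A small sketch of that pattern, useful for predicting which file to look for in a script, might be the following. The function name and the product identifiers as Python strings are assumptions for the example, and the pattern is inferred from the list rather than documented.

```python
# Order in which product names appear in the file names listed above.
PRODUCT_ORDER = ("core_poi", "geometry", "patterns")


def matched_file_name(products=()):
    """Build the expected matched-file name for a set of selected products."""
    unknown = set(products) - set(PRODUCT_ORDER)
    if unknown:
        raise ValueError(f"unknown products: {sorted(unknown)}")
    parts = [p for p in PRODUCT_ORDER if p in products]
    return "-".join(parts + ["sgpid"]) + ".csv"


print(matched_file_name())                        # sgpid.csv
print(matched_file_name({"patterns", "core_poi"}))  # core_poi-patterns-sgpid.csv
```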