Known Issues or Data Artifacts

Known Data Issues or Artifacts

We strive for full transparency about any known data issues that may affect your analysis and are not otherwise covered in the monthly release notes. If you notice a problem that is not listed here, please send us your observations so we can investigate.

Year to Date Known Issues

Date Reported

Description

Discussion/Links

Resolved?

11/29/2021

There was an increase in the number of devices in SafeGraph's Patterns panel starting roughly 11/16/2021, with the largest increases in WA state (+20%). In the raw, non-normalized data, this appears as an increase in visitors.

We recommend applying panel normalization to account for the changes in panel size if your application is sensitive to visits during this time.

Not necessary since this is normal behavior.
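As a minimal sketch of what panel normalization can look like, the example below scales raw visits to a fixed reference panel size. It assumes you have a device count for the panel in the same region and period (for example from a home panel summary file). The function name, field names, and sample numbers are hypothetical, and this is one simple normalization scheme rather than an official SafeGraph method.

```python
def normalize_visits(raw_visits, panel_devices, reference_devices):
    """Scale raw visits to what they would be at a fixed reference panel size."""
    if panel_devices <= 0:
        raise ValueError("panel_devices must be positive")
    return raw_visits * reference_devices / panel_devices


# Hypothetical numbers: the panel grows 20% between two weeks while real
# foot traffic stays flat, so raw visits rise by the same 20%.
weeks = [
    {"week": "2021-11-08", "raw_visits": 1000, "panel_devices": 50_000},
    {"week": "2021-11-15", "raw_visits": 1200, "panel_devices": 60_000},
]
reference = weeks[0]["panel_devices"]
normalized = [
    normalize_visits(w["raw_visits"], w["panel_devices"], reference) for w in weeks
]
print(normalized)
```

After normalization the two weeks are directly comparable, because the change in panel size no longer shows up as a change in visits.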

11/18/2021

A few parks in Seattle have highly anomalous visits in the July 2021 Backfill of Monthly and Weekly Patterns during a few historical weeks.

This was confirmed to be isolated to just a few POIs and not to be due to Geometry errors. See discussion in Community here.

No, and unlikely to be. Most likely the cause is a temporary sink of anonymized lat/long pings in these locations that slipped through QA. Our recommendation is to remove these outliers and impute the missing weeks.
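The recommendation to remove outliers and impute the missing weeks could be sketched as below. This is only one possible approach: it flags weeks by their distance from the series median in units of the median absolute deviation (MAD), then fills flagged weeks by linear interpolation. The threshold, function names, and sample series are assumptions for illustration.

```python
from statistics import median


def flag_outliers(values, threshold=5.0):
    """Mark values whose distance from the median exceeds threshold * MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [abs(v - med) / mad > threshold for v in values]


def impute_linear(values, is_outlier):
    """Replace flagged values by interpolating between the nearest kept neighbors."""
    kept = [i for i, bad in enumerate(is_outlier) if not bad]
    if not kept:
        raise ValueError("every value was flagged; nothing to interpolate from")
    result = list(values)
    for i, bad in enumerate(is_outlier):
        if not bad:
            continue
        left = max((k for k in kept if k < i), default=None)
        right = min((k for k in kept if k > i), default=None)
        if left is None:
            result[i] = values[right]  # no earlier neighbor: copy the next kept value
        elif right is None:
            result[i] = values[left]  # no later neighbor: copy the previous kept value
        else:
            frac = (i - left) / (right - left)
            result[i] = values[left] + frac * (values[right] - values[left])
    return result


# Hypothetical weekly visits for one park, with one anomalous week.
weekly_visits = [120, 130, 125, 4000, 135, 128, 140]
flags = flag_outliers(weekly_visits)
cleaned = impute_linear(weekly_visits, flags)
print(cleaned)
```

A median-based rule is used here because a single extreme week would distort a mean and standard deviation computed over the same series.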

9/14/2021

A bug caused visits to large Golf Courses and Country Clubs (naics_code=713910) and Amusement and Theme Parks (naics_code=713110) to be inflated between April and August 2021, inclusive, including data in the July 2021 backfill. The same bug caused median dwell time to be lower than it should have been for the same months.

Note that visitors were not affected, just visits.

This affected ~17k POIs in Patterns (0.3%).

The bug primarily affected "large" POIs, in this case POIs in the two categories over a certain square footage, although not all such POIs were impacted.

Visits and median dwell for these POIs were corrected for September 2021 data and onward.
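If you want to treat the affected period with caution, a small helper like the one below can flag rows in the two categories and the April to August 2021 window. It is a sketch only: the function name is hypothetical, and it assumes each row carries a NAICS code and the start date of its Patterns period. Because not every POI in these categories was impacted, the flag marks candidates rather than confirmed errors.

```python
from datetime import date

AFFECTED_NAICS = {"713910", "713110"}
AFFECTED_START = date(2021, 4, 1)
AFFECTED_END = date(2021, 8, 31)


def is_possibly_inflated(naics_code, period_start):
    """True when a row falls in the affected categories and months."""
    return (
        str(naics_code) in AFFECTED_NAICS
        and AFFECTED_START <= period_start <= AFFECTED_END
    )


print(is_possibly_inflated("713910", date(2021, 6, 1)))
print(is_possibly_inflated("713910", date(2021, 9, 1)))
```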

9/7/2021

In Neighborhood Patterns from Jan 2018 to June 2021, there are 1-2 days per month where the stops_by_day column does not match the sum of the relevant elements in the stops_by_each_hour array.

A list of affected dates can be found here.

Yes, as of the July 2021 Neighborhood Patterns release. Historical data will be corrected in the next backfill for Neighborhood Patterns.
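To find the affected days in your own copy of the data, a check along these lines may help. It assumes stops_by_day holds one total per day and stops_by_each_hour holds 24 consecutive hourly values per day in the same order. That layout is an assumption made for this sketch, and the function name and sample values are hypothetical.

```python
def mismatched_days(stops_by_day, stops_by_each_hour):
    """Return zero-based indexes of days whose total differs from the sum of their hours."""
    if len(stops_by_each_hour) != 24 * len(stops_by_day):
        raise ValueError("expected 24 hourly values per day")
    bad = []
    for day, total in enumerate(stops_by_day):
        hourly = stops_by_each_hour[24 * day : 24 * (day + 1)]
        if sum(hourly) != total:
            bad.append(day)
    return bad


# Hypothetical three-day example in which the second day is inconsistent.
by_day = [24, 50, 48]
by_hour = [1] * 24 + [2] * 24 + [2] * 24
print(mismatched_days(by_day, by_hour))
```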

8/12/2021

There is a CBG in Manhattan around City Hall that shows 10x as many devices in Neighborhood Patterns as neighboring CBGs.

See this Community Slack thread

No, and resolution is unlikely because sources/sinks are sometimes inherent in GPS data. If this is affecting normalization, we recommend normalizing using state values as opposed to CBG values.
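The sketch below illustrates why state-level normalization is less sensitive to a single sink. It computes one population-per-device scaling factor per state instead of one per CBG. The row layout, function name, and numbers are hypothetical, and the population figures would have to come from a census source of your choosing.

```python
def state_scaling_factors(cbg_rows):
    """Compute population / devices per state from CBG-level rows."""
    population = {}
    devices = {}
    for row in cbg_rows:
        state = row["state"]
        population[state] = population.get(state, 0) + row["population"]
        devices[state] = devices.get(state, 0) + row["devices"]
    return {s: population[s] / devices[s] for s in population if devices[s] > 0}


# Hypothetical CBGs: "B" is a sink with 10x the devices of its neighbor "A".
rows = [
    {"cbg": "A", "state": "NY", "population": 1000, "devices": 50},
    {"cbg": "B", "state": "NY", "population": 1000, "devices": 500},
    {"cbg": "C", "state": "NY", "population": 2000, "devices": 250},
]
factors = state_scaling_factors(rows)
print(factors)
```

With CBG-level factors, A and B would be scaled by 20 and 2 respectively even though their populations match. The single state-level factor avoids that distortion at the cost of ignoring real local differences in panel coverage.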

8/6/2021

Close to 6,000 POIs (~0.1%) have visits assigned in Patterns after their closed_on dates.

See Relationship with opening and closing dates

This issue gets resolved with each backfill.
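Between backfills, one way to guard against this is to drop Patterns rows that start after a POI's closing month. The sketch below assumes closed_on is a "YYYY-MM" string (or missing for open POIs) and that each row has a date_range_start beginning with "YYYY-MM-DD". Both formats, the function name, and the sample rows are assumptions to verify against your files.

```python
def drop_visits_after_close(rows):
    """Keep rows with no closed_on value or a period month not after closed_on."""
    kept = []
    for row in rows:
        closed_on = row.get("closed_on")
        period_month = row["date_range_start"][:7]  # "YYYY-MM"
        # Zero-padded "YYYY-MM" strings sort chronologically, so string
        # comparison is enough here.
        if closed_on and period_month > closed_on:
            continue
        kept.append(row)
    return kept


# Hypothetical rows: the second one falls after the POI closed.
rows = [
    {"placekey": "hypothetical-1", "closed_on": "2021-03",
     "date_range_start": "2021-02-01", "raw_visit_counts": 40},
    {"placekey": "hypothetical-1", "closed_on": "2021-03",
     "date_range_start": "2021-04-01", "raw_visit_counts": 12},
    {"placekey": "hypothetical-2", "closed_on": None,
     "date_range_start": "2021-04-01", "raw_visit_counts": 75},
]
kept = drop_visits_after_close(rows)
print([r["raw_visit_counts"] for r in kept])
```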

7/29/2021

Neighborhood Patterns Home Panel Summary files have a small number of rows corresponding to Canadian neighborhoods.

No, but when Canada Neighborhood Patterns gets released, the Home Panel Summary files will have many more rows for Canadian neighborhoods, so this behavior will become standard in the future.

7/7/2021

Quotation marks in the iso_country_codes_open and iso_country_codes_closed columns in the Brand Info file are not encoded properly.

Yes. This was resolved in the August 2021 release of Core.

7/6/2021

The June 2021 version of Monthly Patterns appears to have large sinks of devices in a few CBGs, with device counts far greater than the population of those CBGs.

No, and resolution is unlikely because sources/sinks are sometimes inherent in GPS data. If this is affecting normalization, we recommend normalizing using state values as opposed to CBG values.

6/4/2021

Due to incorrect geometries, 6 U.S. POIs have Canadian geocodes, leading to some odd behavior in Supplementary Files.

Not yet. This will be resolved as soon as fixes to these geometries are ingested.

3/17/2021

A processing error in Social Distancing Metrics on 3/8/2021 resulted in an influx of devices on that day. This explains the sharp increase in both devices seen and completely home devices on this date.

No

3/17/2021

A processing error in Weekly Patterns on 3/3 caused a decrease in visits. We backfilled the week starting 3/1 to fix this.

Community Thread

Yes

1/12/2021

Values in certain Neighborhood Patterns columns were lower than expected.

Community Thread

Yes, in the July 2021 Backfill.

Prior to 2021 Known Issues

Date Reported

Description

Discussion/Links

Resolved?

11/18/2020

In Social Distancing Metrics (and possibly other datasets) there is an abnormal number of records showing travel to/from parts of Kansas. This is likely due to a GPS data problem related to the "center of the country" issue, which is known to influence a very small minority of location data when non-GPS data is inadvertently mixed with GPS data.

See this summary of known unexpected data trends for 2/25

SafeGraph continually works to ensure that the highest-quality location data is used to build its products and to reduce artifacts like this one.

8/30/2020

4/21/2019 (Easter) may be an anomalous day in Patterns data.

We had a supply issue at that time that seems to have artificially decreased the number of visits collected.

Actively investigating. Workaround is to ignore data from this day.

7/7/2020

Several inexplicably abnormal days of data in 2018. Dates affected: 3/15/2018, 9/15/2018, 9/16/2018

Community Discussion

No fix in the medium term. The short-term workaround is to omit these days completely if possible. Otherwise, replace them using median imputation or some other method so the days have no impact on analysis.
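The median imputation workaround might look like the sketch below, which replaces values on the three listed dates with the median of the unaffected days in the same series. The function name and sample values are hypothetical. In practice you may prefer a median over a narrower window, such as the same weekday in neighboring weeks.

```python
from datetime import date
from statistics import median

# The dates listed in this entry.
BAD_DAYS = {date(2018, 3, 15), date(2018, 9, 15), date(2018, 9, 16)}


def impute_bad_days(daily, bad_days=BAD_DAYS):
    """Replace values on known-bad dates with the median of the remaining days."""
    good = [v for d, v in daily.items() if d not in bad_days]
    if not good:
        raise ValueError("no unaffected days available")
    fill = median(good)
    return {d: (fill if d in bad_days else v) for d, v in daily.items()}


# Hypothetical daily visits around 3/15/2018.
daily = {
    date(2018, 3, 13): 100,
    date(2018, 3, 14): 110,
    date(2018, 3, 15): 5,
    date(2018, 3, 16): 120,
    date(2018, 3, 17): 130,
}
fixed = impute_bad_days(daily)
print(fixed[date(2018, 3, 15)])
```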

6/30/2020

opened_on column over-indexed on 2020-01

See opened_on documentation

4/13/2020

CBG FIPS are corrupted for some rows in Open Census Data file cbg_b22.csv

Unfortunately, there is no timeline for fixing this. Apologies for the inconvenience. However, our Slack Community members can see Jonas Peeters' solution.

4/6/2020

Duplicate CBGs with Different States

Yes. Ignore State in home-panel-summary and aggregate within CBG. Product fix coming soon.
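The workaround of ignoring the state and aggregating within each CBG can be sketched as follows. The column names census_block_group and number_devices_residing, the CBG identifiers, and the device counts are assumptions for illustration, so check them against the header of your home-panel-summary file.

```python
def devices_by_cbg(rows):
    """Sum number_devices_residing per CBG, ignoring the state column."""
    totals = {}
    for row in rows:
        cbg = row["census_block_group"]
        totals[cbg] = totals.get(cbg, 0) + row["number_devices_residing"]
    return totals


# Hypothetical rows: the first CBG appears twice with different states.
rows = [
    {"state": "al", "census_block_group": "010010201001", "number_devices_residing": 40},
    {"state": "ga", "census_block_group": "010010201001", "number_devices_residing": 3},
    {"state": "al", "census_block_group": "010010201002", "number_devices_residing": 55},
]
totals = devices_by_cbg(rows)
print(totals)
```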

4/2/2020

Problem with Iowa CBG 190570010001

Yes. This CBG has been removed from SDM.

3/1/2020

2/25/2020 Artifact (affecting SDM and Patterns)

See this summary of known unexpected data trends for 2/25

