On September 01, 2021, at 15:33 UTC, analytics and logging for imgix usage abruptly stopped. While logging was down, no customer analytics were recorded. This includes image bandwidth, Origin Image counts, and other usage data typically generated from image requests. The issue went unnoticed until the next day, and logging resumed on September 02 at 15:44 UTC, when a fix was pushed.
Customers lost approximately 24 hours of imgix analytics data, though we were able to completely recover Origin Image counts. The affected time range for missing analytics spans from September 01, 2021, 15:33 UTC to September 02, 2021, 15:44 UTC.
In the dashboard, this appears as dramatically lower bandwidth counts for the dates between 09/01/2021 and 09/02/2021. All other analytics data (such as network usage, audience analytics, and network health) will also show gaps for that time period.
On September 01 at 15:33 UTC, our engineering team deployed a breaking change that affected data logging in imgix. The change had been tested before being pushed to production, but we lacked monitoring on the key measurements that would have let us catch the issue before going live. Consequently, the issue went unnoticed until the next day, when one of our staff members noticed that the dashboard was not reporting any analytics data.
Once the issue was identified, our engineers rolled back the change to restore logging. While we were able to recover Origin Image counts, most of the other analytics data (bandwidth, audience analytics, network logs) was lost during the logging outage.
We will add monitoring that tracks metrics such as bandwidth and usage data and triggers internal alerts when those metrics deviate sharply from expected values. These changes will be implemented across all applicable systems.
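As a rough illustration of the kind of deviation check described above, the sketch below compares the latest reading of a metric against the median of recent readings and flags a sharp drop. It is not our production implementation: the function name `should_alert`, the `min_ratio` threshold of 0.5, and the sample bandwidth values are all hypothetical and chosen only for the example.

```python
from statistics import median


def should_alert(history, current, min_ratio=0.5):
    """Return True when `current` is below `min_ratio` times the median of `history`.

    `history` is a list of recent readings for one metric (for example,
    hourly bandwidth). All names and thresholds here are illustrative.
    """
    if not history:
        return False  # no baseline yet, so there is nothing to compare against
    baseline = median(history)
    if baseline <= 0:
        return False  # a zero baseline cannot reveal a drop
    return current < min_ratio * baseline


# Hypothetical hourly bandwidth readings, in GB.
hourly_bandwidth_gb = [120.0, 118.5, 131.2, 125.0, 122.3]

print(should_alert(hourly_bandwidth_gb, 0.0))    # logging stopped entirely
print(should_alert(hourly_bandwidth_gb, 119.0))  # normal traffic
```

A median baseline is used here so that a single unusual hour does not skew the comparison. A real check would likely also account for daily and weekly traffic patterns.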
We will also update our tooling so that we can recover and replay data if usage logging is disrupted again.