Elevated rendering errors
Incident Report for imgix

What happened?

On August 26, 2021, at 15:00 UTC, the imgix service experienced a disruption caused by long-running processes within our origin cache. Once our engineers identified the issue, remediation changes were applied at 15:09 UTC. After the changes were pushed out, the service sharply recovered at 15:20 UTC.

How were customers impacted?

Starting at 15:00 UTC, requests for non-cached derivative images returned a 503 response. These errors accounted for about 5% of all requests to the rendering service and continued until 15:20 UTC, when the service sharply recovered.

What went wrong during the incident?

While investigating the cause of the incident, our engineers identified a scenario in which origin connections were misbehaving due to customer configuration settings. On its own, this is not normally a problem. In this case, however, some origin activity caused the performance of the origin cache to degrade severely, which eventually affected rendering.

What will imgix do to prevent this in the future?

We will be modifying our infrastructure’s configuration to eliminate scenarios in which customer configurations can cause origin connection issues in our infrastructure. We will also be working with existing customers to optimize their configurations so that they are not affected by these changes.

Posted Sep 10, 2021 - 15:05 PDT

This incident has been resolved.
Posted Aug 26, 2021 - 09:07 PDT
Error rates remain at normal levels and we are continuing to monitor.
Posted Aug 26, 2021 - 08:59 PDT
Our team has applied mitigations and error rates have recovered to normal levels.
Posted Aug 26, 2021 - 08:27 PDT
We are currently investigating elevated render error rates for uncached derivative images. We will update once we obtain more information.

Previously cached derivatives are not impacted.
Posted Aug 26, 2021 - 08:09 PDT
This incident affected: Rendering Infrastructure.