Resolved
The incident is now resolved and the system is fully operational.
Some functions that encountered issues during the incident window may need to be manually re-run.
Monitoring
The fix has resolved the issue with function execution, and we are monitoring the system. The team has identified the cause as related to a deploy at around 22:53 UTC.
Functions that failed during the outage may need to be replayed.
Identified
We've deployed a fix to bring function execution back to normal operation. New functions are executing.
The issue with function execution may have caused certain functions to enter a failed state.
Identified
We've identified an issue with the system and have rolled back a configuration change to fix it.
We have also identified that function execution was affected for some users. Some users are seeing degraded performance, and others have encountered functions failing to execute. We are continuing our investigation to understand the scope of impact.
Investigating
We are actively investigating an issue with the dashboard. We will provide further updates as we identify the cause and resolve the issue.
Function execution and event ingestion are still working.