GitHub Status - Incident History

March 2026 to May 2026

May 2026

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

May 20, 16:58 - 20:14 UTC

Actions is experiencing degraded availability

On May 15, 2026, from approximately 07:43 UTC to 08:48 UTC, GitHub Actions experienced a degradation that caused workflow runs to fail or experience delayed starts for a subset of customers. The incident was triggered by a planned failover of supporting infrastructure used by GitHub Actions. During that operation, an automated service discovery update did not propagate correctly, which caused traffic to be routed incorrectly and increased request timeouts in a core dependency for workflow orchestration.

At peak impact, 42% of Actions runs failed. Downstream services that depend on Actions workflow execution were also impacted, including GitHub Pages and Copilot cloud services. At 08:12 UTC, responders manually corrected the service discovery routing issue. Timeout and failure rates recovered shortly after, and we continued monitoring until full stabilization was confirmed across all affected services. The incident was marked resolved at 08:48 UTC.

To prevent recurrence, we are implementing failover guardrails that validate service discovery state before completing failover operations, strengthening pre-flight and post-flight verification checks, and improving dependency resilience to reduce timeout cascades during infrastructure events.
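The guardrail described above, validating service discovery state before a failover is allowed to complete, could take roughly the following shape. This is a minimal sketch, not the actual implementation: the function names, the service names, and the idea of comparing an expected service-to-address map against what service discovery currently reports are all assumptions made for illustration.

```python
# Hypothetical sketch of a pre-completion check for a failover.
# All names and addresses here are illustrative assumptions.


def discovery_mismatches(expected, observed):
    """Compare two service -> address maps and describe every difference.

    expected: where each service should resolve after the failover.
    observed: what service discovery reports right now.
    """
    problems = []
    for service, address in sorted(expected.items()):
        seen = observed.get(service)
        if seen is None:
            problems.append(f"{service}: no entry in service discovery")
        elif seen != address:
            problems.append(f"{service}: expected {address}, found {seen}")
    return problems


def can_complete_failover(expected, observed):
    """Allow the failover to complete only if discovery matches expectations."""
    return not discovery_mismatches(expected, observed)


if __name__ == "__main__":
    expected = {"orchestrator": "10.0.2.10", "queue": "10.0.2.11"}
    observed = {"orchestrator": "10.0.1.10", "queue": "10.0.2.11"}
    for problem in discovery_mismatches(expected, observed):
        print(problem)
    print("safe to complete:", can_complete_failover(expected, observed))
```

Running the same comparison again after the failover would serve as the post-flight verification: a stale entry is reported before traffic is left pointing at the wrong address.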

May 15, 08:13 - 08:48 UTC

[Retroactive] Incident with GitHub.com

From 02:49 UTC to 03:04 UTC on May 15, 2026, GitHub.com was intermittently unavailable for a subset of customers. The impact has been mitigated and normal service has resumed. The issue was caused by a sudden spike in traffic. We've identified the source of the traffic and prevented further disruption.

May 15, 02:30 - 02:30 UTC

Incident with CodeQL

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

May 13, 14:41 - 16:03 UTC

Incident with CodeQL, Webhooks, Notifications, and Slack Integration

This incident has been resolved. Thank you for your patience and understanding as we addressed this issue. A detailed root cause analysis will be shared as soon as it is available.

May 12, 14:38 - 17:43 UTC

Incident with high errors on Git Operations

On May 11, 2026, between 14:00 UTC and 14:33 UTC, HTTP-based Git read operations were degraded. The error rate averaged 2.8% and peaked at 7.5% of requests to the service. This was due to resource exhaustion in a networking gateway between GitHub.com’s frontend service for Git operations and a dependency service that performs authentication and authorization. Following the initial spike, the frontend service became stuck in a degraded state in one of our data centers, increasing time to mitigation.

We mitigated the incident by scaling the networking gateway and re-deploying the frontend service.

To reduce our time to detection and mitigation in the future, we are adding auto-scaling to the networking gateway and fixing a bug that caused the frontend service to remain degraded.

May 11, 14:25 - 14:33 UTC

CCR and CCA failing to start for PR comments

On May 7, 2026, between 04:12 UTC and 06:13 UTC, Copilot Cloud Agent and Copilot Code Review Agent sessions for pull requests were delayed or failed to start.

The issue was caused by follow-up recovery work from a separate Pull Requests incident (https://www.githubstatus.com/incidents/f5pb5d5mr9yh). As part of that recovery, we ran a large database migration, which caused replication delays on several replica hosts.

Although those replicas were not serving user traffic, our safeguards correctly treated the elevated replication lag as a signal to slow down writes to the affected database cluster. As a result, some pull request background processing was temporarily delayed. That processing is responsible for sending the internal events that Copilot agents use to begin work, so affected agents did not start until the database replicas caught up.

The system recovered once replication lag returned to normal and pull request processing resumed. We are reviewing how this safeguard interacts with recovery migrations so we can reduce the chance of similar secondary impact during future incident recovery work.
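A safeguard of this kind, one that slows writes as replication lag rises, can be sketched as follows. The report does not describe how the safeguard is implemented, so everything here is an assumption for illustration: the function name, the thresholds, and the linear ramp between a soft and a hard lag limit.

```python
# Hypothetical sketch of lag-based write throttling.
# Thresholds and the linear ramp are illustrative assumptions.


def write_delay_seconds(replica_lags, soft_limit=1.0, hard_limit=10.0, max_delay=2.0):
    """Return how long a writer should pause before its next write.

    replica_lags: replication lag, in seconds, for each replica in the cluster.
    Below soft_limit there is no delay; at or above hard_limit the delay is
    max_delay; in between, the delay grows linearly with the worst lag.
    """
    worst = max(replica_lags, default=0.0)
    if worst <= soft_limit:
        return 0.0
    if worst >= hard_limit:
        return max_delay
    return max_delay * (worst - soft_limit) / (hard_limit - soft_limit)


if __name__ == "__main__":
    print(write_delay_seconds([0.2, 0.5]))   # healthy cluster
    print(write_delay_seconds([0.2, 30.0]))  # one replica far behind
```

Because the sketch keys off the worst lag in the cluster, a single lagging replica slows writes even when it serves no user traffic, which matches the secondary impact described above.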

May 7, 05:02 - 06:56 UTC

Incident with Pull Requests

On May 6, 2026, between 15:12 UTC and 19:02 UTC, creation of new pull request review threads on GitHub.com failed. This included new line comments and file comments on pull requests. Existing PRs and previously created comments were unaffected.

This incident was caused by a 32-bit integer key reaching its maximum value in a Vitess lookup table used during PR thread creation. The primary table had been migrated to a 64-bit integer key, but the Vitess lookup table remained 32-bit. Once the values in the primary table exceeded the 32-bit ID space available in the lookup table, attempts to create new review threads began failing, resulting in a near-100% failure rate for new thread creation requests. We mitigated the issue by updating the impacted lookup table definitions across all shards to use 64-bit integer column types, increasing the available ID range and restoring normal operation. Service was fully restored once the schema changes completed globally.

To help prevent similar incidents, we are expanding our existing monitoring of database columns to include Vitess lookup tables, so that we are notified in advance of any table that is approaching a column size limit. This work is intended to detect such columns before customer impact occurs.
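The kind of check this monitoring implies can be illustrated with a short sketch. It is not the actual monitoring: the column names, the input layout, and the 80% alert threshold are assumptions. The integer limits are the standard ranges for MySQL-style INT and BIGINT columns.

```python
# Hypothetical sketch: report integer ID columns that are close to their type's limit.
# Column names, the tuple layout, and the 0.8 threshold are illustrative assumptions.

# Maximum values for MySQL-style integer column types, keyed by (type, unsigned).
MAX_VALUE = {
    ("int", False): 2**31 - 1,     # signed 32-bit
    ("int", True): 2**32 - 1,      # unsigned 32-bit
    ("bigint", False): 2**63 - 1,  # signed 64-bit
    ("bigint", True): 2**64 - 1,   # unsigned 64-bit
}


def id_space_used(current_max, column_type, unsigned=False):
    """Return the fraction (0.0 to 1.0) of the ID range already consumed."""
    return current_max / MAX_VALUE[(column_type, unsigned)]


def columns_near_limit(columns, threshold=0.8):
    """Return (name, fraction) pairs at or above threshold, fullest first.

    columns is an iterable of (name, column_type, unsigned, current_max).
    """
    flagged = []
    for name, column_type, unsigned, current_max in columns:
        used = id_space_used(current_max, column_type, unsigned)
        if used >= threshold:
            flagged.append((name, used))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = [
        ("review_threads.id", "bigint", False, 2_100_000_000),
        ("review_threads_lookup.id", "int", False, 2_100_000_000),
    ]
    for name, used in columns_near_limit(sample):
        print(f"{name}: {used:.1%} of ID range used")
```

The sample shows why lookup tables need to be checked separately from the primary table: the same current maximum is negligible in a 64-bit column but nearly exhausts a 32-bit one.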

May 6, 15:25 - 19:04 UTC
