“Cloudflare outage cuts access to X, ChatGPT, Perplexity, Spotify and other web platforms for thousands.”

Cloudflare outage on 18 November 2025—what led to it, what was affected, how it was resolved, and why it matters

What happened

  • At around 11:20 UTC (≈ 4:50 pm IST), Cloudflare’s network began failing to deliver core traffic to its customers. 

  • The immediate symptom: many websites and apps returned HTTP 500 errors or a message asking visitors to “unblock challenges.cloudflare.com to proceed”.

  • A number of major platforms were affected, including ChatGPT, Spotify, Perplexity, X (formerly Twitter), Canva, gaming services like League of Legends, and even public services such as NJ Transit in the U.S. 

The root cause

  • Cloudflare says the problem was not a malicious cyber-attack. 

  • Instead, a routine configuration change triggered a latent bug. A database change caused a “feature file” used by Cloudflare’s bot-management system to grow much larger than expected. That oversized file was propagated across Cloudflare’s network and exceeded a size limit in the software that reads it, causing cascading failures (a simplified sketch of this failure mode follows this list). 

  • Because Cloudflare handles many critical services (CDN, DNS, DDoS protection, traffic routing) for large parts of the internet, when its network faltered the failure had an outsized impact. 
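
To make that failure mode concrete, here is a minimal, hypothetical Python sketch. The names, the 200-entry limit, and the duplicate-row simulation are illustrative assumptions, not Cloudflare’s actual code; the point is only that a consumer with a hard size limit can fail outright when an upstream change makes a generated file larger than expected.

```python
# Hypothetical sketch of the failure mode described above; names, limits and
# behaviour are illustrative assumptions, not Cloudflare's actual code.

MAX_FEATURES = 200  # assumed hard limit baked into the consuming software


def load_feature_file(lines: list[str]) -> list[str]:
    """Parse a bot-management 'feature file', refusing oversized input."""
    features = [line.strip() for line in lines if line.strip()]
    if len(features) > MAX_FEATURES:
        # In the incident, hitting a limit like this made the consumer fail
        # rather than degrade gracefully, and the oversized file had already
        # been propagated network-wide.
        raise ValueError(
            f"feature file has {len(features)} entries, limit is {MAX_FEATURES}"
        )
    return features


def generate_feature_file(rows: list[dict]) -> list[str]:
    """Build the file from a database query; a change that surfaces
    duplicate rows silently doubles its size."""
    return [row["feature_name"] for row in rows]


# Simulate the bug: duplicated rows push the generated file past the limit.
rows = [{"feature_name": f"feature_{i}"} for i in range(150)]
duplicated = rows + rows  # 300 entries after the unexpected duplication
try:
    load_feature_file(generate_feature_file(duplicated))
except ValueError as err:
    print("consumer rejected file:", err)
```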

Impact

  • Outage-monitoring services such as Downdetector recorded spikes of thousands of user reports for the affected platforms.

  • The outage affected not only customer-facing platforms but also Cloudflare’s own internal services (dashboard, API, control plane), which complicated recovery. 

  • Because numerous unrelated companies relied on the same infrastructure provider, many separate industries saw disruptions: streaming/music, AI/chatbots, gaming, transportation, retail.

  • Some services experienced cascading effects: when the gatekeeper infrastructure fails, access to many downstream services is impacted.

How it was resolved

  • Cloudflare identified the feature-file issue, rolled back to an earlier version, and stopped the propagation of the faulty file. 

  • By around 14:30 UTC, the majority of core traffic was restored. Full remediation, including the dashboard and API, completed later. 

  • Cloudflare issued an apology and committed to investigation and monitoring to prevent recurrence. 

Why it matters

  • It highlights a key vulnerability in modern digital infrastructure: centralization of critical routing/security services means a fault in one provider cascades broadly.

  • Even platforms you think of as “independent” (e.g., apps, games) can be taken offline if they rely on a shared backbone.

  • For business continuity and risk management, companies that rely on third-party infrastructure must consider redundant architectures and contingency planning, such as multi-provider setups (a minimal failover sketch follows this list).

  • For users, it’s a reminder that service reliability isn’t just about “the app” but the underlying network layer, which often operates out of sight.
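
As an illustration of the multi-provider point above, here is a minimal Python sketch of client-side fallback between two endpoints. The URLs, timeout, and error handling are assumptions for illustration, not vendor guidance.

```python
# Minimal illustration of client-side fallback across providers; the URLs and
# error handling below are hypothetical assumptions, not a recommendation of
# specific vendors.
import urllib.error
import urllib.request

ENDPOINTS = [
    "https://primary.example.com/health",    # fronted by provider A (assumed)
    "https://secondary.example.net/health",  # same service behind provider B (assumed)
]


def fetch_with_fallback(urls: list[str], timeout: float = 3.0) -> bytes:
    """Try each endpoint in order, returning the first successful response."""
    last_error = None
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:
                    return resp.read()
                last_error = RuntimeError(f"{url} returned HTTP {resp.status}")
        except (urllib.error.URLError, TimeoutError) as err:
            last_error = err  # provider-level failure: try the next endpoint
    raise RuntimeError(f"all endpoints failed: {last_error}")


if __name__ == "__main__":
    try:
        print(fetch_with_fallback(ENDPOINTS)[:80])
    except RuntimeError as err:
        print("service unavailable:", err)
```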

Key takeaways

  • The outage was not due to hacking; it stemmed from a routine change that triggered a latent bug.

  • The knock-on effect on many platforms was substantial, even though those platforms had no direct faults of their own.

  • Recovery was fairly swift (a few hours) but still disruptive enough to make headlines.

  • For users in India and the wider Asia region: global infrastructure issues don’t respect time zones; if a backbone provider fails, users anywhere can see the effects.

  • Going forward: expect more scrutiny of infrastructure providers and maybe more diversification in how services are architected.
