What caused the Cloudflare outage?
The outage was caused by a bug in Cloudflare's Bot Management system, triggered by a database permissions change that led to an oversized feature file.
On November 18, 2025, a significant outage at Cloudflare, a major internet infrastructure provider, caused widespread disruptions across numerous online services and websites. This incident highlights the critical role Cloudflare plays in t...
### Background

Cloudflare acts as a critical intermediary for a significant portion of internet traffic, providing CDN, security, and DNS services. Its outage cascaded through the internet, demonstrating how interconnected the web has become.
### Technical Breakdown

The outage was triggered by a permissions change in a ClickHouse database system. The change caused the Bot Management system to generate an unexpectedly large "feature file." That file was propagated across the network, where it exceeded a limit in the software that consumes it and caused that software to fail. Users saw HTTP 5xx errors and increased latency.
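To make the failure mode concrete, here is a minimal Python sketch of a loader that enforces a hard cap on the number of features in a file. It is not Cloudflare's code: the names `MAX_FEATURES`, `FeatureLimitExceeded`, and `load_features`, the one-feature-per-line format, and the limit value are all assumptions chosen for illustration. The oversized input is simulated by duplicating rows, which is one way a file could grow unexpectedly.

```python
MAX_FEATURES = 200  # hypothetical cap, chosen for illustration only


class FeatureLimitExceeded(Exception):
    """Raised when a feature file holds more entries than the loader allows."""


def load_features(lines, max_features=MAX_FEATURES):
    """Parse one feature name per non-empty line and enforce the size cap."""
    features = [line.strip() for line in lines if line.strip()]
    if len(features) > max_features:
        raise FeatureLimitExceeded(
            f"{len(features)} features exceeds limit of {max_features}"
        )
    return features


# A file within the cap loads normally.
normal = [f"feature_{i}" for i in range(150)]

# Simulated bad file: the same rows twice, so the file doubles in size.
oversized = normal + normal
```

Calling `load_features(normal)` returns the 150 names, while `load_features(oversized)` raises `FeatureLimitExceeded`. What the caller does with that error decides whether one bad file degrades a single feature or takes down the whole request path.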
### Impacted Services

- **Core CDN and Security Services:** HTTP 5xx status codes were widely served.
- **Turnstile:** Failed to load.
- **Workers KV:** Returned elevated HTTP 5xx errors.
- **Dashboard:** Most users were unable to log in.
- **Email Security:** Temporary loss of access to an IP reputation source.
- **Access:** Widespread authentication failures.
### Remediation Steps

Cloudflare addressed the issue by:

1. Stopping the generation and propagation of the bad feature file.
2. Manually inserting a known good file into the distribution queue.
3. Restarting the core proxy.
### Lessons Learned

Cloudflare is implementing measures to prevent future incidents, including:

- Hardening ingestion of Cloudflare-generated configuration files.
- Enabling more global kill switches for features.
- Eliminating the ability for core dumps to overwhelm system resources.
- Reviewing failure modes for error conditions across all core proxy modules.
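The first two measures can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not a description of Cloudflare's implementation: the `FeatureStore` class, its validation rules, and the default limit are hypothetical. The idea is to treat an internally generated file with the same suspicion as user input, keep serving the last known good version when a new one fails validation, and expose a switch that disables the feature entirely.

```python
class FeatureStore:
    """Holds the active feature list and falls back safely on bad input."""

    def __init__(self, initial, max_features=200):
        self.active = list(initial)       # last known good configuration
        self.max_features = max_features  # hypothetical limit
        self.enabled = True               # global kill switch for the feature

    def validate(self, candidate):
        """Return a reason string if the candidate is unusable, else None."""
        if not candidate:
            return "empty file"
        if len(candidate) > self.max_features:
            return f"{len(candidate)} entries exceeds limit {self.max_features}"
        if len(set(candidate)) != len(candidate):
            return "duplicate entries"
        return None

    def ingest(self, candidate):
        """Apply a new file only if it validates; otherwise keep the old one."""
        problem = self.validate(candidate)
        if problem is not None:
            return False, problem
        self.active = list(candidate)
        return True, "applied"

    def features(self):
        """Return active features, or nothing when the kill switch is off."""
        return self.active if self.enabled else []
```

With this shape, a rejected file leaves `features()` unchanged, and setting `enabled = False` turns the feature off without a restart. A real system would also log and alert on every rejection so that a stale configuration does not go unnoticed.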
### How to Prepare

- **Diversify Services:** Don't rely on a single provider for all critical services.
- **Monitor Status Pages:** Keep an eye on the status pages of your key service providers.
- **Implement Redundancy:** Where possible, implement redundant systems to ensure business continuity.
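As one possible way to act on the redundancy advice, the Python sketch below tries a list of endpoints in order and returns the first response that is not a server error. The helper names `http_get` and `fetch_with_failover` are hypothetical, and any endpoint URLs you pass in are your own. It assumes the same content is reachable through more than one provider, which has to be set up separately.

```python
import urllib.error
import urllib.request


def http_get(url, timeout=5.0):
    """Fetch a URL and return (status, body); HTTP errors return their status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, b""


def fetch_with_failover(endpoints, fetch=http_get):
    """Try each endpoint in order; return the first non-5xx response."""
    failures = []
    for url in endpoints:
        try:
            status, body = fetch(url)
        except OSError as err:  # DNS failures, timeouts, refused connections
            failures.append((url, repr(err)))
            continue
        if status >= 500:
            failures.append((url, f"HTTP {status}"))
            continue
        return url, status, body
    raise RuntimeError(f"all endpoints failed: {failures}")
```

The `fetch` parameter is injectable so the failover logic can be exercised without network access. In practice this kind of client-side fallback complements, rather than replaces, DNS-level or load-balancer-level failover.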
### Who This Affects Most

Businesses and individuals who rely heavily on Cloudflare for their online presence and services were most affected by this outage.
Core traffic was largely restored by 14:30 UTC, with all systems functioning normally by 17:06 UTC on November 18, 2025.
Services affected included core CDN, Turnstile, Workers KV, the Cloudflare Dashboard, Email Security, and Access.
This article was compiled by Yanuki using publicly available data and trending information. The content may summarize or reference third-party sources that have not been independently verified. While we aim to provide timely and accurate insights, the information presented may be incomplete or outdated.