Scaling Security Insights: how we achieved a 10x increase in global scanning capacity
Curated from Cloudflare Blog
If you maintain high-throughput security systems in a distributed environment, this article offers practical lessons in scaling without a proportional increase in infrastructure. Cloudflare’s work on Kafka consumers, Postgres queries, and API performance shows how architectural efficiency can yield substantial gains in processing capacity. The focus on throughput improvements, rather than simply scaling out, will be familiar to SRE and DevOps teams trying to maximize resource utilization. Reaching a tenfold increase in scanning capacity without adding hardware makes this a compelling case study in performance engineering. For practitioners, the takeaway is clear: profile and optimize bottlenecks at the component level before reaching for horizontal scaling.
Cloudflare Security Insights system now processes over 120 scans per second, providing frequent insights for all customers. By optimizing Kafka consumers, Postgres queries, and our API, we scaled our throughput 10x without adding hardware.
— Cloudflare Blog