
How 2025’s Cloud Outages Affected Crypto - And Why Ankr Stayed Online

Kevin Dwyer

December 8, 2025

13 min read


In 2025, crypto wasn’t compromised by a new DeFi exploit or a catastrophic consensus bug. It was stress-tested by something much more mundane: the same centralized infrastructure that the broader internet has been relying on for years.

Both AWS and Cloudflare went down. More than once. And when this happened, familiar patterns unfolded:

  • CEXes froze withdrawals.

  • dApp and DeFi front ends stopped loading.

  • Wallets and dashboards showed zero balances or timed out.

  • Users asked the uncomfortable question out loud:

    If a single company’s outage can take my “decentralized” app offline, how decentralized is this really?

Throughout all of this, Ankr’s node and RPC infrastructure continued to serve traffic. That wasn’t a coincidence; it was the result of very specific choices about how our network is built: bare-metal servers, our own private global network, and a blockchain-aware load balancer that routes around the exact kinds of failures that took others down.

This article keeps the focus on three things:

  1. A clear timeline of the 2025 outages that hit the crypto industry.
  2. Why those incidents didn’t impact Ankr’s node layer (and by extension the protocols that rely on us).
  3. A concrete playbook for teams that want their dApps, DEXes, and exchanges to stay online the next time AWS or Cloudflare has a bad day.

The goal is not to rehash headlines. It is to make it obvious that resilience is an architectural choice, not a marketing slogan.

The 2025 Outage Timeline

October 20 - The AWS us-east-1 incident that took down exchanges and L2s

On October 20, AWS suffered a significant regional outage in its us-east-1 region. The root cause was a DNS resolution failure that rippled across core services.

For crypto, the impact was immediate and very public:

  • Coinbase reported severe service disruptions, with more than three hours of degraded performance for users across its platform.
  • Robinhood experienced similar downtime as its trading systems, also hosted on AWS, were affected.
  • Base and other Ethereum L2s saw infrastructure issues and reduced availability as components tied to AWS became unreachable.

None of this was a blockchain failure. Ethereum and the L2s in question kept producing blocks. What stalled was the ability of users to access these systems because core infrastructure had been concentrated in a single cloud region.

The post-mortems did not tell us anything conceptually new. They confirmed something the industry has known for years: a large percentage of crypto’s “critical path” runs through us-east-1.

November 18 - Cloudflare’s global outage

Less than a month later, the spotlight shifted from AWS to Cloudflare.

On November 18, 2025, a bug in Cloudflare’s Bot Management system generated an oversized internal feature file. As that file propagated across their network and exceeded system limits, core proxy services began to fail worldwide.

From a user perspective, it was simple:

  • Requests to many Cloudflare-protected sites returned Cloudflare error pages instead of application responses.
  • A wide range of web services, including crypto platforms, appeared “down” even though their origin servers and nodes were still running.

The important detail here is where the failure sat in the stack. The outage did not hit databases or app servers directly. It hit the CDN and security layer that so many apps, including Web3 front ends, now route through by default.

December 5 - Cloudflare’s second outage hits Coinbase, Kraken, and DeFi UIs

On December 5, Cloudflare suffered another major incident. This time, a coding error in a Web Application Firewall change meant to address a vulnerability broke a significant portion of its traffic. Around 28 percent of Cloudflare’s traffic was affected for roughly 25 minutes.

The crypto-specific impact was straightforward and very visible:

  • Coinbase and Kraken experienced disruptions as Cloudflare routing and security features failed in front of their web properties.
  • Multiple DeFi front ends and crypto platforms saw service interruptions, with users unable to load interfaces or submit transactions through impacted sites.

Again, the underlying chains and their smart contracts didn’t break. The failure happened in the internet infrastructure between users and the chain.

The bigger backdrop - “soft” outages at infra layers

Alongside the headline AWS and Cloudflare events, the year was dotted with more subtle, but still painful, infra issues:

  • Partial or regional disruptions at node providers that affected RPC performance and availability.
  • Wallets and Web3 dashboards that rely on a single provider suddenly showing errors or empty data.
  • Explorer and indexing outages that left users without visibility into what was actually happening on chain.

Most users do not differentiate between a full outage and degraded performance. If their transaction does not go through, or their favorite app does not load, the result feels exactly the same. Taken together, 2025 drew a clear map of where the real chokepoints are.

What These Outages Actually Told Us

Strip away the incident reports and status pages, and 2025 surfaced one core truth:

Blockchains were mostly fine, but access to them was not.

The outages were symptoms of deeper architectural choices that the industry has quietly made over the last cycle.

Concentration in a few cloud regions

Many exchanges, wallets, and infra providers have ended up heavily concentrated in a small number of cloud regions, especially AWS us-east-1. When that region has trouble, there is no real failover strategy. The entire stack above it feels the same failure at once.

Single CDN or security layer in front of everything

Cloudflare has been an obvious choice for performance and protection, so a huge fraction of the internet sits behind its proxies. When Cloudflare misconfigures a rule or ships a buggy update, anything behind it is effectively down from the user’s perspective, even if origin servers are perfectly healthy.

Node providers that sit on the same foundations

Many node and RPC providers that are supposed to help diversify infrastructure are themselves almost entirely built on the same large clouds and regions. So when AWS has a bad day, those providers have a bad day too, and that ripples out to thousands of dApps that point to a single RPC URL.

Policy and jurisdiction as hidden risks

Even when the hardware is up, centralized infra providers are subject to local laws, sanctions pressure, and regulatory requests. We have already seen examples in prior years where access to certain chains or services was geofenced or restricted at the infra level. That risk remains, and it is structurally at odds with Web3’s stated goals.

In other words, a large part of crypto is telling a decentralization story on top of a very centralized dependency graph.

Why None Of This Took Ankr Down

While AWS and Cloudflare incidents were taking pieces of the ecosystem offline, Ankr’s infrastructure kept answering RPC calls.

That resilience wasn’t accidental; it was the result of three long-term decisions:

  1. Build on bare metal instead of treating public cloud as the default.
  2. Move traffic from third-party CDNs onto a private global network.
  3. Use a blockchain-native load balancer that understands chains, not just HTTP codes.

Bare-metal servers instead of cloud monoculture

Most node providers default to AWS, GCP, or similar platforms. Ankr took a different route.

  • We operate a blend of bare-metal and cloud servers across more than 30 global regions, with a heavy emphasis on dedicated bare-metal environments optimized for blockchain workloads.
  • These machines are placed in independent data centers around the world instead of being tightly coupled to a single cloud vendor’s zones.
  • Hardware is tuned for high I/O, memory, and networking throughput that full nodes, validators, and indexers require.

When AWS us-east-1 stumbled in October, it didn’t automatically pull Ankr’s node fleet down with it. Our core capacity is not bound to a single hyperscaler region. That is the structural difference.

Migrating RPC traffic to a private global fiber network

Historically, like many others, Ankr relied on Cloudflare for some edge and routing functions. Recently, we completed one of the most important upgrades in our history: migrating our RPC traffic off Cloudflare and onto a private global fiber network operated by our sister company, Asphere.

This change had several immediate effects:

  • User traffic between clients and Ankr endpoints now travels predominantly over a private backbone instead of hopping across the public internet.
  • Routing decisions are controlled end-to-end by Ankr and Asphere, which gives us more visibility and more options when rerouting around congestion or local issues.
  • Cloudflare outages no longer impact our RPC API traffic, which means the November 18 and December 5 incidents did not automatically translate to downtime on Ankr’s side.

A blockchain-aware load balancer

On top of bare metal and private fiber, Ankr runs a custom load balancer written in Go and designed specifically for blockchain traffic. This is not a generic web load balancer. It has opinions about chains.

Concretely, it can:

  • Route based on chain height and sync status, not just response codes.
  • Differentiate between full nodes, archive nodes, and validators, and send different types of requests to appropriate targets.
  • Continuously monitor latency, error rates, and health signals per region, and shift traffic away from nodes or data centers that show degradation.

When there is a localized issue, the system does not wait for something to completely fail. It quietly starts sending traffic to healthier parts of the network. Combined with the private backbone, this gives us a lot of room to maneuver around real-world incidents.
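
To make that concrete, here is a minimal sketch of the kind of chain-aware selection logic described above. It is illustrative only, not Ankr’s actual implementation: the struct fields, lag threshold, and example heights are assumptions. The core idea is that a backend must be both responsive and caught up with the chain before it receives traffic.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Backend is an illustrative view of one upstream node as a balancer sees it.
// Field names and thresholds are assumptions for this sketch, not Ankr's schema.
type Backend struct {
	Name        string
	BlockHeight uint64        // latest block the node reports
	Latency     time.Duration // recent latency from health probes
	Healthy     bool          // result of basic liveness checks
}

// pickBackend returns a node that is both healthy and within maxLag blocks of
// the best height seen across the fleet, preferring the lowest latency.
// A generic HTTP balancer stops at "returns 200"; this also asks
// "is it actually caught up with the chain?".
func pickBackend(backends []Backend, maxLag uint64) (Backend, error) {
	var best uint64
	for _, b := range backends {
		if b.Healthy && b.BlockHeight > best {
			best = b.BlockHeight
		}
	}

	var candidates []Backend
	for _, b := range backends {
		if b.Healthy && best-b.BlockHeight <= maxLag {
			candidates = append(candidates, b)
		}
	}
	if len(candidates) == 0 {
		return Backend{}, fmt.Errorf("no backend within %d blocks of height %d", maxLag, best)
	}

	sort.Slice(candidates, func(i, j int) bool {
		return candidates[i].Latency < candidates[j].Latency
	})
	return candidates[0], nil
}

func main() {
	fleet := []Backend{
		{Name: "bm-eu-1", BlockHeight: 21_000_120, Latency: 38 * time.Millisecond, Healthy: true},
		{Name: "bm-us-2", BlockHeight: 21_000_119, Latency: 22 * time.Millisecond, Healthy: true},
		{Name: "bm-ap-1", BlockHeight: 20_999_800, Latency: 19 * time.Millisecond, Healthy: true}, // fast but lagging
	}
	b, err := pickBackend(fleet, 3)
	if err != nil {
		panic(err)
	}
	fmt.Println("routing to:", b.Name) // bm-us-2: fast and within the lag budget
}
```

The design choice worth noting is that block height participates in routing on equal footing with latency, which is exactly what a purely HTTP-level balancer cannot do.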

Result: Node infrastructure that keeps running when others pause

During the outages highlighted earlier:

  • The AWS us-east-1 incident hit cloud-heavy stacks directly, but Ankr’s node fleet continued serving supported chains because we are not anchored to that one region.
  • The November 18 and December 5 Cloudflare outages disrupted exchanges and DeFi interfaces that sat behind Cloudflare’s network, while Ankr’s RPC API, already migrated to Asphere’s private network, continued to operate independently of Cloudflare’s status.

Some projects that use Ankr likely had their own front ends or microservices impacted by those incidents if they still depended on AWS or Cloudflare at other layers. But at the node and RPC layer, the infrastructure they were plugged into remained online. That is the difference this architecture is designed to create.

What This Means for Protocols Building On Ankr

When an outage happens upstream, there are really two questions that matter for any protocol or platform:

  1. Can your users still reach the chain?
  2. Can your backend still stay in sync with the chain?

If the answer to both is yes, you have options, even if one UI, one region, or one provider is having a problem.

Because Ankr’s infrastructure stayed up during the 2025 incidents:

  • Protocols using Ankr for their primary RPC had a live, functioning path to the chain, even while some centralized exchanges, wallets, or frontends were struggling.
  • Indexers, sequencers, and backend systems that relied on Ankr could keep processing blocks and events instead of falling behind.
  • Teams had the option to route advanced users to direct RPC access or alternate interfaces while they worked on restoring any affected front-end or cloud-hosted components.

This is the practical meaning of “nearly impervious” to cloud and CDN outages. It does not mean the broader internet will never fail around you. It means that the core piece that talks to the blockchain is built to keep running when it does.

What Should Change For dApps, DEXes, and Exchanges

The lesson from 2025 is not that AWS or Cloudflare are “bad.” They deliver enormous value and will continue to power a large share of the internet.

The lesson is that if your protocol’s liveness depends on a small number of providers and regions that can fail in correlated ways, you do not actually have the resilience you think you have.

A few realities are now hard to ignore:

  1. A chain can be decentralized while access to it remains centralized in practice. The failures users experienced this year were almost all in access layers, not consensus layers.
  2. Diversification is not just “multi-cloud.” If both of your setups still rely on the same CDN or same dominant region, you have not really diversified your risk.
  3. Infrastructure design is now a trust signal. Users notice who keeps working during bad weeks for the internet. Uptime is not just a KPI; it is a narrative about whether your protocol is built like an experiment or like a utility.

For teams that want to respond to this moment instead of just hoping the next outage does not hit them, there are some concrete moves to make.

A Practical Resilience Playbook

You do not need to rebuild your entire stack overnight. But if you want your protocol to behave differently next time AWS or Cloudflare goes sideways, some choices matter more than others.

Decouple your node layer from hyperscale clouds

  • Treat cloud regions as optional, not mandatory, for your core blockchain connectivity.
  • Use providers that operate their own distributed bare-metal stacks and are not fully dependent on one cloud vendor or region.

Pointing your RPC traffic at Ankr is one way to do this in practice. It does not stop you from using AWS for other things, but it removes a major single point of failure in the path between your app and the chain.
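
As a rough sketch of what this looks like in client code, the snippet below keeps an ordered list of independent RPC endpoints and fails over on errors or timeouts. Treat the URLs as placeholders: the first follows the pattern of Ankr’s public endpoints, the second stands in for any provider hosted on different infrastructure, and the error handling is deliberately minimal.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"time"
)

// endpoints is an ordered list of independent RPC providers. The URLs are
// placeholders for this sketch; the point is that they should not share the
// same cloud region or the same CDN in front of them.
var endpoints = []string{
	"https://rpc.ankr.com/eth",       // primary: provider that terminates on its own network
	"https://backup-rpc.example.com", // hypothetical second provider on different infrastructure
}

// blockNumber sends a standard JSON-RPC eth_blockNumber request and returns the
// raw hex result, failing over to the next endpoint on any error.
func blockNumber() (string, error) {
	payload, _ := json.Marshal(map[string]any{
		"jsonrpc": "2.0", "id": 1, "method": "eth_blockNumber", "params": []any{},
	})
	client := &http.Client{Timeout: 5 * time.Second}

	var lastErr error
	for _, url := range endpoints {
		resp, err := client.Post(url, "application/json", bytes.NewReader(payload))
		if err != nil {
			lastErr = err
			continue // provider unreachable: try the next one
		}
		var body struct {
			Result string `json:"result"`
		}
		err = json.NewDecoder(resp.Body).Decode(&body)
		resp.Body.Close()
		if err != nil || body.Result == "" {
			lastErr = fmt.Errorf("bad response from %s: %v", url, err)
			continue
		}
		return body.Result, nil
	}
	return "", fmt.Errorf("all RPC endpoints failed, last error: %w", lastErr)
}

func main() {
	head, err := blockNumber()
	if err != nil {
		panic(err)
	}
	fmt.Println("latest block (hex):", head)
}
```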

Reduce your exposure to a single CDN in the critical path

  • If you rely on Cloudflare or similar services for everything from DNS to WAF to proxying, assume that a failure there will make your app look “down” to users.
  • For your RPC and node access, prefer providers that terminate requests on their own networks rather than sitting behind a third-party CDN.

Ankr’s migration to Asphere’s private backbone is an example of what that looks like at scale.

Use infra that is aware of blockchain behavior

Generic web load balancers cannot tell the difference between a node that is one block behind and one that is hundreds of blocks behind. For critical systems, that is not acceptable.

  • Prefer infra that checks sync status, role, and latency as first-class health signals.
  • Make sure your provider can automatically route around lagging or unhealthy nodes.

This is exactly why Ankr invested in a blockchain-native load balancer instead of relying solely on standard web tooling.
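
The snippet below sketches what such a health signal can look like on the consumer side: a node is only treated as healthy if it answers a standard eth_blockNumber call and its reported height is within a small lag budget of the best height seen elsewhere. The URL, threshold, and reference height are illustrative assumptions, not a description of any specific provider’s checks.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// probe asks one node for its latest block number via JSON-RPC.
func probe(url string) (uint64, error) {
	payload, _ := json.Marshal(map[string]any{
		"jsonrpc": "2.0", "id": 1, "method": "eth_blockNumber", "params": []any{},
	})
	client := &http.Client{Timeout: 3 * time.Second}
	resp, err := client.Post(url, "application/json", bytes.NewReader(payload))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	var body struct {
		Result string `json:"result"` // hex-encoded height, e.g. "0x1406f3c"
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return 0, err
	}
	if len(body.Result) < 3 || body.Result[:2] != "0x" {
		return 0, fmt.Errorf("unexpected result %q", body.Result)
	}
	return strconv.ParseUint(body.Result[2:], 16, 64)
}

// isHealthy treats "responds quickly" as necessary but not sufficient:
// the node must also be within maxLag blocks of the best height seen
// elsewhere in the fleet. maxLag is an illustrative threshold.
func isHealthy(url string, referenceHead, maxLag uint64) bool {
	height, err := probe(url)
	if err != nil {
		return false // unreachable or malformed response
	}
	return referenceHead <= height+maxLag // not lagging too far behind
}

func main() {
	// nodeURL and referenceHead are placeholders; in practice the reference
	// comes from the highest height reported across your providers or peers.
	nodeURL := "https://rpc.example-node.internal"
	fmt.Println("healthy:", isHealthy(nodeURL, 21_000_000, 5))
}
```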

Assume front ends will fail and design around it

Even with the best architecture, some front ends will still sit behind Cloudflare or other CDNs. That is fine, as long as it is not the only door.

  • Document and support alternate paths to your protocol: direct RPC interaction, SDKs, or CLI tools for advanced users and partners.
  • Consider multi-front-end strategies that are hosted in different environments but talk to the same underlying infra.

If the node layer is resilient, you can afford to lose one UI temporarily without losing the protocol.

Making 2026 A Turning Point

2025 put centralized infra risk right in front of everyone.

AWS outages reminded the industry how much of crypto still sits on a few cloud regions. Cloudflare outages reminded us how many front ends and APIs have been routed through a single chokepoint. The incidents were disruptive, but they were also clarifying.

Ankr’s experience this year showed that another path is possible:

  • Bare-metal infrastructure instead of total dependence on rented cloud.
  • A private global fiber network instead of routing everything over the public internet and a single CDN.
  • A blockchain-native load balancer, rather than purely generic web infrastructure.

Those choices are why Ankr and the protocols that rely on us for node access stayed online while centralized components around the ecosystem kept blinking off and on.

The next AWS or Cloudflare outage is not hypothetical. It will happen. The open question is which projects will again be forced to tweet “funds are safe, but withdrawals are paused,” and which ones will quietly stay available.

If you want to be in the second group, the work starts at the infrastructure layer. That is the layer Ankr was built to harden.

Join the Conversation on Our Channels!

X | Telegram | Substack | Discord | YouTube | LinkedIn | Reddit | All Links