Studies of blockchain architectures often start with the consensus algorithms and implicitly assume that information flows perfectly through the underlying peer-to-peer network, and peer discovery is sound and fully decentralized. In practice this is not always the case. A few years ago, a team of researchers looked at the Bitcoin1 and Ethereum2 networks in two papers, and uncovered various sources of eclipse attacks. In an eclipse attack, the attacker targets a node and isolates it from the rest of the network’s honest nodes by monopolizing all of its connections, hence eclipsing the victim from the latest true state of the network. In this blog post, we will survey the overlooked design elements that open the network to eclipse attacks and discuss some defense-in-depth measures. NCC Group regularly assesses blockchain network implementations, and unfortunately continues to observe design patterns that can lead to eclipse attacks.
Eclipse attacks on peer-to-peer networks such as Distributed Hash Tables (DHT) have been studied since the mid-2000s, with updates to the routing table (equivalent of ADDR messages3 in Bitcoin) weaponized to poison honest nodes’ peer tables. The trustless, open, and minimally structured architecture of blockchain networks make them even more susceptible to cache-poisoning and eclipse attacks.
When eclipsed, a miner is isolated from the network and its work is no longer productive. This victim essentially works on top of an orphaned chain which has the following implications:
Splitting the mining power shrinks the total mining power, and therefore, the attacker can have a higher share of the mining rewards. This facilitates 51%-mining-power attacks as well as selfish-mining attacks4.
Competing views of the blockchain confuse merchants that query the victim miner to determine whether a transaction’s state is finalized, before they release the goods. This is akin to double-spending. Similarly, light clients that use the victim miner as an anchor will have a corrupted view of the chain which can be exploited in all sorts of ways.
Competing views of the blockchain make smart contracts unreliable since they treat the blockchain’s state as permanent storage.
While the impact of eclipse attacks on proof-of-work networks is well-studied, eclipsing a miner/validator in a proof-of-stake system has remained largely unexplored to date. This is in spite of the fact that it could result in financial loss for the victim and could slow down the consensus protocol.
It is worth emphasizing that the vulnerabilities described here have already been mitigated by the Bitcoin and Ethereum maintainers.
Mechanics of the Attack
The eclipse attacker aims to monopolize all of the victim node’s incoming, then eventually also outgoing connections. The attack’s steps will be different on various blockchains but they all have one element in common: they utilize the blockchain’s peer discovery algorithm to poison the victim’s peer table, wait for the victim node to restart (which any machine inevitably does), and flood it with attacker-controlled requests. Let’s first look at a vulnerable version of Bitcoin and then see how attacking Ethereum required even fewer resources.
In case of Bitcoin, each node has a limited set of up to 8 outgoing connections and up to 117 incoming connections5. Nodes maintain two tables, namely tried and new tables, to monitor the state of their past and present peers. The tried table is partitioned into buckets which each store up to 64 unique peer IP addresses (with incoming or outgoing connections) along with their groups which is defined as /16 IPv4 prefix for the peers’ IP addresses. The index of the bucket for a given peer’s address is determined as a function of its random node identifier, its IP address and port number, and its group. The new table is populated by addresses that are received from the DNS seeders or ADDR messages; and as such their connectivity status is unknown.
The goal of the attacker is to isolate the node by continuously introducing it to controlled malicious nodes. Replacing addresses in the victim’s tried table requires establishing unsolicited incoming connections to the victim from nodes with various IP addresses. Once the attacker connects to the victim from an adversarial address, it can send (unsolicited) ADDR messages with 1000 trash addresses. The victim will naively add these addresses to its new table without testing their liveness. This elaborate scheme is made easier by the fact that nodes contact their peers for network information infrequently, even when they are under attack by the adversary.
The attack will progress once the node restarts and uses the addresses in the tried table (which are persisted on disk) to establish outgoing connections. Nodes may restart for various reasons, such as DoS attacks and ISP outages, or due to planned software updates. Security of a decentralized peer-to-peer network should not depend on 100% node uptime. Once the attacker controls all the victim’s outgoing connections, it shifts to monopolizing its incoming connections to fully isolate it. At that point the rest of the network assumes the victim was offline, and after 30 days, its peers will mark it as “terrible” and will forget it.
The success of this attack depends on the time invested and the number of adversarial IP addresses in a range of groups. The attacker can control the bandwidth cost of the attack by refusing to respond to requests (e.g., inventory request) that will require sending large payloads. The paper Stubborn Mining: Generalizing Selfish Mining and Combining with an Eclipse Attack argues that a group of miners might be incentivized to collude and eclipse a more powerful miner.
Ethereum’s peer discovery is more involved than Bitcoin; by default it has 13 outgoing connections and peer-to-peer messages are authenticated. In an attempt to design for a future when sharding would be supported to help scale the network, and also to ensure uniform network connectivity, Ethereum is modeled after Kademlia, which was originally invented for distributed file sharing on BitTorrent. The upshot was that the peer tables were public so nodes could be discovered with a bounded number of hops (logarithmic in the size of the network) and a biased distance measuring algorithm was used to order peers based on their identifier distance to the current node. Additionally, Ethereum node identifiers are simply their ECDSA public keys, allowing multiple nodes to be run on the same machine. This significantly lowers the cost of the attack to only 2 machines with 2 distinct IP addresses. Since nodes favor peers with lower identifier distance to their identifier, peer discovery is biased. The attacker generates an ECDSA key pair, calculates its corresponding node identifier and its distance to the victim’s node identifier, and finally gauges the probability that the victim includes it in its (public) peer table. The attacker repeats this procedure locally until it obtains a list of node identifiers that are most likely to be added to the victim’s peer tables.
The Geth Ethereum clients (prior to version v1.8.0) ran an eviction process every hour to ensure their peers are online and responsive. When a peer failed to respond after a predetermined number of times, it was evicted. So the attacker has to keep its nodes up for an extended period of time, until the victim is fully eclipsed.
In addition to these network attacks, the researchers observed that the UDP connections’ validity checks were highly sensitive to message timestamps. Nodes would reject packets with timestamps that differed more than 20 seconds from their clock. In real world applications, it should be assumed that a (motivated) attacker is able to manipulate a machine’s clock locally. It was shown that by skewing the victim’s clock, it would gradually isolate itself from the rest of the network, and consequently, the network would forget the node over time. Using nonces to track request and response messages, and calculating time differences locally is superior to trying to tolerate network delays with arbitrary limits.
Another takeaway is that when borrowing and adopting an algorithm (in this case using Kademlia with the plan to support sharding in the future), one has to pay close attention to its side-effects and whether they outweigh its initial costs. The Ethereum foundation has recently removed sharding from its roadmap and has replaced it with “DankSharding”6.
The following recommendations are summarized from the two papers, cited in the reference section:
Favor peers with the longest history of successful connections rather than timestamp freshness.
Before evicting a seemingly older address from the table, attempt to connect to it and evict it only if the connection fails. This way the attacker will not be able to evict legitimate addresses. Success of this measure depends on the number of legitimate addresses before the attack begins.
Occasionally establish short-lived connections (aka feeler connections) with addresses in the new table and include them in the tried table if they are alive. This increases the chance of them being online when the node restarts. Evicting new addresses that are trash cleans up the new table.
Keep track of the timestamp of the first time a peer established connection and use two of the oldest peers as anchors when the node restarts (assuming they accept incoming connections).
Increase the size of tried and new tables, and periodically ensure that they are filled with legitimate and online addresses. For instance, by crawling the network to discover new peers.
Drop unsolicited (and large) ADDR messages from incoming connections to make filling the new table with trash addresses harder.
Ensure that the incoming connections are diverse, and a single or a few addresses cannot monopolize all the incoming connections.
Monitor the network and detect anomalies, e.g., a flurry of incoming large (mostly trash) ADDR messages or a drop in network’s mining power (i.e., increase in stale blocks). These can be used as a signal to act against a potential ongoing eclipse attack.
Make at least a minimum number of outgoing connections that did not originate from the unsolicited incoming connections. With this measure, an attacker would have a harder time monopolizing all of the victim’s connections.
Identify nodes by their public key in combination with another parameter, such as their IP address, to stop low-resource attackers who run multiple nodes on the same machine.
When there are no legitimate use cases for iterative lookups, do not publicize a node’s connection table. This would make it harder for an adversary to guess if a connection by their controlled node will be accepted by the victim.
Run seeding upon every reboot (even when the peer table is not empty) to discover fresh peers and avoid being isolated by an attacker. Try running the seeding procedure proactively to prevent the attacker from flooding the victim with incoming connections.
Since the adopted layer-one blockchain represents a base security layer for the add-on layers, it is imperative to design the underlying network protocols while having eclipse attacks in mind. This blog post aimed to bring more attention to this topic by summarizing some of the relevant research that has been conducted in the past decade. Following the above recommendations could greatly reduce the vulnerability of blockchains to eclipse attacks. However, It should be noted that these recommendations must be considered in combination with the given application’s threat model to avoid unintended consequences. Author refers the interested reader to the referenced papers for more details about the peer discovery algorithms and trust network formation in Bitcoin and Ethereum as the two dominant blockchains today.
The author would like to thank Gerald Doussot, Aleksandar Kircanski, Eli Sohl, and Paul Bottinelli of NCC Group’s Cryptography Services team for their review of this post. All mistakes remain with the author.
Low-Resource Eclipse Attacks on Ethereum’s Peer-to-Peer Network, by Yuval Marcus, Ethan Heilman, and Sharon Goldberg. ↩
A Bitcoin ADDR message can contain a maximum of 1000 addresses. See https://en.bitcoin.it/wiki/Protocol_documentation#addr for more details. ↩
A selfish miner (or a pool of miners) withhold mined blocks to gain advantage over the reset of the network in mining on the longest chain, see https://www.investopedia.com/terms/s/selfish-mining.asp for a definition. ↩
Bitcoin still uses the same number of inbound and outbound connections, see MAX_OUTBOUND_FULL_RELAY_CONNECTIONS (for 8 outbound connections) and DEFAULT_MAX_PEER_CONNECTIONS (for 125 total connections; thus 117 inbound connections) in the source code. ↩
From https://ethereum.org/en/roadmap/danksharding/: “Neither Danksharding nor Proto-Danksharding follow the traditional “sharding” model that aimed to split the blockchain into multiple parts. Shard chains are no longer part of the roadmap. Instead, Danksharding uses distributed data sampling across blobs to scale Ethereum. This is much simpler to implement. This model has sometimes been referred to as “data-sharding”.” ↩