Recently proponents of transaction "filtering" have started sybil attacking Libre Relay nodes by running nodes with their "garbageman" fork¹. This fork falsely advertise the NODE_LIBRE_RELAY service bit, silently discards transactions that would be relayed by real Libre Relay nodes, and does not provide any. Additionally, they have made clear that they intend to ramp up this sybil attack with the aim of preventing people people from getting transactions that they disagree with mined: The costs will increase even more once Libre Relay’s DoS attacks on bitcoin are countered by enough defensive nodes. -Chris Guida https://delvingbitcoin.org/t/addressing-community-concerns-and-objections-regarding-my-recent-proposal-to-relax-bitcoin-cores-standardness-limits-on-op-return-outputs/1697/4 They have also put effort into making the attack more than a simple proof of concept, e.g. by adding code that attempts to make it more difficult to detect attacking nodes, by keeping track of transactions received from peers, and then replying to inv messages with those transactions even when they were discarded². With this attack in mind, I thought this would be a good opportunity to review the math on how effective this type of attack is, as well as some of the mitigations that could be implement to defeat sybil attacks on transaction relaying. In particular, I'll present a defense to sybil attacks that is sufficiently powerful that it may even negate the need for preferential peering techniques like the NODE_LIBRE_RELAY bit. Note that I don't deserve credit for any of these ideas. I'm just putting down in writing some ideas from Gregory Maxwell and others. # The Effectiveness of Sybil Attacks on Transaction Relaying Non-listening nodes make a certain number of outgoing, transaction relaying, connections to listening nodes. In the case of Bitcoin Core, 8 outgoing transaction relaying nodes; in the case of Libre Relay, an additional 4 outgoing connections to other Libre Relay nodes to relay transactions relevant to them. For a sybil attack to succeed against a non-listing node, every one of the N outgoing connections must be either a sybil attacking node, or a listening node that itself has been defeated by sybil attack. Additionally, Bitcoin Core makes outgoing IPv4 and IPv6 connections to a diversity of address space, so the sybil attacking nodes need to themselves be running on a diverse set of IP addresses (this is not that difficult to achieve with VPS providers these days). Thus if the sybil attacking nodes are a ratio of q to all nodes, the probability of the attack succeeding is q^N. Against Libre Relay, N=4, this means that the attacker needs to be running ~84% of all NODE_LIBRE_RELAY advertising nodes to have an attack success probability of ~50%. Based on information from my Bitcoin seed node, there appear to be about 15 Libre Relay nodes, so for a 50% attack success probability the attackers would need to run about 85 attack nodes. If N was increased to 8, the attackers would need about 172 nodes to achieve the same success rate. Against *listening* nodes a different type of attack is necessary. The reason for this is that defenders can easily defeat sybil attacks against listening nodes by simply connecting to ~all listening nodes at once to ensure that transaction propagation succeeds. Of course, the attacker can in turn do things like attempt to exhaust connection slots of Libre Relay nodes, or simply DoS attack them with packet floods. But those are different types of attack than the sybil attack we are discussing here. # Prior Art: Defeating Block Propagation Sybil Attack Bitcoin Core already includes a defense against sybil attack for block propagation: the feeler node system. Basically, every ~2 minutes an outgoing connection is made to a gossiped address to check if a connection can be made; successful connections are recorded in a table of "tried" addresses. If no new blocks have been received for 30 minutes, these tried addresses are then used every 10 minutes to try to find a peer that does know about a new block. Since this process goes on indefinitely, so long as outgoing connections are themselves not censored (e.g. by the ISP), the node should eventually find a non-sybil attacking node and learn about the true most-work chain. Even in normal operation periods of >30minutes between blocks are fairly common, so this defense will (eventually) work even if a forked chain exists with some hash power extending it. This approach is relatively straightforward for block propagation, as there is a clear metric: the most-work chain. Peers that aren't giving you the most-work chain can be ignored, and new peers found. Proof-of-work's inherently self-validating property means that doing this is cheap and straight forward. # Directionality A subtlety to the information censorship sybil attack is there are actually two different simultaneous attacks: the attack on preventing you from learning about new information, and the attack on preventing you from distribute new information to others. With block propagation, most nodes most directly care about the first class of attack: they want to learn about the most-work chain, and do not want that information censored from them. For miners, in addition to knowing what the most-work chain is, they (typically³) have a strong incentive to get their new blocks to all nodes as quickly as possible. Also, all nodes have at least some incentive to do this as Bitcoin will not function properly if miners are getting censored. These attacks are not the same! The most-work-chain metric is only directly detecting and preventing the first class of attack. It only prevents the second attack indirectly, by making it easier for honest nodes to learn about new blocks and attempt to themselves propagate that information further. # Most Fees Metric For transaction relaying, the moral equivalent to the most-work chain metric are metrics based on the amount of new transaction fees that peers are advertising to you. Unfortunately this isn't as straightforward to implement as the most-work chain metric for a few reasons: 1) Resolution: differences in chain work are very clear, with even a single additional block being a very significant difference. For transaction relaying, we'd like to be able to successfully relay transaction types that only add a small % to total fees. 2) Bandwidth: a chain of 80 byte headers is sufficient to prove most-work; transactions are much larger. 3) Double-spends: mempools are not a consensus. Your peers may have transactions that conflict with your transactions, yet in ways that don't constitute a worthwhile RBF replacement (e.g. two different transactions with the same fees and fee-rate). For example, one straight-forward approach would be to simply keep track of a decaying average of new fees/sec each peer had advertised to you prior to you advertising the transaction to them. Periodically, you could drop the peer with the lowest new fees/sec ranking, and then connect to a new peer. However, it's not clear that this approach has sufficient resolution to actually detect censorship of relatively uncommon transaction types. Additionally, since transaction broadcasting is a one-shot event - we don't have a mempool synchronization mechanism - this approach may not work well if transaction demand is bursty. # Most-Fees Next (Dobule) Block Mempool With the upcoming cluster mempool functionality that is expected to be added to Core in the near future, transactions will be stored in memory in clusters ordered by fees: essentially the order in which optimal blocks would be created. This will make it computationally cheap to determine what the optimal next block (or blocks) will be by simply iterating through transactions in order, and stopping when N weight worth of transactions have been found. Thus nodes can cheaply compute the total fees in the top one or two blocks worth of transactions they currently have in their mempool, and advertise this fact to their peers. Finally, to prevent lying, we can add a mechanism for a peer to get a copy of all these transactions to ensure that they're not missing out on anything paying enough fees to get mined soon. While beyond the scope of this summary, there are many set-reconciliation techniques available to do this in a bandwidth efficient manner. Basically, through the existing transaction relay mechanisms we can expect mempools to be relatively consistent between nodes. Thus, to get all transactions that your peer has for the next block or two that you do not, you just need to transfer the deltas between their next-block(s) mempool and yours. Concretely, suppose we do this with the next two blocks worth of transactions. At worst, each node would need to periodically create a maximum 8MB serialized "double-block", using up to 8MB of ram. Secondly, to apply this to all outgoing connections, you'd need to periodically use a set-reconciliation protocol to download the differences between each of your outgoing peers' double-blocks, and attempt to add any newly discovered transactions to your mempool. At worst for 8 peers this would be 64MB of useless data to download, assuming every single transaction was a conflicting double-spend. Not great. But not that bad. As with the average fees idea, periodically you would drop the peer advertising the lowest double-block of fees, and then connect to a new peer to see if they're better. Now consider what happens if you are sybil attacked. Due to RBF, with synchronous mempools across different nodes with the same standardness policies will have very similar transaction sets; even without active synchronization long-running mempools across different nodes are already very similar in terms of total fees. Thus even a small difference in transaction relay policy will show up as missing transactions. This difference will translate into the sybil attacking node(s) getting dropped, and honest nodes with policy compatible with yours eventually being found. ## Peers With More Liberal Relay Policy If you apply set reconciliation to a peer with a *more* liberal relay policy than you, they'll have transactions that you will not accept. For example, imagine the case of a peer that now accepts a new version number. One way to deal with this could be to just drop peers that give you transactions that you consider non-standard. So long as reconciliation is only applied to a subset of all transaction relaying peers, this is fine. Indeed, even if this is applied to all transaction relaying peers, Bitcoin Core already connects to additional peers in blocks-only mode. So you'll still get send and receive blocks and maintain consensus. ## Privacy Tracking what transactions are in mempools is a potential way for attackers to trace transactions back to their origin. Provided that set-reconciliation is only a secondary transaction relay mechanism, with sufficient time delays, this should not impact privacy as under normal operation transactions will have already propagated widely making the set reconciliation data non-sensitive. # Manual Peering With Known-Honest Friendly Nodes More of a social solution than a technical solution, we should encourage people to manually peer with other nodes they have a personal relationship with. This is a powerful technique against sybil attacks for the simple reason that person-to-person relationships can evaluate honesty in much more powerful ways than any code could possibly do so. At the moment, actually doing this is inconvenient. Ideally we would have a mechanism where node operators could get a simple pubkey@address connection string from their node to tell to their friends, and equally, import that same connection string into their bitcoin.conf. This mechanism should use some kind of node identity to defeat MITM attacks, and also ensure that connection limits are bypassed for friendly nodes. The existing addnode mechanism doesn't quite achieve this. Notably, without a node identity mechanism, there's no way for someone with a static IP address to whitelist a friend's node with a non-static IP address. # Footnotes 1) Chris Guida's "garbageman" branch: https://github.com/chrisguida/bitcoin/tree/garbageman, first presented at the btc++ mempool edition (2025) hackathon 2) https://github.com/chrisguida/bitcoin/commit/e9a921c045d64828a5f0de58d8f2706848c48fd2?s=09 3) https://petertodd.org/2016/block-publication-incentives-for-miners -- https://petertodd.org 'peter'[:-1]@petertodd.org -- You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group. To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups.com. To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/aDWfDI03I-Rakopb%40petertodd.org.