public inbox for bitcoindev@googlegroups.com
From: Yuval Kogman <nothingmuch@woobling.org>
Cc: Bitcoin Development Mailing List <bitcoindev@googlegroups.com>
Subject: Re: [bitcoindev] Reiterating centralized coinjoin (Wasabi & Samourai) deanonymization attacks
Date: Tue, 4 Feb 2025 15:02:01 +0100	[thread overview]
Message-ID: <CAAQdECC0FG7xhxygAPEL0XD4umU+zH84rK-P-UDKMLaZhr1HBw@mail.gmail.com> (raw)
In-Reply-To: <Z5JtilN2k7HwRRXt@petertodd.org>

The subject of this message is the possibility of active tagging
deanonymization attacks by a malicious coordinator, so I will address
the claims pertaining to that first, and the remaining claims in order
of relevance.

There's also an important correction, due to Sjors: I overlooked a
relevant detail of BIP 341 and misremembered the related behavior when
I did consider it. The implication is that transactions spending at
least one P2TR input partially mitigate the simplest form of this
attack, but as detailed below this is not a sufficient mitigation.

A brief summary:

1. Your proposed mitigation for the tagging attack is inadequate and
is redundant when P2TR inputs are spent.
2. Your unsubstantiated minimization of issues relating to input-input
link disclosure, in failed rounds and otherwise, can be dismissed by 3
separate lines of argument. Such leaks are *very* relevant for this
attack.
3. Your account of sybil resistance ignores important nuances, but at
any rate that isn't really relevant to the subject.
4. You neglect to disclose a conflict of interest that is relevant
context for (2).

Note:

At the request of the moderators I have edited out some of the additional
context I initially provided, as it is not directly related to the technical
matter at hand. While I understand why this forum is perhaps not the
appropriate one to bring that up, I stand by what I said and personally still
feel that *is* relevant, due to misinformation about the services and software
offered by for profit privacy vendors. If anyone is interested in the full
message, I have posted it publicly. Feel free to reply off list if you'd like
to read it and are having trouble finding it.


On Thu, 23 Jan 2025 at 17:25, Peter Todd <pete@petertodd.org> wrote:

# Tagging Attack by Active Adversary

> As you mention, here we have a form of MITM attack, leveraged to perform
> a sybil attack

While it's nice to see you acknowledging that the attack is indeed
possible, i.e. that the protocol is not trustless, your account of how
it works is somewhat misleading. The idiosyncratic use of "sybil" is
pretty common, but in your framing you go a bit further than using it
as a convenient shorthand, confusing different attack scenarios. This
is a distraction. While sybil attacks are relevant to deanonymization,
they are qualitatively different, and using "sybil attack" to refer to
deanonymization attacks more generally obscures the concern raised
here.

So to be clear, this is not a sybil attack, where by definition the
attacker controls many apparent users. This attack is performed by a
malicious coordinator, and does not require it to control any of the
coins used as inputs. n-1 deanonymization attacks can be executed by
means of a sybil attack, so in this context "sybil attack" often
refers to n-1 deanonymization attacks, but sybils are not at all the
mechanism involved in this attack on anonymity.

## Attack Overview and Adversary Capabilities

Here's a more detailed and accurate description of how the attack is
performed, which at Sjors' request includes more details and
preliminaries about the transport layer. Note that not all steps need
to be executed for this to be an attack, see the details below for the
choices available.

1. n honest (i.e. not sybil) clients query round information from a
malicious coordinator. The coordinator responds to each request with a
unique round ID.
2. Each client then registers its inputs using isolated tor circuits,
but with respect to the maliciously chosen round ID, which links all
of these requests to the initial status request.
3. The coordinator terminates the connection confirmation phase for
each round, and misleads clients into thinking the n-1 other clients
are also participating in the same round.
4. Clients register their desired outputs, and again clients are
partitioned with each output registration uniquely identifying which
client it belongs to using the per client round ID.
5. The coordinator terminates the output registration phase, providing
clients with the unsigned transaction, and they submit their
signatures.
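The linking mechanism in steps 1-2 can be illustrated with a minimal toy
model (Python; all names are hypothetical and do not reflect the actual
WabiSabi API). The point is that the per-client round ID alone links
registrations back to the initial status request, regardless of circuit
isolation:

```python
import os

class MaliciousCoordinator:
    """Toy sketch: every status request gets a unique round ID, which
    later links registrations to the client that received it."""
    def __init__(self):
        self.round_id_to_client = {}

    def get_status(self, client_session):
        # Equivocate: hand each client its own round ID.
        round_id = os.urandom(32).hex()
        self.round_id_to_client[round_id] = client_session
        return {"round_id": round_id}

    def register_input(self, round_id, outpoint):
        # The round ID deanonymizes the registration even though it
        # arrives over a freshly isolated Tor circuit.
        return self.round_id_to_client[round_id]

coord = MaliciousCoordinator()
status_a = coord.get_status("client-A")
status_b = coord.get_status("client-B")
# Registrations over distinct circuits are still linkable:
assert coord.register_input(status_a["round_id"], "txid:0") == "client-A"
assert coord.register_input(status_b["round_id"], "txid:1") == "client-B"
```

Note that no coin ownership by the attacker is involved anywhere: the
linkage is purely a protocol-state equivocation.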

## Tor Circuits & Connections

As background, it's important to distinguish between circuits and the
connections made over them. Circuits are multi-hop, onion-routed,
end-to-end encrypted channels between a client node and a Tor relay,
using the Tor relay protocol. The client communicates directly (TCP +
TLS) with its guard node (1st hop), which is directly connected to the
2nd relay to which the client's onion-encrypted messages are
forwarded; the 2nd relay is similarly connected to the 3rd relay,
which serves as the exit node. There have been roughly 1500-2000 exit
nodes in operation since Wasabi 2 was released
(https://metrics.torproject.org/relayflags.html). Hidden service
connections are no longer relevant for Wasabi (or its derivatives,
e.g. gingerwallet and, to a lesser extent, the btcpay plugin) so they
will not be discussed here, but they were relevant at the time of
launch.

Once a circuit is established to an exit node, the client can ask it
to make connections - full duplex streams analogous to TCP connections
- to the coordinator over clearnet. These connections (called
"streams" in the relay protocol documentation
https://spec.torproject.org/tor-spec/) are multiplexed over a circuit,
and circuits are multiplexed over the underlying TCP connections to
and between relays, but on the clearnet side they map 1:1 to TCP
connections made by the exit node on behalf of the client. TLS is
employed, so the exit node can't MITM traffic; however, Cloudflare
with SSL termination was used for the zksnacks coordinator and is
still used for kruw's coordinator, so treating "the coordinator" as
adversarial must include Cloudflare: it effectively MITMs all
connections and sees all unencrypted traffic.

Wasabi uses SOCKS to connect to the local Tor daemon. The Tor daemon
treats any SOCKS authentication credentials (which are always
accepted) as isolation IDs: connections associated with differing IDs
may never share a circuit. The same ID may end up being mapped to
multiple circuits as necessary, since circuits are a dynamic resource.
Wasabi then uses HTTP to make requests over these connections.
Previously a bespoke HTTP 1.1 client was used, with keep-alive
explicitly requested; now the standard .NET HTTP client is used, with
HTTP 1.1 still preferred, and IIUC keep-alive is also used by default
with this implementation.
https://github.com/WalletWasabi/WalletWasabi/blob/d8d792d339d3e467ea36eedd45f392de5ea716df/WalletWasabi/WebClients/Wasabi/WasabiHttpClientFactory.cs#L130
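The isolation semantics can be modeled in a few lines (a simplified
sketch of Tor's SOCKS-auth stream isolation, not the daemon's actual
implementation): distinct credentials never share a circuit, while the
same credentials may reuse one.

```python
import itertools

class TorDaemonModel:
    """Simplified model of SOCKS-auth stream isolation."""
    def __init__(self):
        self._circuits = {}               # isolation ID -> circuit number
        self._next = itertools.count(1)

    def connect(self, isolation_id, host):
        # Reuse an existing circuit for a known ID, else build a new one.
        return self._circuits.setdefault(isolation_id, next(self._next))

tor = TorDaemonModel()
c1 = tor.connect("input-reg-1", "coordinator.example")
c2 = tor.connect("input-reg-2", "coordinator.example")
c3 = tor.connect("input-reg-1", "coordinator.example")
assert c1 != c2   # distinct IDs are never multiplexed together
assert c1 == c3   # the same ID may share a circuit
```

In reality one ID can also map to several circuits over time; the model
only captures the separation property relevant here.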

Initiating multiple SOCKS connections with distinct isolation IDs in
quick succession results in Tor building circuits sequentially, not
concurrently. Any failures in allocating circuits or opening
connections can appear as increased latency from the point of view of
the SOCKS interface, or as a dropped connection, depending on whether
the failure occurs before or after stream data was sent to the
intended endpoint. The time to first byte can be on the order of tens
of seconds during bursty activity, even on success.

Wasabi uses distinct isolation IDs for input registration, reissuance
and output registration requests, and a long lived one for status
polling. Wasabi also randomizes the timing of requests, but
inadequately.

## Step 1

The `GetStatus` API is repeatedly polled to obtain data about
in-progress rounds, including `MultipartyTransactionState`,
essentially an event log. This includes registered inputs (with their
ownership proofs and previous transaction outputs) and registered
outputs. When I refer to the "transcript" I mean this ordered log of
events, including hashes of all credential requests associated with
the events.

These requests use the same isolation ID throughout
https://github.com/WalletWasabi/WalletWasabi/blob/b49e69d48e6f599235cc3c518c2cf8e3e9206571/WalletWasabi/WabiSabi/Client/WabiSabiHttpApiClient.cs#L16

For each round the client knows about, it asks the server to respond
only with events whose index is greater than a given checkpoint. When
input registration events are observed by the client, it verifies the
ownership proof against the scriptPubKey also included in the event
data, and ensures that the ownership proof commitment data includes
the expected round ID.
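The problem with this check can be sketched as follows (illustrative
names and commitment format, not the actual BIP 322 construction): the
client verifies the commitment against the round ID *it was given*, so
a per-client round ID passes verification by construction.

```python
import hashlib

def ownership_commitment(script_pubkey, round_id):
    # Stand-in for the real ownership proof commitment data.
    return hashlib.sha256(f"{script_pubkey}|{round_id}".encode()).hexdigest()

def verify_event(event, expected_round_id):
    # Client-side check: commitment must include the expected round ID.
    return event["commitment"] == ownership_commitment(
        event["script_pubkey"], expected_round_id)

event = {"script_pubkey": "spk-1",
         "commitment": ownership_commitment("spk-1", "round-A")}
# The check passes for whichever round ID the coordinator fed the
# client, so on its own it cannot detect per-client round IDs:
assert verify_event(event, "round-A")
assert not verify_event(event, "round-B")
```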

The index (which is under the coordinator's control, given that it
chooses how many events to return in reply to each request) and the
connection & request patterns all make it relatively straightforward
for the coordinator to maintain different views for different clients.
Furthermore, the client respects the precedence order of rounds, so
divulging other clients' round IDs as necessary is not an issue for a
coordinator attempting to cause a client to register inputs for a
specific round ID.

An initial request from a client that has no checkpoints and comes
from an isolated tor circuit at a random time can be anonymous, as it
does not reveal any information linking it to the client (apart from
potential HTTP-related client version fingerprints, which I will
ignore henceforth).

## Step 2

During this step the full mapping of input-input links is revealed,
because all input registration requests from a particular client will
be sent with respect to the round ID that uniquely identifies that
client. As of today, the client software does not mitigate or even
detect this at any step. This is a serious concern, see the section on
input-input linking below for discussion of why. In this section I
will focus on the how.

Since round initiation time is not known, clients repeatedly confirm
their input registration, until a final confirmation request that
issues credentials.

A malicious coordinator can affect coin selection criteria
(constraints and suggestions). In particular, by flooding more than
one round per client and offering relatively disjoint valid
selections, a coordinator can elicit information about the preferences
of the client on top of common ownership clusters, which can aid it in
optimizing for higher rates of wallet clustering through repeated
disclosure attacks (e.g. by aborting rounds). The coin selection code
has no memory of the adversarial inputs or its own selections.
Deliberate de-randomization with ad-hoc rules makes this algorithm
more susceptible to harmful choices; for example, the bias towards
larger valued coins interacts with the maximum allowed and suggested
values.

- https://github.com/WalletWasabi/WalletWasabi/blob/d8d792d339d3e467ea36eedd45f392de5ea716df/WalletWasabi/WabiSabi/Client/CoinJoin/Client/CoinJoinClient.cs#L179-L181
- https://github.com/WalletWasabi/WalletWasabi/blob/d8d792d339d3e467ea36eedd45f392de5ea716df/WalletWasabi/WabiSabi/Client/CoinJoin/Client/CoinJoinCoinSelector.cs#L50-L303

Unlike the common input ownership heuristic as applied to confirmed
transactions, this can be iterated by the attacker. The iterative
nature and cluster revelation bear some resemblance to the privacy
leaks in BIP 37 filters discovered by Jonas Nick
(https://www.research-collection.ethz.ch/bitstream/handle/20.500.11850/155286/eth-48205-01.pdf),
as well as to more recent work which investigates clustering using
unconfirmed transactions (https://arxiv.org/pdf/2303.01012).
Qualitatively it seems to me that this is more harmful, given the
nature of the coins that are meant to be selected into privacy
enhancing transactions. The client's leniency makes focusing only on
input clustering a costless and practically covert attack, even though
technically it's an active one, so the client must trust the
coordinator not to do it.

## Step 3

Only at this step are the input registration events published by the
server, but note that this apparent phase change can occur at
different times for each adversarially constructed round. Although the
initial input set is finalized at once, it is represented as many
discrete events in the log, which, if clients share the same round ID
(see step 5 for why), affords the coordinator a tagging vector
(manipulating the checkpoint index). In addition to there being
sufficiently many indexes to identify each input uniquely, the order
of these events is not constrained, since nowhere is the transcript
(see above) actually committed to.
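The checkpoint-index vector can be sketched like this (hypothetical
names, toy event log): even with a single shared round ID, the
coordinator can reveal a different-length prefix of the log to each
client, and the checkpoint each client echoes back in its next poll
then identifies it.

```python
events = [f"input-{i}" for i in range(10)]

class Coordinator:
    """Toy sketch of tagging via per-client event-log prefixes."""
    def __init__(self):
        self.cut_for_client = {}

    def get_events(self, client, checkpoint):
        # On first contact, pick a unique prefix length for this client.
        if client not in self.cut_for_client:
            self.cut_for_client[client] = 3 + len(self.cut_for_client)
        return events[checkpoint:self.cut_for_client[client]]

    def identify(self, checkpoint):
        # The echoed checkpoint is unique per client.
        return {v: k for k, v in self.cut_for_client.items()}[checkpoint]

coord = Coordinator()
seen_a = coord.get_events("A", 0)   # 3 events revealed
seen_b = coord.get_events("B", 0)   # 4 events revealed
assert coord.identify(len(seen_a)) == "A"
assert coord.identify(len(seen_b)) == "B"
```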

After this no additional inputs are allowed (incidentally, this is a
perverse incentive to de-randomize input registration timing). From an
adversarial point of view this means that, given a set of inputs whose
real ownership proofs commit to distinct round IDs, it can choose for
each input cluster which subset of other forged inputs it can see, and
which input (clusters) to reject, for example by aborting with "not
enough participants" or "load balancing" conditions, which are
considered normal (but are at least logged).

An adversarially chosen input set can contain entirely made-up coins
with arbitrary values, or real coins controlled by the adversary if
the coordinator is also mounting a sybil attack, chosen in order to
influence output registration decisions. Nothing prevents the same set
of outpoints from being reused unless the coordinator also wishes to
obtain output registrations or signatures.

## Step 4

Here the client reveals its intended output registrations, linking
them to the already linked inputs (i.e. the full sub-transaction of
that client). Inputs signal their readiness to sign. When all inputs
have signalled, the output registration phase terminates. Here too
there is room for partitioning clients in a more covert variation,
since all an individual client knows is how many of its inputs have
signalled readiness to sign. This information is not included in the
`MultipartyTransactionState` event log, but should be part of the
transcript.

## Step 5

At this stage the client calculates the final transaction by sorting
the inputs and outputs from the event log, first by amount then
outpoint or scriptPubKey.
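A sketch of that deterministic ordering (illustrative field names; the
sort direction is an assumption, the point being only that the order is
a pure function of amounts plus outpoints/scriptPubKeys from the event
log):

```python
# Toy inputs; equal amounts are tie-broken by outpoint, so the final
# transaction is fully determined by the event log contents.
inputs = [
    {"amount": 50_000,  "outpoint": "aa" * 32 + ":1"},
    {"amount": 50_000,  "outpoint": "0b" * 32 + ":0"},
    {"amount": 150_000, "outpoint": "cc" * 32 + ":0"},
]
ordered = sorted(inputs, key=lambda i: (i["amount"], i["outpoint"]))
assert [i["amount"] for i in ordered] == [50_000, 50_000, 150_000]
assert ordered[0]["outpoint"].startswith("0b")   # tie broken by outpoint
```

Because the sort keys are outpoint/script rather than a
transcript-derived value, any event log with the same inputs and
outputs yields the same transaction, which is what makes mitigation (3)
discussed below a breaking change.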

The coordinator can freely ignore/censor certain signatures (valid or
invalid ones), and trigger a blame round. In blame rounds the allowed
outpoints must be a subset of the previous round's outpoints. By
equivocating the `BlameOf` field of the round parameters in
conjunction with the other tagging vectors (i.e. non-unique round
IDs), clients can be misled into believing the same blame round ID is
associated with more than one previous round ID, because this field is
not hashed when deriving round IDs:
https://github.com/WalletWasabi/WalletWasabi/blob/d8d792d339d3e467ea36eedd45f392de5ea716df/WalletWasabi/WabiSabi/Crypto/RoundHasher.cs#L13-L32
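The flaw is easy to state as a sketch (simplified parameters, not the
actual RoundHasher fields): if `BlameOf` is excluded from the hash, two
blame rounds differing only in their claimed predecessor collide on the
same round ID, which is exactly what enables the equivocation.

```python
import hashlib

def round_id(params, include_blame_of):
    # Hash round parameters in a canonical order, optionally skipping
    # the blame_of field (as the linked RoundHasher effectively does).
    fields = {k: v for k, v in sorted(params.items())
              if include_blame_of or k != "blame_of"}
    return hashlib.sha256(repr(fields).encode()).hexdigest()

params_a = {"max_inputs": 100, "network": "main", "blame_of": "round-1"}
params_b = {"max_inputs": 100, "network": "main", "blame_of": "round-2"}
# With blame_of omitted, the IDs collide across distinct predecessors:
assert round_id(params_a, False) == round_id(params_b, False)
# Hashing blame_of would bind a blame round to exactly one predecessor:
assert round_id(params_a, True) != round_id(params_b, True)
```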

P2WPKH inputs sign according to BIP 143, where each signature commits
to the list of outpoints. P2TR inputs sign according to BIP 341, which
also commits to the scriptPubKeys and amounts, as taken from the event
log. This means that taproot inputs will not produce valid signatures
if they observe equivocated ownership proofs. Thanks to Sjors for this
correction; I misremembered this as only committing to the values of
the coins. No such restriction exists for P2WPKH inputs. This
restricts a malicious coordinator's ability to tag using equivocated
round IDs when P2TR inputs are spent, but it precludes neither tagging
those outputs by other means (i.e. checkpoint leak, differential
timing as discussed in more detail here
https://github.com/WalletWasabi/WabiSabi/issues/83, and exploiting
soft aborts to allow unbounded iteration of these attacks) nor the
interaction with the `BlameOf` field in the round ID derivation; see
below for details.
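The difference in what the two sighash schemes commit to can be
sketched abstractly (string stand-ins, not the actual BIP 143/341
serializations): BIP 143 commits to the outpoints, while BIP 341
additionally commits to all spent outputs' amounts and scriptPubKeys,
so equivocating another input's prevout data invalidates a taproot
signature but not a P2WPKH one.

```python
import hashlib

def h(s):
    return hashlib.sha256(s.encode()).hexdigest()

prevouts      = ["txid-a:0", "txid-b:1"]
spent_amounts = [50_000, 150_000]
spent_scripts = ["spk-a", "spk-b"]

# Abstract sighash messages (real serialization differs):
bip143_msg = h("|".join(prevouts))
bip341_msg = h("|".join(prevouts + [str(a) for a in spent_amounts] + spent_scripts))

# Forging another input's amount goes unnoticed under BIP 143...
assert h("|".join(prevouts)) == bip143_msg
# ...but changes the BIP 341 message, invalidating the signature:
forged = h("|".join(prevouts + ["50000", "999999"] + spent_scripts))
assert forged != bip341_msg
```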

In other words, even though taproot signing prevents equivocated
ownership proofs from being used to successfully tag a fully signed
transaction containing at least one P2TR input, it's possible to use
tagging to reveal the input clusters, then restrict subsequent blame
rounds to only the already tagged inputs, and permit only those
transactions to succeed which the adversary is confident sufficiently
link the outputs to the input clusters.

## Issues with Peter's Mitigation

> Since Wasabi already has coin ownership proofs and distributes them, I
> believe we can easily validate the round ID consistency of a coinjoin
> transaction after it's fully signed and confirmed by simply validating
> that the signatures on the inputs were in fact signed by the pubkeys of
> the corresponding coin ownership and round ID proof.

This mitigation, which is redundant with BIP 341 signing, is strictly
inferior to the 4 possible mitigations already mentioned in this
thread, none of which were implemented:

1. redundant queries of round information over isolated tor circuits
2. best effort validation of ownership proofs under SPV assumptions
3. deterministic shuffling of transaction data seeded by transcript hash
4. have the coordinator publicly sign the transcript before signatures
are submitted
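For concreteness, mitigation (3) could look something like the
following sketch (my own illustration of the idea, not a proposed
implementation; the hash-based ordering is an assumption). Seeding a
deterministic permutation of the transaction with the transcript hash
means any client whose view was equivocated computes a different
ordering, so the signatures fail to assemble into one transaction:

```python
import hashlib

def transcript_shuffle(items, transcript):
    # Deterministic permutation keyed by the transcript hash: every
    # honest client with the same view computes the same order.
    seed = hashlib.sha256(transcript.encode()).hexdigest()
    return sorted(items,
                  key=lambda x: hashlib.sha256((seed + x).encode()).hexdigest())

inputs = ["in-a", "in-b", "in-c", "in-d"]
t = "status|input-reg|output-reg"
# Same transcript view -> same ordering; it is still a permutation.
assert transcript_shuffle(inputs, t) == transcript_shuffle(inputs, t)
assert sorted(transcript_shuffle(inputs, t)) == sorted(inputs)
# A client fed an equivocated transcript would, with overwhelming
# probability, derive a different order and hence a different
# transaction, making its signature useless to the coordinator.
```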

Unlike (1) and (2) this check is not done before proceeding with the
protocol, and unlike (3) and (4) it is not done before aggregating
signatures, and so does not prevent going through with a maliciously
constructed transaction. The victim pays for the attack (mining fees
and optionally contributing to coordinator revenue), and only
discovers it occurred after the fact. See below for some additional
consequences of this flaw.

Unlike (3) and (4), which can confirm the full transcript, this only
checks consistency of the round ID. That makes sense for (1) and (2),
but not for a check done after the fact.

Unlike this proposal, (4) involves a public signature, which is
non-repudiable, making it possible to prove that a malicious
coordinator equivocated. With the aid of a 3rd party, something
analogous to (1) can be used to prove equivocation, assuming the 3rd
party is trusted by the verifier of that claim (something like TLS
notary). A variant of this was considered during the design, but
without taking into account waxwing/AdamISZ's insight about the
non-repudiation property.

Finally, as mentioned above, for P2TR inputs this mitigation is
already redundant with the hashing specified in BIP 341. That has an
effect similar to (3) but unfortunately does *not* commit to the full
transcript, only to the set of ownership proofs. Like mitigation (3)
this happens too late in the protocol, and without additionally
implementing mitigations (1) and/or (2), is insufficient.

(1) and (2) are not protocol breaking changes. I previously
erroneously described a weak form of (3) as possible, because I
misremembered this as only performing a stable sort by amount, but
that is not the case as they are explicitly sorted by outpoint /
script, so unfortunately that makes mitigation (3) a protocol breaking
change.

> The only question left for this technique is a cryptography one:

No. The question, and far from the only one, is why were none of the
stronger and fully described (before mainnet release) mitigations
implemented at launch or at any subsequent point, despite repeated
claims of the protocol being trustless?

- https://web.archive.org/web/20240521010051/https://blog.wasabiwallet.io/wasabi2-0-released/#:~:text=that%20implements%20trustless%20WabiSabi%20coinjoin%20over%20the%20Tor%20anonymity%20network.
- https://archive.is/dtPd9
- https://archive.is/Vi46a
- https://archive.is/vexOP
- https://archive.is/l5Qko (two lies in one tweet, cloudflare was a
trusted 3rd party and still is with kruw's coordinator)
- https://bitcointalk.org/index.php?topic=5482818.msg64055886#msg64055886

Why were the concessions with regards to section 7.2.2 of the paper
not justified? Why did two of the named authors not disown or retract
the paper if they don't stand by its contents? Especially as they were
representatives of a for profit coordinator?

https://eprint.iacr.org/2021/206.pdf

At least I agree with the framing of this as an essentially solved
problem, albeit with respect to the proper mitigations, which makes
their omission from the implementation even more baffling.

> Is it possible to create an alternate pubkey p', that such that a valid
> signature s signed by arbitrary pubkey p for message m, also validates
> for p' for signature s and message m?

Malleability of the signature or public key under weak Fiat-Shamir
(i.e. ECDSA) is irrelevant; as waxwing points out in his reply, the
commitment data is strongly bound by the challenge hash. The public
keys bind that commitment to an unforgeably costly resource (UTXOs),
and for all relevant output types this key is still strongly bound by
the scriptPubKey (P2{W,}PKH) and so also made available in the
ownership proof data (since BIP 322 proofs have the same format as
transaction signatures) and in the final signature data.

Waiting for confirmation merely proves that either this was consensus
valid, or that the attacker was willing to expend PoW on it. Even that
is irrelevant, since Wasabi doesn't actually validate difficulty /
chain work unless it is configured to check against a trusted node, in
which case mitigation (2) would provide complete certainty.

# Input-Input Linking, Consequences for On Chain Privacy

> Clients do *not* need to validate coins in coinjoin rounds. Whether or
> not they're real is irrelevant, as the Bitcoin consensus itself
> validates this for you.

This is only true if one ignores *all* aspects of the protocol apart
from liveness, which the coordinator is trusted with anyway (that's
OK, unlike trusting it with respect to privacy; this is the
"semi-honest" threat model), and security against theft, which,
excluding the concern for stateless signers (a stateless signer would
not recognize inputs with faked ownership proofs as its own), is at
least verified by clients. See below on the topic of light clients.

However, this does not follow with regards to privacy. You write:

> Secondly, there's a potential class of attacks via failed rounds;
> attacks via failed rounds are potentially cheap/free attacks, as no
> transaction fees are actually spent.
...
> A Wabisabi coordinator gets desired
> inputs and outputs from clients, which would allow them to learn
> something about which inputs are linked to each other, and which outputs
> were desired by the owner of those inputs, if the coordinator
> succesfully "sybil" attacks the client.
>
> This class of attack might be interesting if Wasabi reused outputs after
> rounds failed, in subsequent rounds.

Intersection attacks, first introduced in the context of mixnets, have
been considered in the context of coinjoins since at least 2014
(https://people.cs.umass.edu/~gbiss/mixing.pdf) and applied to
coinjoins and wallet clustering since 2017
(https://petsymposium.org/popets/2018/popets-2018-0038.pdf). This work
is not explicitly cited in the wabisabi paper as the on-chain aspects
were considered out of scope in the paper, which is only concerned
with the KVAC DoS control mechanism and its privacy consequences. That
you don't find it interesting without any further justification is,
most charitably, a reflection of the depth of your research into this
subject.

Concerns about intersection attacks, especially as described in the
cookie meets the blockchain paper (i.e. use of out of band leaks that
can't be accounted for in anonymity set estimates based only on
on-chain data) are justified if the coordinator is only considered
semi-honest (trusted with liveness, not with privacy).

As a concept, intersection attacks are also taken for granted in the
code when estimating anonymity set sizes:
https://github.com/WalletWasabi/WalletWasabi/blob/d8d792d339d3e467ea36eedd45f392de5ea716df/WalletWasabi/Blockchain/Analysis/BlockchainAnalyzer.cs#L161-L183
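The intersection principle itself fits in a few lines (toy data,
hypothetical participant names): each failed round a target coin
appears in narrows the candidate set of co-owned inputs to the
intersection of the participant sets, and out-of-band leaks shrink it
further in ways on-chain anonymity set estimates cannot account for.

```python
# Participant sets of three (failed) rounds in which the target
# input was observed; only the true owner appears in all of them.
rounds = [
    {"alice", "bob", "carol", "dave"},
    {"alice", "bob", "erin"},
    {"alice", "carol", "erin"},
]
candidates = set.intersection(*rounds)
assert candidates == {"alice"}
```

Since failed rounds cost the attacker nothing, the adversary can
repeat this until the intersection is a single cluster.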

Of course, as the cookie meets the blockchain paper discusses,
intersection attacks are part of a broader family of clustering
attacks. Note especially the caveats in the paper about limitations of
the result when clusters are disjoint, making the input-input links
particularly relevant for an attacker attempting to cluster coinjoin
outputs. Intersection attacks both rely on and are amplified by other
wallet clustering techniques. The Wang et al paper linked above, which
discusses clustering based on unconfirmed transactions, includes a
useful and fairly up to date survey of these techniques. Other notable
recent work includes the cited Kappos et al
(https://www.usenix.org/system/files/sec22-kappos.pdf) and Moser &
Narayanan's work (https://arxiv.org/pdf/2107.05749), both of which
improve clustering by utilizing wallet fingerprinting. Fingerprinting
is another subject that has seen progress recently
(https://ishaana.com/blog/wallet_fingerprinting/).

Given the adversarial control over aborts, the more minor tagging
vectors, the lack of binding of the `BlameOf` field, and the gamable
coin selection, a strategic malicious coordinator can manipulate
clients into completing rounds only when it is able to successfully
cluster the outputs, even if it is reliant on the round ID
equivocation mechanism, as must be the case for taproot inputs.

So to put it bluntly, because there is no other way to put it, your
theory of "anonymity set size" as calculated by wasabi is simplistic,
and ignores the vast literature on privacy such as work on mixnets,
and the entropic anonymity set perspective (the way deanonymization
attacks generally compound typically resulting in an exponential decay
of privacy). This all has been well understood for more than 2
decades, and applies doubly so in Bitcoin's case due to the
transparent and highly replicated nature of the transaction graph, as
demonstrated by the small selection of papers I've shared here.

Secondly, there's the matter of what was specified. Leaking
information about input-input links is discussed in the wabisabi
paper, mainly discussed in section 7.2.1 and mentioned several times
throughout. Just two quotes for your consideration:

>> In order to maintain privacy clients must isolate registration requests
>> using unique network identities. A single network identity must not expose
>> more than one input or output, or more than one set of requested or
>> presented credentials.
...
>> Since every registration request is only associated with a single input
>> or output, the only information that would need to be broadcast publicly on
>> the blockchain if the protocol succeeds is revealed directly.

If this is justifiably not "interesting", then again, why are two
named co-authors of the paper evidently unable to justify lack of
consideration for this subject when it was time to implement the
protocol they are supposed to understand? Throughout all of the
references I provided, including the paper, there is not one
substantiated counter argument to *anything* of what I said: no
serious errors, no arguments against the assumptions, no refutation or
even discussion of the prior work which I had studied to substantiate
my claims.

That this was included in the threat model is again supported by the
code, which as we know uses a distinct isolation ID for each input
registration request. This is a rather expensive (in terms of latency,
failure rates, etc.) implementation choice, and one which is seemingly
important (though not important enough to merit an investigation into
how Tor circuits actually work, as can be seen here
https://github.com/WalletWasabi/WalletWasabi/issues/8420). That
isolating circuits is nevertheless insufficient to prevent leaks is
clearly a bug; otherwise a single isolation ID would have been
considered.

Thirdly, it has been claimed by support staff and the official account
that the coordinator *can't* learn this information:

- https://archive.is/dtPd9
- https://archive.is/8xtLW
- https://archive.is/pyyuu
- https://archive.is/rpRIo

So even though you don't find it "interesting", they apparently
thought that paying customers of the service would.

In other words there are 3 separate reasons to dismiss your claim that
this is not interesting:

- published and well cited research on this concern (with supporting
evidence in the code that this is in scope for wasabi's threat model)
- claims made in the wabisabi paper (again, the code agrees up to
implementation flaws)
- marketing and support claims made to paying users

And just to reiterate, the adversary isn't just the coordinator
operators but also Cloudflare which, as mentioned above, can as a MITM
surveil and censor any traffic between the coordinator and the client.

# Ownership Proofs w.r.t Light Clients

Returning to light clients: not that it's that important, but you
misconstrue and/or misrepresent my views and past statements on this.

> I've been reviewing Wasabi and other coinjoin implementations lately and
> I believe that your focus on lite clients with regard to Wasabi is
> incorrect.
...
> You keep bringing up lite clients, e.g. in your statement that:
...
> Your focus is mistaken. In fact, it's irrelevant whether or not a txin
> in a proposed coinjoin round is spending a coin that exists or not. The
> reason is there are two possible situations:

Your framing of my "focus" being "incorrect" is, apart from being
unsubstantiated, also disingenuous.

Wasabi is a light client, except when used in the aforementioned
hybrid mode where it's configured to query certain information from a
trusted full node. The quote you gave is explicitly talking about the
difficulty of validating ownership proof *before* the round commences,
to avoid disclosing any actions to a malicious coordinator. It was
made under the assumption that at least wasabi clients configured to
use a full node would protect against these issues. They don't, with
no reasonable explanation given other than that the full node
integration is "not real", which for reasons I'm yet to understand
allows making some RPC requests but precludes others (which the
backend is able to make
https://github.com/WalletWasabi/WalletWasabi/blob/d8d792d339d3e467ea36eedd45f392de5ea716df/WalletWasabi/WabiSabi/Backend/Rounds/Arena.Partial.cs#L345C33-L345C46).

When I said what you quoted out of context I was working under the
assumption that one of the most basic aspects of trustlessness, namely
lack of mitigation against tagging attacks which are the subject of
this thread, would be addressed before the release. I don't know why
you seem to insist it's my "focus", when it's just a constraint
imposed by the project. Perhaps the reason is that you're confused
about what the ownership proofs actually were meant to provide
assurances against. Let's recap.

Note the opening paragraph of the issue you quote from, emphasis added here:

>> Clients need to verify ownership proofs to ensure *uniform credential
>> issuer parameters for all round participants (otherwise they might be subject
>> to tagging attacks)*, and to prevent denial of service by a malicious
>> coordinator (being tricked into creating invalid coinjoins that can't be
>> broadcast, *but which may leak information about intended outputs*).

This is separate from the concern of the coordinator validating users'
ownership proofs to protect against denial of service attacks by
malicious users, and attacks on stateless signing devices (as I
already mentioned, the only reason ownership proofs were eventually
given to clients: at the time of the mainnet release only the
coordinator saw them, precluding even consistency mitigations reliant
on the use of more than one input per client), as described here
https://gnusha.org/pi/bitcoindev/CAB3F3Dv1kuJdu8veNUHa4b58TvWy=BT6zfxdhqEPBQ8rjDfWtA@mail.gmail.com/

However, as alluded to by the 2nd emphasis, a primary concern with the
ownership proofs, in addition to consistency of the round parameters
and transcript, is that prevout amounts are critical information for
making choices about output values. Poor output
denomination choices are a potentially catastrophic privacy leak
especially in conjunction with input-input links (addressed above)
under the sub-transaction model
(https://www.comsys.rwth-aachen.de/fileadmin/papers/2017/2017-maurer-trustcom-coinjoin.pdf)
when considering P_{O,O}, P_{I,O} values associated with a particular
output. In ginger wallet not only are these apparent prevout values
potentially under adversarial control, the coordinator simply
explicitly tells the client which denominations to use, another
unnecessary trust assumption.
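
To illustrate why coordinator-supplied denominations are an
unnecessary trust assumption: a client can compute candidate output
denominations locally from public data alone (its own balance and a
fixed denomination set). A minimal sketch, with hypothetical helper
names rather than Wasabi's actual API:

```python
# Sketch: client-side denomination selection from public parameters only.
# The denomination set (powers of 2, 3 and 10, in sats) and the greedy
# decomposition are illustrative assumptions, not Wasabi's actual scheme.

def standard_denominations(max_amount: int) -> list[int]:
    """Powers of 2, 3 and 10 up to max_amount, largest first."""
    denoms = set()
    for base in (2, 3, 10):
        d = base
        while d <= max_amount:
            denoms.add(d)
            d *= base
    return sorted(denoms, reverse=True)

def decompose(balance: int, denoms: list[int]) -> list[int]:
    """Greedy decomposition of a balance into standard denominations."""
    outputs, remaining = [], balance
    for d in denoms:
        while remaining >= d:
            outputs.append(d)
            remaining -= d
    return outputs  # remainder below the smallest denomination is fee/change

balance = 1_234_567
outs = decompose(balance, standard_denominations(balance))
```

The point is not this particular decomposition strategy, but that
nothing in it requires trusting values chosen by the coordinator.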

This is elaborated on in this issue from which I will quote some more:
https://github.com/WalletWasabi/WalletWasabi/issues/5945

>> Clients must opportunistically verify as many ownership as possible with
>> already available blocks, and additionally verify at least one or two
>> random ownership proofs, ideally more. Full nodes can verify all ownership
>> proofs trivially, with spend status validation as well, but light clients
>> with no full node integration enabled can only check for inclusion in a
>> block (see #5533).
..
>> With some verification the confidence that the amounts given by the
>> coordinator is valid increases, but is not certain without, so it is still
>> possible for a coordinator to partition users and hide some inputs from some
>> users to bias their actions.
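
The sampling argument in the quoted issue can be made quantitative: a
light client that verifies a few randomly chosen ownership proofs
detects a coordinator fabricating some fraction of them with a
probability that grows quickly with the sample size. A small sketch of
that calculation (my illustration, not code from the issue):

```python
# Sketch: probability that a light client verifying k randomly sampled
# ownership proofs out of n catches a coordinator that fabricated m of
# them (sampling without replacement).
from math import comb

def detection_probability(n: int, m: int, k: int) -> float:
    """P(at least one fabricated proof lands in a random sample of k)."""
    if k > n - m:  # sample cannot avoid every fabricated proof
        return 1.0
    return 1 - comb(n - m, k) / comb(n, k)

# 100 inputs with 10 fabricated: 2 random checks catch ~19% of attacks,
# 20 checks catch ~90%.
```

This is why "at least one or two random ownership proofs, ideally
more" meaningfully constrains a coordinator that wants to lie about
many prevouts, even without full node integration.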

Without knowing the set of other users' input values, the wallet's
balance may be decomposed into a combination of output values that is
not underdetermined in the sub-transaction model, as I mention here:
https://github.com/WalletWasabi/WalletWasabi/pull/5994#issuecomment-924105739
Also note that this follows a discussion of mitigations (1) and (2)
and of the lack of header validation, with no substantive defense of
the absence of such mitigations or of what exactly is assured in the
light client security model (but thank you for explaining it as though
I'm confused).
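
A toy illustration of what "not underdetermined" means here: linking
becomes possible when a wallet's balance has a unique decomposition
into the transaction's output values. This is a simplification of the
Maurer et al. sub-transaction model, for intuition only:

```python
# Sketch: enumerate all sub-multisets of a transaction's outputs that sum
# to a given wallet balance. A unique decomposition uniquely attributes
# those outputs to that wallet; many decompositions leave it ambiguous.
from itertools import combinations

def decompositions(balance: int, outputs: list[int]) -> list[tuple[int, ...]]:
    """All index-distinct output subsets summing exactly to balance."""
    found = []
    for r in range(1, len(outputs) + 1):
        for combo in combinations(range(len(outputs)), r):
            if sum(outputs[i] for i in combo) == balance:
                found.append(tuple(outputs[i] for i in combo))
    return found

# Common, well-shared denominations keep a balance ambiguous:
assert len(decompositions(5, [1, 1, 1, 2, 2, 3])) > 1
# A skewed value forced on a client can be uniquely attributable:
assert len(decompositions(7, [7, 1, 2, 3])) == 1
```

A coordinator that hides inputs or dictates denominations can push
clients toward the second, uniquely attributable case.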

# Sybil Attacks

> # Sybil Attacks In General
>
> Let's get this out of the way first.

This is also addressed in the paper, section 7.2.2. Have you read it?

> As AdamISZ correctly noted in his
> Jan 7th reply¹ to you, sybil attacks in general are impossible to
> entirely prevent in any coinjoin protocol where participation is done
> anonymously. It is always possible for an adversary to simply flood the
> mechanism with coinjoin requests to the point where they are the only
> counterparty.

waxwing brought up *targeted* attacks in relation to the
coordinator's ability to censor honest parties, a distinction which
you are glossing over here.

> What we can do is make sybil attacks costly. In general, Wasabi's
> current usage with user-settable centralized coordinators does that
> pretty well: typical coinjoin rounds on the most popular coordinator,
> https://coinjoin.kruw.io, are transactions close to the standard size
> limits, with hundreds of inputs and outputs, mixing millions of USD
> dollars worth of BTC per round. A trivial sybil flood attacker would
> have to spend a lot of money and hold a lot of coins to simulate that.

Targeted attacks are inherently cheaper. In this setting the
coordinator can try to influence the targeted user in the numerous
ways described above, and only let the transaction go through if
circumstances favor deanonymization. You yourself mention that the
failure rate for rounds is high, though it's not clear whether you
were referring to JoinMarket or Wasabi. In Wasabi's case, failure
rates are much higher than they need to be, for reasons described in
previous messages. There are also incentive incompatibilities inherent
in the protocol design (e.g.
https://github.com/WalletWasabi/WalletWasabi/pull/6654) that make it
rational for honest users to defect under some circumstances, and
indeed the client implements defection behavior in places that could
be prevented before committing to a round:
https://github.com/WalletWasabi/WalletWasabi/pull/7216

Secondly, coordinator revenues also need to be considered (also
discussed in the paper) in the context of sybil attacks. Notably,
kruw's coordinator earns revenues despite being advertised as "free",
and with mining fees recently being low, the mining fee cost is about
as low as it can be, so really the liquidity requirement is the only
deterrent under these circumstances, even for non-targeted attacks.

Bolstering this deterrent significantly is straightforward
(verifiably random selection of inputs), but that too was rejected by
the Wasabi team, despite discussions about the value and simplicity of
this approach and its benefits over Samourai's "trust me bro, it's
random" input selection. This was not fully documented, but is
mentioned in passing here (a discount mechanism to incentivize honest,
low time preference users to impose a higher liquidity requirement by
earning discounts, with no negative externalities to honest high time
preference users):
https://github.com/WalletWasabi/WalletWasabi/issues/5439
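
To make "verifiably random selection of inputs" concrete, one simple
commit-reveal construction: the coordinator commits to a seed before
input registration, reveals it afterwards, and every client recomputes
the selection. This is my illustrative sketch of the general idea, not
a documented Wasabi proposal:

```python
# Sketch: verifiably random input selection via commit-reveal. The
# coordinator publishes commit(seed) before registration; after revealing
# the seed, any participant can recompute and check the selection.
import hashlib
import hmac

def commit(seed: bytes) -> bytes:
    """Commitment published before input registration opens."""
    return hashlib.sha256(seed).digest()

def select_inputs(seed: bytes, registered: list[str], k: int) -> list[str]:
    """Deterministically rank registered outpoints by a keyed hash."""
    ranked = sorted(
        registered,
        key=lambda op: hmac.new(seed, op.encode(), hashlib.sha256).digest(),
    )
    return ranked[:k]

seed = b"round-seed"
inputs = [f"txid{i}:0" for i in range(10)]
chosen = select_inputs(seed, inputs, 3)
# Verification: the revealed seed matches the prior commitment, and the
# published selection matches the recomputed one.
assert commit(seed) == hashlib.sha256(seed).digest()
assert chosen == select_inputs(seed, inputs, 3)
```

Because the seed is fixed before registrations arrive, the coordinator
cannot bias which honest inputs are included without being caught.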

# Address reuse avoidance

Anyway, the final technical thing you brought up, the address gap:

> I have not verified whether or not
> this is actually true;

Wasabi uses a non-standard address gap limit (114 by default), and
retains information in persistent storage about the state of keys (see
the KeyState enum).

This is problematic when recovering from seed, because upon recovery
the last key used on chain determines the next index to use, and the
last key used in a failed round is not known.
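
A small sketch of why failed rounds break seed recovery: recovery
scans derived keys and stops after gap-limit consecutive keys with no
on-chain use, but keys consumed only by failed (never-broadcast)
rounds leave no on-chain trace. Illustrative simplification, not
Wasabi's actual recovery code:

```python
# Sketch: gap-limit scanning during seed recovery. Keys revealed to the
# coordinator in failed rounds don't appear in used_on_chain, so the
# recovered wallet's next index collides with already-revealed keys.

def recover_next_index(used_on_chain: set[int], gap_limit: int) -> int:
    """Index a recovering wallet would hand out next."""
    unused, i = 0, 0
    while unused < gap_limit:
        unused = 0 if i in used_on_chain else unused + 1
        i += 1
    return i - gap_limit  # start of the terminal run of unused keys

# Indices 0..5 confirmed on chain; 6..150 consumed only by failed rounds:
assert recover_next_index({0, 1, 2, 3, 4, 5}, gap_limit=114) == 6
```

The recovered wallet reuses index 6 onward even though those keys were
already shown to the coordinator, which is the privacy leak at issue.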

> This is an
> implementation issue due to gap-limits in HD wallet implementations;

This is a privacy issue due to the dependence on off-chain state and
the inability to restore it when recovering from seed.

> Silent Payment-like functionality may be a way around this problem.
> Additionally, this class of attack would impact pay-in-coinjoin
> functionality, where a payment address is added directly to a coinjoin.

With silent payments, avoiding this reuse can be done statelessly
under the assumption that input sets never repeat (which they
shouldn't if the coordinator is honest, and which is very unlikely
even if it isn't). This does not require full receiver support, at the
cost of a linear scan of the wallet's full transaction history upon
recovery, discovering root transactions starting from keys (e.g. from
a normal external address chain), if the silent payments derivation is
only used for self-spend outputs.

This assumes the private scanning key is accessible (which it should
be) but requires no protocol modifications (whereas sending to 3rd
parties, or more generally without knowledge of the private scanning
key, requires cooperation of the other parties as well as blinding to
preserve privacy). With the private scanning key, self-spend addresses
can be computed using silent payment derivations from the public
spending keys obtained from the ownership proofs, allowing the script
pubkeys to be known at output registration.
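
The spendability hazard discussed next comes from the tweak depending
on the *actual* confirmed input set. A deliberately simplified toy
(integers modulo a prime stand in for EC points, and the hashing is
not BIP 352's real derivation) to show the structure:

```python
# Toy sketch: the self-spend tweak is a hash over the input set, so if the
# broadcast transaction's inputs differ from the ownership proofs, the
# recovering wallet derives a different key and cannot find its output.
# Simplified far beyond BIP 352; for intuition only.
import hashlib

P = 2**127 - 1  # toy group order standing in for secp256k1

def h(*parts: bytes) -> int:
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big") % P

def tweaked_key(spend_key: int, scan_secret: int, input_set: list[bytes]) -> int:
    input_hash = h(*sorted(input_set))  # order-independent, like BIP 352
    tweak = h(input_hash.to_bytes(16, "big"), scan_secret.to_bytes(16, "big"))
    return (spend_key + tweak) % P

claimed = [b"outpoint-A", b"outpoint-B"]        # from the ownership proofs
confirmed = [b"outpoint-A", b"outpoint-EVIL"]   # what actually confirmed
k_registered = tweaked_key(11, 42, claimed)
k_recovered = tweaked_key(11, 42, confirmed)
assert k_registered != k_recovered  # recovery derives the wrong key
```

If the ownership proofs were honest, claimed == confirmed and the two
keys match; a tagging coordinator can make them diverge.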

Unfortunately this can lead to unspendable outputs without strong
validation of ownership proofs: if the final transaction does not
match the keys in the ownership proofs, then the data required for
computing the tweak is not available from the confirmed transaction.
As discussed above, if P2TR inputs are spent into the transaction,
consensus validation protects against this, but P2WPKH-only
transactions would not be protected, nor would their owners know at
signing time that their self-spend tweaked outputs are safe (spendable
using only the BIP 32 seed and BIP 352 scanning rules).

Under the active tagging scenario, either the tweaked private key or
the malicious ownership proofs need to be durably persisted in order
to maintain access to funds. This is not a viable solution without
validating the consistency of the ownership proofs *before* signing,
which your proposed mitigation fails to account for (but which
mitigation (3) protects against for all input types, and mitigation
(4) also protects against under the additional assumption that other
clients are honest; in both cases an honest coordinator that refuses
to accept signatures that don't match the ownership proofs eliminates
this issue).

# Conflict of Interest

It would have been appropriate for you to disclose that your review is
paid for by an interested party as a direct response to accusations I
have made:

- https://archive.is/cbffL
- https://archive.is/BJCNG

Kruw has described his service as "free" and "trustless", despite earning
revenues and despite the issues described here. Supporting evidence for
this is in the unedited version of this reply.



Thread overview: 10+ messages
2024-12-21 14:16 [bitcoindev] Reiterating centralized coinjoin (Wasabi & Samourai) deanonymization attacks Yuval Kogman
2025-01-06 13:07 ` Sjors Provoost
2025-01-06 14:30   ` Yuval Kogman
2025-01-07 15:56     ` waxwing/ AdamISZ
2025-01-07 21:33       ` Yuval Kogman
2025-01-23 16:25 ` Peter Todd
2025-01-24 16:00   ` Peter Todd
2025-01-24 16:38   ` waxwing/ AdamISZ
2025-02-04 14:02   ` Yuval Kogman [this message]
2025-02-04 22:22     ` Peter Todd
