This idea is not similar to the one in the knapsack paper because this one is based only in the computational complexity of finding partitions that match the outputs. However, and except in rare cases, there is only one valid partition (solution) for each output, it doesn't introduce any ambiguity. Knapsack, on the other hand, splits the original outputs in such a way that there are many partitions (solutions) that match the a selected group of outputs. For example, imagine 7 people decide to participate in a coinjoin transaction with an arbitrary number of inputs (no more than 7 inputs each), imagine this is not a pay to yourself cj tx but a pay to someone else cjtx instead such that there are at most 2 outputs for participants (payment output and change output) in this case, configuring the partitions search algorithm to restrict the search space to sets of 7 inputs maximum and 4 outputs maximum it found 14,599 valid transactions in 42mins 18secs https://raw.githubusercontent.com/lontivero/Knapsack/master/data/knapsack-7-participants.txt

The same simulation with 8 participants under the same conditions found 35,781 valid transactions in about 4 hours. Finally, with 9 participants I let it running all the day and it didn't finished. The point is that the number of valid transactions grows so incredible fast that with 100 participants even if you find a way to find all the partitions that matches a set of outputs (something near to impossible), there are no way to know which of those are the real ones.

Also, the attacks on this mechanism look so simple that generate doubts. Finally, I think the numbers in this proposal look weird because the example is using 10 inputs and the amounts are in the "neighborhood of ~0.1btc" (what the heck does that mean?) and the sum of those are around 1btc. That means that it could work in a very specific scenario. Knapsack is a general solution with good math behind and backtested against historical data extracted from the bitcoin's blockchain.

In summary, in unequal inputs/outputs coinjoins knapsack is the best we have at the moment (btw, it is not as effective as equal-outputs transactions). This proposal is imo inferior and it is not supported by good math.


El vie., 27 dic. 2019 a las 22:29, nopara73 via bitcoin-dev (<bitcoin-dev@lists.linuxfoundation.org>) escribió:
The CashFusion research came out of the Bitcoin Cash camp, thus this probably went under the radar of many of you. I would like to ask your opinions on the research's claim that, if non-equal value coinjoins can be really relied on for privacy or not.

(Btw, there were also similar ideas in the Knapsack paper in 2017: https://www.comsys.rwth-aachen.de/fileadmin/papers/2017/2017-maurer-trustcom-coinjoin.pdf ) 

https://github.com/cashshuffle/spec/blob/master/CASHFUSION.md#avoiding-amount-linkages-through-combinatorics  

I copy the most relevant paragraphs here:

  ---------BEGIN QUOTE --------- 
 

Consider a transaction where 10 people have each brought 10 inputs of arbitary amounts in the neighborhood of ~0.1 BCH. One input might be 0.03771049 BCH; the next might be 0.24881232 BCH, etc. All parties have chosen to consolidate their coins, so the transaction has 10 outputs of around 1 BCH. So the transaction has 100 inputs, and 10 outputs. The first output might be 0.91128495, the next could be 1.79783710, etc.

Now, there are 100!/(10!)^10 ~= 10^92 ways to partition the inputs into a list of 10 sets of 10 inputs, but only a tiny fraction of these partitions will produce the precise output list. So, how many ways produce this exact output list? We can estimate with some napkin math. First, recognize that for each partitioning, each output will typically land in a range of ~10^8 discrete possibilities (around 1 BCH wide, with a 0.00000001 BCH resolution). The first 9 outputs all have this range of possibilities, and the last will be constrained by the others. So, the 10^92 possibilies will land somewhere within a 9-dimensional grid that cointains (10^8)^9=10^72 possible distinct sites, one site which is our actual output list. Since we are stuffing 10^92 possibilties into a grid that contains only 10^72 sites, then this means on average, each site will have 10^20 possibilities.

Based on the example above, we can see that not only are there a huge number of partitions, but that even with a fast algorithm that could find matching partitions, it would produce around 10^20 possible valid configurations. With 10^20 possibilities, there is essentially no linkage. The Cash Fusion scheme actually extends this obfuscation even further. Not only can players bring many inputs, they can also have multiple outputs.

---------END QUOTE ---------
--
Best,
Ádám
_______________________________________________
bitcoin-dev mailing list
bitcoin-dev@lists.linuxfoundation.org
https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev