On Tue, Feb 4, 2014 at 10:04 AM, Peter Todd <pete@petertodd.org> wrote:
>
> On Tue, Feb 04, 2014 at 04:17:47PM +0100, Natanael wrote:
> > Because it's trivial to create collisions! You can choose exactly what
> > output you want. That's why XOR is a very bad digest scheme.
>
> You're close, but not quite.
>
> So, imagine you have a merkle tree, and you're trying to timestamp some
> data at the bottom of the tree. Now you can successfully timestamp the
> top digest in the Bitcoin blockchain right, and be sure that digest
> existed before some time. But what about the digests at the bottom of
> the tree? What can an attacker do exactly to make a fake timestamp if
> the tree is using XOR rather than a proper hash function?
>

Given a tree like:

      G
     / \  
    E   F
   / \
  C   D
 / \
A   B

Where G is the root hash and A is the legitimate data that was included in the tree, the legitimate user provides B, D and F along with A to prove A is part of the tree G.

Now an attacker could just make up an arbitrary set of values that XOR together into G, like:

  G
 / \
Z   Y

And could therefore claim Z is part of tree G by providing Y. But if A is also trying to prove its a part of G, we know the first level of the tree must be E and F. It cannot also be Z and Y, so one of the two users is lying and the deceit is obvious, though not obvious which user is lying.

An attacker could look more convincing by using the data passed with A as a starting point:

        G
       / \  
      E   F
     / \
    /   \
   /     \
  C       D
 / \     / \
A   B   Z   Y

Instead of working off of G, work of the lowest branch provided by A in its verification (D, in this case), and create the fake data Z, and calculate Y such that Z XOR Y == D (which is just Z XOR D). Now the attacker can claim Z is part of G by supplying Y, C, and F. The tree looks valid (it can coexist with the proof provided by A, at least until someone else claims to be a descendant of the D node as well), and since G was verified by timestamp, looks like Z existed before that timestamp, when really it could be added at any time by calculating Z XOR D.

Brooks