Hi List,

I recently presented a poster at the Financial Cryptography 2020 conference, which you can find here: https://github.com/LLFourn/taproot-ggm/blob/master/main.pdf. It attempts to show the security requirements for the tweak hash function in Taproot. In this post I'll give a long description of it, but first let me tl;dr:

Taproot requires no new assumptions of SHA256 over what are already made by Schnorr signatures themselves, with one exception: when using a non-interactive key generation protocol to produce a Taproot internal key (e.g. MuSig). To prove security in this scenario we need to make an additional assumption about SHA256: as well as being collision resistant (i.e. it is hard to find two hashes with h_1 - h_2 = 0), it must satisfy a more general kind of collision resistance where it is hard to find h_1 - h_2 = d for *any* d when the adversary is challenged to find h_1 and h_2 with random prefixes. This seems a plausible assumption. Put informally, it says that zero is not a special value for which finding collisions is difficult; rather, solving the 2-sum problem is hard for all values of d (when challenged with random prefixes).

Now the long version.

My motivation for creating this poster came from questions I had after discussions in Taproot Study Group #18 (this study group initiative was a great idea btw). The main question I had was "Why is Taproot binding?", i.e. why is it true that I can only commit to one Merkle root? Isn't it possible that a malicious party could produce a second covert Taproot spend that none of the other parties to the output agreed to? I submitted a poster proposal to FC to force myself to get to the bottom of it.

The premise of the poster is to use the Generic Group Model to try and figure out how the hash function would have to fail for Taproot to be insecure. Most of the poster is taken up by cartoon reductions I made to remind myself why what I was saying might be true. They are incomplete and difficult to parse on their own, so hopefully this post is a useful companion to them.

=== The Security of Taproot ===

There are three scenarios/games we must consider when asking whether Taproot is secure in the context of Bitcoin:

1. Taproot Forge: Forging taproot spends must be hard. The adversary must not be able to take a public key off the blockchain and produce a forged Taproot spend from it.
2. Covert Taproot: When an adversary is executing a multi-party key generation protocol (e.g. MuSig), it should be hard for them to produce a covert malicious Taproot spend from the joint key, i.e. when honest parties think there is no Taproot spend on a key there shouldn't be any Taproot spend on the key. Note that (2) being hard is not implied by (1) being hard.
3. Second Covert Taproot: Like 2, except that if honest parties agree to a Taproot spend then the adversary shouldn't be able to generate a second Taproot spend they are unaware of.

Properties (1) and (3) can be argued succinctly if we just prove that Taproot is a secure commitment scheme. It should be clear that if a Taproot external key T = X + H(X||m)*G is a secure commitment scheme (hiding and binding) to any arbitrary message m, then it is a secure commitment scheme to a Merkle root. (1) then holds because if you can create an opening to a commitment not generated by you, you either broke hiding (if your opening is the same as the honest one) or broke binding (if it's different). (3) holds because producing a second covert Taproot spend means there are now two openings to the same commitment, which breaks binding.
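To make the commitment view concrete, here is a toy sketch in Python. The "group" is just integers mod n with scalar multiplication by G as the identity map, so it has no security whatsoever; it only shows the algebra T = X + H(X||m)*G, and all the names are mine, not from the poster:

```python
import hashlib

# Toy additive "group": elements are integers mod n and scalar
# multiplication by the generator G is just x -> x mod n. Nothing here
# is cryptographically secure; it only illustrates the algebra.
n = 2**255 - 19  # stand-in for the group order

def H(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def encode(P: int) -> bytes:
    return P.to_bytes(32, "big")

def taproot_commit(X: int, m: bytes) -> int:
    """External key T = X + H(X || m)*G, viewed as a commitment to m."""
    return (X + H(encode(X) + m)) % n

def taproot_open(T: int, X: int, m: bytes) -> bool:
    """Check that (X, m) opens the commitment T."""
    return taproot_commit(X, m) == T

X = 1234567 % n           # internal public key X = x*G
m = b"merkle root"
T = taproot_commit(X, m)
assert taproot_open(T, X, m)
```

Breaking binding here would mean exhibiting two distinct (X, m) pairs that open the same T, which is exactly the situation in property (3).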

Property (2) is more difficult to argue as it depends on the multi-party key generation protocol. Case in point: Taproot is completely broken when combined with a key generation protocol where, along with their public key, each party provides a proof of knowledge of the secret key. Where X_1 is the key of the honest party, the malicious party can choose their key X_2 to be H(X_1 || m)*G where m is a malicious Merkle root. Clearly the malicious party has a covert Taproot for X = X_1 + X_2 and can still produce a proof of knowledge for X_2 (its secret key is H(X_1 || m)).
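A toy sketch of this attack, using a toy additive group (integers mod n where x*G is just x mod n; purely illustrative and insecure, names mine):

```python
import hashlib

n = 2**255 - 19  # stand-in for the group order; x*G is x mod n

def H(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def encode(P: int) -> bytes:
    return P.to_bytes(32, "big")

X1 = 42 % n  # honest party's public key X_1 = x_1*G

# The attacker picks a malicious Merkle root m and sets x_2 = H(X_1 || m),
# so X_2 = H(X_1 || m)*G. They know x_2, so a proof of knowledge of the
# secret key for X_2 is no obstacle.
m = b"covert spend path"
x2 = H(encode(X1) + m)
X2 = x2 % n

# The joint key X = X_1 + X_2 is then exactly the Taproot commitment of
# internal key X_1 to the attacker's Merkle root m: a covert Taproot.
X = (X1 + X2) % n
assert X == (X1 + H(encode(X1) + m)) % n
```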

Given these definitions of security, we now move on to how we should model the problem in order to prove they hold.

=== Generic Group Model vs Random Oracle Model ===

To prove a practical cryptographic scheme secure you often have to idealise one of its components. The most popular candidate for idealisation is the hash function, via the Random Oracle Model (ROM), which idealises a hash function as a "random oracle": a black box which spits out random values for each input. For example, the original "forking lemma" proof by Pointcheval and Stern [1] shows the Schnorr signature scheme is unforgeable in this model if the discrete logarithm problem is hard. In other words, idealising the hash function allows us to isolate what security assumptions we are making about the group (e.g. the discrete logarithm problem being hard in it).

But what if we want to know what assumptions we are making about the hash function? Does the challenge hash in Schnorr signatures have to be collision resistant or pre-image resistant or something else? To answer this question Neven et al.[2] analysed Schnorr signatures by idealising the group in the "Generic Group Model" (GGM). By idealising the group, they were able to isolate the security requirements of the hash function away from any assumptions being made about the group. In the GGM, the group becomes a black box which, when given two group elements, spits out their subtraction (for technical reasons it's subtraction rather than addition). The adversary can only produce new group elements by querying the oracle. Using the GGM they prove that the hash function needs to be Random-Prefix Preimage (RPP) resistant (and Random-Prefix Second-Preimage resistant) which are strictly weaker assumptions than collision resistance.

=== Taproot in the Random Oracle Model ===

Proving that Taproot is a binding commitment scheme in the ROM is straightforward (hiding is too but I'm going to skip that). To produce two openings for the same external key, the adversary must have two random oracle queries H(X || m) that result in the same external key T = X + H(X||m)*G. Since H(X||m)*G is an (almost) uniformly distributed group element in the ROM, T is also uniformly distributed, thus breaking the binding of Taproot is equivalent to solving a birthday problem of size 2^256 (the same as finding hash collisions in the ROM). Note that this statement is true regardless of the discrete logarithm problem being hard or not. This proves properties (1) and (3).

For property (2) let's consider MuSig as the key generation protocol. If we model the MuSig key tweak hash function as a random oracle as well, then for every key X_2 the adversary has to query the MuSig hash oracle to determine the joint key X = X_1*H(X_1||L) + X_2*H(X_2||L). As before, it is clear that this makes X a uniform group element for every X_2 in the ROM. Likewise, for every covert Taproot internal key and message pair (C, m), the external key T = C + H(C||m)*G will be uniform as before in the ROM. Thus, breaking property (2) is the same as finding T = X where the adversary can only sample T and X from uniform distributions, and so we have another birthday problem. This completes the proof of all three properties.
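Here is a toy sketch of the two quantities being compared (same caveats as before: the group is integers mod n, helper names like is_covert_taproot are mine, and nothing here is secure):

```python
import hashlib

n = 2**255 - 19  # stand-in for the group order; x*G is x mod n

def H(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def encode(P: int) -> bytes:
    return P.to_bytes(32, "big")

def musig_key(keys):
    """MuSig joint key X = sum_i X_i*H(X_i || L), where L encodes all keys."""
    L = b"".join(encode(X) for X in keys)
    return sum(X * H(encode(X) + L) for X in keys) % n

def is_covert_taproot(X: int, C: int, m: bytes) -> bool:
    """Property (2) is broken if the joint key satisfies X = C + H(C || m)*G
    for an internal key C and Merkle root m the honest parties never saw."""
    return X == (C + H(encode(C) + m)) % n

X = musig_key([5, 7])  # toy joint key of two parties
```

In the ROM both X and any candidate T = C + H(C||m)*G are (close to) uniform, so the adversary is left hunting for a birthday-style coincidence between the two.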

Poelstra presented a proof of Taproot's security in the ROM [3]. It frames Taproot as a way of combining two signature schemes under one public key (in our case Schnorr and Tapscript). He uses a similar line of reasoning to the one I have just presented (Lemma 1, step 3), but his approach brings in many other considerations that I think can be avoided by modelling Taproot as a commitment scheme. Note that his proof only shows that Taproot forgeries are hard, i.e. property (1).

=== Taproot in the Generic Group Model ===

The ROM proof is an important first step -- if Taproot couldn't be proved secure in the ROM then it would probably be completely broken. But Taproot, unlike Schnorr, relies on the security of its hash function only when viewed as a commitment scheme, so it would be prudent to figure out what those security properties are. By using the ROM we artificially hide those properties from our analysis. As in the case of Schnorr, we can idealise the group in the GGM to help isolate the hash function's properties.

To prove Taproot is a binding commitment scheme in the GGM I had to introduce a new property I called "Chosen Offset Prefix-Collision" (COPC) resistance. The precise security game is sketched in the poster, but I like to describe it as a more general kind of collision resistance. Instead of it being hard to find two preimages a and b where H(a) - H(b) = 0, it must be hard to find H(P_1 || a) - H(P_2 || b) = d for any d (not just d = 0) with random prefixes P_1 and P_2 given by the challenger (d chosen by the adversary). COPC is necessary and sufficient to prove Taproot is a secure commitment scheme in the GGM (the proof for this isn't in the poster but is very similar to the Second Covert Taproot proof).
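Here is how I'd sketch the COPC game in code. The precise game is in the poster; in particular I'm assuming the adversary must fix its offset d *before* seeing the random prefixes (otherwise the game is trivially winnable by computing d afterwards), so treat this as my reconstruction rather than the poster's exact definition:

```python
import hashlib
import os

n = 2**255 - 19  # stand-in for the group order

def H(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def copc_game(adversary) -> bool:
    d = adversary.choose_offset()             # adversary commits to any offset d
    P1, P2 = os.urandom(32), os.urandom(32)   # challenger's random prefixes
    a, b = adversary.find_preimages(P1, P2)
    # Adversary wins if H(P1 || a) - H(P2 || b) = d. Plain collision
    # resistance is the special case d = 0 (without the random prefixes).
    return (H(P1 + a) - H(P2 + b)) % n == d % n

class NaiveAdversary:
    """Guesses blindly, so it wins with only negligible probability."""
    def choose_offset(self):
        return 0
    def find_preimages(self, P1, P2):
        return b"a", b"b"
```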

This was not the ideal outcome, so I decided to analyse Taproot properties (1) and (3) independently rather than just imply them from the commitment scheme result. What ended up in the poster is an independent proof for each of the three Taproot security properties, with MuSig assumed to be the key generation scheme for properties (2) and (3). Here's a summary of what I concluded for each property.

1. Taproot Forge: In the GGM, an adversary who forges Taproot openings can be used as a black box to mount a "Random Prefix-Preimage" (RPP) attack against the hash function. This is a very good result as RPP is already required by Schnorr. Essentially, this means anyone who can forge Taproot spends can also forge Schnorr signatures.

2. Covert Taproot (MuSig): For this property I had to split adversaries into two types: those who query their MuSig public key X_2 from the group oracle before their malicious internal key C, and those who query C first or set X_2 = C. For the first case I was able to show another reduction from RPP (which is shown in the poster). For the other case I was able to break preimage resistance, as long as I modelled the MuSig hash function as a random oracle (not shown in the poster; this is only from memory). In both cases the reduction only works for 2-party MuSig, not n-party MuSig. Obviously, this is not totally satisfying. The problem with n-party MuSig is that it becomes exponentially more unlikely (in n) for the reduction to guess which keys the adversary will use as its MuSig keys.

3. Second Covert Taproot (MuSig): Once again, this is where honest parties agree on a joint key and Taproot spend from it, but the adversary is somehow able to create a second covert spend during the key generation phase. This is where I found that COPC does actually need to be hard to ensure this property. This is true regardless of the number of parties. Thus this is the only scenario where you need the additional security assumption to prove security.

=== Concluding Remarks ===

The main takeaway is that there is a small security cost to using a group element as both a commitment and a public key. It would be very surprising if we got this for free. The random oracle model merely hides this cost in the idealisation of the hash function; the generic group model exposes it. The question is: is the cost worth it and who bears it? Here's what I consider to be the most important points:

1. You only take on the COPC assumption if you use Tapscript. If you're just putting your funds into a Taproot output with no Tapscript spend path, either as a group or an individual, there is no extra security assumption (with the caveat that my technique only really works for 2-party MuSig).
2. The COPC assumption seems to be very plausible.
3. Even if COPC is broken and an adversary can output two openings to the same external key, both those openings must be valid taproot spends for anyone to lose coins (i.e. Merkle roots with valid paths to leaves with valid tapscript).
4. Even if COPC were that badly broken on SHA256, old Taproot outputs would not be affected; the adversary has to break it during key generation, before funds are put into the output.
5. You can completely circumvent this result by using coin-tossing rather than MuSig for the key generation protocol. In most cases this doesn't even add any extra rounds of communication since you are doing 3-round coin tossing to choose the R values for the signatures that spend from the joint output anyway. You can just toss your public keys in parallel.
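A minimal commit-reveal sketch of tossing public keys (the helper names and the final sum-of-keys aggregation are illustrative assumptions, not a concrete protocol spec):

```python
import hashlib
import os

def commit(X: bytes) -> tuple:
    """Round 1: broadcast c = SHA256(r || X) for fresh randomness r."""
    r = os.urandom(32)
    return hashlib.sha256(r + X).digest(), r

def check_opening(c: bytes, r: bytes, X: bytes) -> bool:
    """Round 2: each party reveals (r, X); everyone checks the commitment."""
    return hashlib.sha256(r + X).digest() == c

# Two parties toss their (toy, byte-encoded) public keys in parallel.
X1, X2 = b"\x01" * 32, b"\x02" * 32
c1, r1 = commit(X1)
c2, r2 = commit(X2)
assert check_opening(c1, r1, X1) and check_opening(c2, r2, X2)
# Round 3: with all commitments opened, compute the joint key from X1, X2.
```

Because each party commits to its key before seeing the others', no one can choose their key as a function of an honest key, which is exactly what the covert Taproot attack requires.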

In my opinion, the cost of Taproot is mostly borne by theoreticians. They can no longer treat a public key ideally but have to consider the implications of it also being a commitment. For the user and for Bitcoin as a whole it seems to offer an overwhelming benefit. In exchange for the complexity it adds to making security claims in the GGM (if using Tapscript and MuSig), it offers exciting new opportunities for non-interactivity and fungibility over what Schnorr alone would provide.

I don't consider my work to be a final proof of anything. I would welcome anyone who wants to take over this research direction and do a proper job of it! I didn't have any personal motivation for doing this work other than curiosity and that curiosity has been satisfied. Questions and thoughts welcome :)

[1] https://www.di.ens.fr/david.pointcheval/Documents/Papers/2000_joc.pdf
[2] http://www.neven.org/papers/schnorr.pdf
[3] https://github.com/apoelstra/taproot/blob/master/main.pdf

Cheers,

LL