I think I understand what you're getting at with your first point. The thing is, including arbitrary data in the hashes provided to resolve the Merkle tree would require an extraordinary amount of computation to embed any useful quantity of data. And remember, this is competing with simply storing that data in the witness, so it has to be 4x more economical. Consider a 1-of-1024 multisig where the key being spent is at the furthest depth in the tree. An attacker would need to grind elliptic curve public keys in the hope of getting a hash collision, just to include 11 hashes, or 352 bytes, of arbitrary data. The result would also have to be a valid public key and signature pair. I don't see this approach as being practical.

Hunter

On Monday, February 24, 2025 at 8:27:53 AM UTC-7 Jonas Nick wrote:

> What prevents arbitrary data being hashed and then included in the attestation
> is, each signature public key pair must be able to verify the transaction
> message in order to be considered a valid transaction.

This appears to contradict the selective disclosure mechanism described in the
BIP and this sentence in the "Script Validation" section:

> Public keys that are not needed can be excluded by including their hash in the
> attestation accompanied with an empty signature

Even if the selective disclosure vulnerability is fixed by committing to the
multisig semantics in the P2QRH output, any unopened public key commitment could
still be "abused" for arbitrary data storage. Similar to the scenario in my
previous post, if the root R is MerkleRoot([leafhash1, leafhash2]) and the
multisig policy is "1-of-2", then we can set

leafhash1 := data
leafhash2 := hash(public_key_secp256k1)

and post the data to the chain by spending the output using an attestation
structure that includes leafhash1, an empty signature, public_key_secp256k1 and
the corresponding signature.
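
As a minimal sketch of the scenario above, assume SHA-256 for leaf and node
hashing and a root computed as hash(left || right); the BIP's actual Merkle
construction may differ. The helper names and the placeholder key bytes are
hypothetical, and signature verification is omitted. The point is only that a
verifier who sees leafhash1 unopened cannot tell a hashed public key from 32
bytes of arbitrary data.

```python
import hashlib


def sha256(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()


def merkle_root_2(leaf1: bytes, leaf2: bytes) -> bytes:
    # Assumed node rule: hash of the two concatenated child hashes.
    return sha256(leaf1 + leaf2)


# Arbitrary payload padded to 32 bytes; it stands in for a public key hash.
data = b"hello world".ljust(32, b"\x00")

# Placeholder bytes shaped like a compressed key, not a real secp256k1 key.
public_key_secp256k1 = bytes.fromhex("02" + "11" * 32)

leafhash1 = data
leafhash2 = sha256(public_key_secp256k1)
R = merkle_root_2(leafhash1, leafhash2)


def verify_commitment(root: bytes, disclosed_leafhash1: bytes,
                      opened_public_key: bytes) -> bool:
    # The verifier hashes the opened key and combines it with the unopened
    # leaf hash; nothing here checks that leafhash1 hides a real key.
    return merkle_root_2(disclosed_leafhash1, sha256(opened_public_key)) == root


print(verify_commitment(R, leafhash1, public_key_secp256k1))
```

The commitment check passes even though leafhash1 was never the hash of any
public key, so the 32 bytes of data end up on chain as part of the attestation.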

> I will admit I don't understand this attack. Can you provide more details on
> how it works, and how it might be possible to mitigate?

To give more context, this attack is intended as a concrete demonstration of how
breaking the collision resistance of the hash function used in the Merkle tree
can enable an adversary to steal coins. Here's a different explanation for
essentially the same attack in the context of P2SH vs. P2WSH:
https://bitcoin.stackexchange.com/a/54847/35586

The attack against the BIP's proposed signature scheme (where the Merkle tree is
constructed from public keys and then an ordinary signature scheme is applied to
one or more of the committed public keys) can be mitigated by using a hash
function with a larger output space (e.g., SHA-512).

However, I'm not suggesting doing this. My point is that while the BIP aims for
256 bits of security by using NIST strength level V parameters, it does not
actually achieve that security level (when the adversary can affect any of the
leaves as in multisignatures, for example).
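
To make the gap concrete, here is a rough sketch based on the generic birthday
bound, under which finding a collision in an n-bit hash costs about 2^(n/2)
evaluations classically. The function name is hypothetical, and the figures
ignore quantum collision-finding algorithms and any structural weakness in a
particular hash function.

```python
def collision_security_bits(output_bits: int) -> int:
    # Generic classical birthday bound: about 2^(n/2) work for an n-bit hash.
    return output_bits // 2


for name, n in [("SHA-256", 256), ("SHA-512", 512)]:
    print(f"{name}: ~{collision_security_bits(n)}-bit collision resistance")
```

Under this estimate, a Merkle tree built on a 256-bit hash offers roughly 128
bits of collision resistance, while a 512-bit hash would reach roughly 256.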

The Bitcoin protocol relies heavily on the collision resistance of SHA-256, which is
pretty much the definition of NIST strength level II [0].

[0] https://csrc.nist.gov/projects/post-quantum-cryptography/post-quantum-cryptography-standardization/evaluation-criteria/security-(evaluation-criteria)