Hi Russell,

Thanks for the response. While drafting my reply I double-checked my work and realized I hadn't addressed all the malleability concerns. I believe I have now (fingers crossed) addressed all points of malleability.

The malleability concerns are as follows:

A TXID is computed as:

def txid(self):
    r = b""
    r += struct.pack("<i", self.nVersion)
    r += ser_vector(self.vin)
    r += ser_vector(self.vout)
    r += struct.pack("<I", self.nLockTime)
    # A TXID is the double SHA-256 of the legacy serialization.
    return sha256(sha256(r))

If the bag hash is just:

def get_bag_hash(self):
    r = b""
    r += ser_vector(self.vout)
    return TaggedHash("BagHash", r)

This allows a few things to change: nVersion, nLockTime, the scriptSig (per input), the number of inputs, and nSequence (per input). Any of these can change the TXID or what the transaction does.

changing nVersion: can disable BIP68, change TXID
changing nLockTime: can change TXID
changing nSequence: can change TXID
changing number of inputs: half spend problem, change TXID
changing scriptSigs: change TXID if co-spent with a legacy input
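
To make the concern concrete, here is a minimal sketch rather than the reference implementation. It assumes a BIP340-style tagged hash stands in for TaggedHash and treats the serialized inputs and outputs as opaque placeholder bytes. The names toy_txid and outputs_only_bag_hash, and the example byte strings, are hypothetical.

```python
import hashlib
import struct


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def tagged_hash(tag: str, msg: bytes) -> bytes:
    # Assumption: BIP340-style tagged hash,
    # SHA256(SHA256(tag) || SHA256(tag) || msg).
    t = sha256(tag.encode("utf-8"))
    return sha256(t + t + msg)


def toy_txid(version: int, vin_ser: bytes, vout_ser: bytes, locktime: int) -> bytes:
    # Legacy serialization: version, inputs, outputs, locktime; double SHA-256.
    r = struct.pack("<i", version) + vin_ser + vout_ser + struct.pack("<I", locktime)
    return sha256(sha256(r))


def outputs_only_bag_hash(vout_ser: bytes) -> bytes:
    # Commits to the serialized outputs and nothing else.
    return tagged_hash("BagHash", vout_ser)


# Placeholder serializations: one input and one output (illustrative only).
VIN = b"\x01" + b"\x00" * 36 + b"\x00" + struct.pack("<I", 0xFFFFFFFF)
VOUT = b"\x01" + struct.pack("<q", 1000) + b"\x01\x51"

txid_a = toy_txid(2, VIN, VOUT, 0)
txid_b = toy_txid(2, VIN, VOUT, 500000)   # only nLockTime differs
bag = outputs_only_bag_hash(VOUT)         # identical for both transactions
```

Both transactions satisfy the same outputs-only bag hash, yet txid_a and txid_b differ, so anything that depends on the TXID cannot be fixed in advance.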

Instead, we can use the following digest:

    def get_bag_hash(self):
        r = b""
        r += struct.pack("<i", self.nVersion)
        r += struct.pack("<I", self.nLockTime)
        r += sha256(b"".join(out.serialize() for out in self.vout))
        r += sha256(b"".join(struct.pack("<I", inp.nSequence) for inp in self.vin))
        r += struct.pack("<Q", len(self.vin))
        for inp in self.vin:
            r += ser_string(inp.scriptSig)
        return TaggedHash("BagHash", r)

which should lock in all the relevant bits. The only part left out is the COutPoint, which can't be known ahead of time because it depends on the creating transaction. Technically, len(vin) is redundant with sha256(b"".join(struct.pack("<I", inp.nSequence) for inp in self.vin)): each nSequence is exactly 4 bytes, so the length padding on the hash implies the number of inputs. I figured it's best to err on the side of being explicit.
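
As a sanity check, here is a self-contained sketch of that digest using toy types. It is an illustration under stated assumptions, not the draft's code: ToyTxIn, ToyTxOut and bag_hash are hypothetical names, the tagged hash is assumed to be BIP340-style, and ser_string is assumed to be a CompactSize length prefix followed by the bytes.

```python
import hashlib
import struct
from dataclasses import dataclass


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def tagged_hash(tag: str, msg: bytes) -> bytes:
    # Assumption: SHA256(SHA256(tag) || SHA256(tag) || msg).
    t = sha256(tag.encode("utf-8"))
    return sha256(t + t + msg)


def ser_string(s: bytes) -> bytes:
    # CompactSize length prefix followed by the raw bytes.
    n = len(s)
    if n < 253:
        prefix = struct.pack("<B", n)
    elif n <= 0xFFFF:
        prefix = b"\xfd" + struct.pack("<H", n)
    elif n <= 0xFFFFFFFF:
        prefix = b"\xfe" + struct.pack("<I", n)
    else:
        prefix = b"\xff" + struct.pack("<Q", n)
    return prefix + s


@dataclass
class ToyTxIn:
    prevout: bytes     # 36-byte outpoint; deliberately not hashed
    script_sig: bytes
    sequence: int


@dataclass
class ToyTxOut:
    value: int
    script_pubkey: bytes

    def serialize(self) -> bytes:
        return struct.pack("<q", self.value) + ser_string(self.script_pubkey)


def bag_hash(version: int, locktime: int, vin: list, vout: list) -> bytes:
    r = struct.pack("<i", version)
    r += struct.pack("<I", locktime)
    r += sha256(b"".join(o.serialize() for o in vout))
    r += sha256(b"".join(struct.pack("<I", i.sequence) for i in vin))
    r += struct.pack("<Q", len(vin))
    for i in vin:
        r += ser_string(i.script_sig)
    return tagged_hash("BagHash", r)
```

With these definitions, two inputs that differ only in prevout give the same digest, while changing nVersion, nLockTime, any nSequence, any scriptSig, or the number of inputs gives a different one.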

A further benefit (in a CISC sense) of committing to all these values is that we enforce CLTV and CSV semantics for free on OP_SECURETHEBAG scripts, which helps with channels.



Treating OP_SECURETHEBAG as a PUSHDATA:

I agree that in theory it's nicer, and I am 100% open to implementing it that way. My only concern is that a flag must be added to GetOp (or GetOp must be modularized per script version) because it affects script parsing. A multibyte opcode that contains a pushdata, by contrast, remains compatible with prior script parsing.

I'd like to get rough consensus on the best approach for compatibility with downstream software, hence choosing this option for the draft.

Personally, my preference is to *not* do flags and instead have a separate parser version that cleans up some of our past sins. We can experiment with a fancier parser (as you've shown in Haskell/Rust/Coq), perhaps even a bitwise Huffman encoding of opcodes to save space on scripts (i.e. the 7 most common opcodes could fit within 3 bits), or whatever else we like. I just didn't want the scope to creep too far on this particular BIP, but I'm with you that lookahead is a hack compared to an actual parametrized argument.
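
To illustrate the 3-bit idea only, here is a rough sketch of a fixed prefix code rather than a true Huffman code. The seven most frequent opcodes get 3-bit codes 000 through 110, and 111 is an escape followed by the full 8-bit opcode. Everything here is hypothetical: the function names, the frequency-based table, and the simplification that a script is a list of opcode integers with no push data.

```python
from collections import Counter

ESCAPE = "111"  # prefix marking a full 8-bit opcode


def build_table(scripts):
    # Give the 7 most frequent opcodes the 3-bit codes 000..110.
    counts = Counter(op for script in scripts for op in script)
    common = [op for op, _ in counts.most_common(7)]
    return {op: format(i, "03b") for i, op in enumerate(common)}


def encode(script, table):
    # Common opcodes cost 3 bits; everything else costs 3 + 8 bits.
    return "".join(
        table[op] if op in table else ESCAPE + format(op, "08b")
        for op in script
    )


def decode(bits, table):
    reverse = {code: op for op, code in table.items()}
    out = []
    i = 0
    while i < len(bits):
        code = bits[i:i + 3]
        i += 3
        if code == ESCAPE:
            out.append(int(bits[i:i + 8], 2))
            i += 8
        else:
            out.append(reverse[code])
    return out


# Opcode bytes for DUP HASH160 EQUALVERIFY CHECKSIG (pushes omitted).
sample = [0x76, 0xA9, 0x88, 0xAC]
table = build_table([sample])
packed = encode(sample, table)
```

Here the four opcodes pack into 12 bits instead of 32, while an opcode outside the table costs 11 bits. A real design would also need to encode pushes and agree on a fixed table.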

I think you'd also appreciate the template script expansion approach mentioned in the BIP -- it gets around some of these concerns, but requires changes to Taproot.