Re: [bitcoin-dev] Annex Purpose Discussion: OP_ANNEX, Turing Completeness, and other considerations

From: Antoine Riard <antoine.riard@gmail.com>
To: Jeremy Rubin <jeremy.l.rubin@gmail.com>,
	 Bitcoin Protocol Discussion
	<bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] Annex Purpose Discussion: OP_ANNEX, Turing Completeness, and other considerations
Date: Sun, 6 Mar 2022 19:59:33 -0500	[thread overview]
Message-ID: <CALZpt+GmX18Nbz5B7JDBP8eSRovhD3y4yjURr5zBO6CpSbpLtw@mail.gmail.com> (raw)
In-Reply-To: <CAD5xwhgXE9sB-hdzz_Bgz6iEA-M5-Yu2VOn1qRzkaq+DdVsgmw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 10543 bytes --]

Hi Jeremy,

> I've seen some discussion of what the Annex can be used for in Bitcoin.
For
> example, some people have discussed using the annex as a data field for
> something like CHECKSIGFROMSTACK type stuff (additional authenticated
data)
> or for something like delegation (the delegation is to the annex). I think
> before devs get too excited, we should have an open discussion about what
> this is actually for, and figure out if there are any constraints to using
> it however we may please.

I think one interesting purpose of the annex is to serve as a transaction
field extension, where we assign new consensus validity rules to the annex
payloads.

One could think about new types of locks, e.g where a transaction inclusion
is constrained before the annex payload value is superior to the chain's
`ChainWork`. This could be useful in case of contentious forks, where you
want your transaction to confirm only when enough work is accumulated, and
height isn't a reliable indicator anymore.

Or a relative-timelock where the endpoint is the presence of a state number
encumbering the spent transaction. This could be useful in the context of
payment pools, where the user withdraw transactions are all encumbered by a
bip68 relative-timelock, as you don't know which one is going to confirm
first, but where you don't care about enforcement of the timelocks once the
contestation delay has played once  and no higher-state update transaction
has confirmed.

Of course, we could reuse the nSequence field for some of those new types
of locks, though we would lose the flexibility of combining multiple locks
encumbering the same input.

Another use for the annex is locating there the SIGHASH_GROUP group count
value. One advantage over placing the value as a script stack item could be
to have annex payloads interdependency validity, where other annex payloads
are reusing the group count value as part of their own semantics.

Antoine

Le ven. 4 mars 2022 à 18:22, Jeremy Rubin via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> a écrit :

> I've seen some discussion of what the Annex can be used for in Bitcoin.
> For example, some people have discussed using the annex as a data field for
> something like CHECKSIGFROMSTACK type stuff (additional authenticated data)
> or for something like delegation (the delegation is to the annex). I think
> before devs get too excited, we should have an open discussion about what
> this is actually for, and figure out if there are any constraints to using
> it however we may please.
>
> The BIP is tight lipped about it's purpose, saying mostly only:
>
> *What is the purpose of the annex? The annex is a reserved space for
> future extensions, such as indicating the validation costs of
> computationally expensive new opcodes in a way that is recognizable without
> knowing the scriptPubKey of the output being spent. Until the meaning of
> this field is defined by another softfork, users SHOULD NOT include annex
> in transactions, or it may lead to PERMANENT FUND LOSS.*
>
> *The annex (or the lack of thereof) is always covered by the signature and
> contributes to transaction weight, but is otherwise ignored during taproot
> validation.*
>
> *Execute the script, according to the applicable script rules[11], using
> the witness stack elements excluding the script s, the control block c, and
> the annex a if present, as initial stack.*
>
> Essentially, I read this as saying: The annex is the ability to pad a
> transaction with an additional string of 0's that contribute to the virtual
> weight of a transaction, but has no validation cost itself. Therefore,
> somehow, if you needed to validate more signatures than 1 per 50 virtual
> weight units, you could add padding to buy extra gas. Or, we might somehow
> make the witness a small language (e.g., run length encoded zeros) such
> that we can very quickly compute an equivalent number of zeros to 'charge'
> without actually consuming the space but still consuming a linearizable
> resource... or something like that. We might also e.g. want to use the
> annex to reserve something else, like the amount of memory. In general, we
> are using the annex to express a resource constraint efficiently. This
> might be useful for e.g. simplicity one day.
>
> Generating an Annex: One should write a tracing executor for a script, run
> it, measure the resource costs, and then generate an annex that captures
> any externalized costs.
>
> -------------------
>
> Introducing OP_ANNEX: Suppose there were some sort of annex pushing
> opcode, OP_ANNEX which puts the annex on the stack as well as a 0 or 1 (to
> differentiate annex is 0 from no annex, e.g. 0 1 means annex was 0 and 0 0
> means no annex). This would be equivalent to something based on <annex
> flag> OP_TXHASH <has annex flag> OP_TXHASH.
>
> Now suppose that I have a computation that I am running in a script as
> follows:
>
> OP_ANNEX
> OP_IF
>     `some operation that requires annex to be <1>`
> OP_ELSE
>     OP_SIZE
>     `some operation that requires annex to be len(annex) + 1 or does a
> checksig`
> OP_ENDIF
>
> Now every time you run this, it requires one more resource unit than the
> last time you ran it, which makes your satisfier use the annex as some sort
> of "scratch space" for a looping construct, where you compute a new annex,
> loop with that value, and see if that annex is now accepted by the program.
>
> In short, it kinda seems like being able to read the annex off of the
> stack makes witness construction somehow turing complete, because we can
> use it as a register/tape for some sort of computational model.
>
> -------------------
>
> This seems at odds with using the annex as something that just helps you
> heuristically guess  computation costs, now it's somehow something that
> acts to make script satisfiers recursive.
>
> Because the Annex is signed, and must be the same, this can also be
> inconvenient:
>
> Suppose that you have a Miniscript that is something like: and(or(PK(A),
> PK(A')), X, or(PK(B), PK(B'))).
>
> A or A' should sign with B or B'. X is some sort of fragment that might
> require a value that is unknown (and maybe recursively defined?) so
> therefore if we send the PSBT to A first, which commits to the annex, and
> then X reads the annex and say it must be something else, A must sign
> again. So you might say, run X first, and then sign with A and C or B.
> However, what if the script somehow detects the bitstring WHICH_A WHICH_B
> and has a different Annex per selection (e.g., interpret the bitstring as a
> int and annex must == that int). Now, given and(or(K1, K1'),... or(Kn,
> Kn')) we end up with needing to pre-sign 2**n annex values somehow... this
> seems problematic theoretically.
>
> Of course this wouldn't be miniscript then. Because miniscript is just for
> the well behaved subset of script, and this seems ill behaved. So maybe
> we're OK?
>
> But I think the issue still arises where suppose I have a simple thing
> like: and(COLD_LOGIC, HOT_LOGIC) where both contains a signature, if
> COLD_LOGIC and HOT_LOGIC can both have different costs, I need to decide
> what logic each satisfier for the branch is going to use in advance, or
> sign all possible sums of both our annex costs? This could come up if
> cold/hot e.g. use different numbers of signatures / use checksigCISAadd
> which maybe requires an annex argument.
>
>
>
> ------------
>
> It seems like one good option is if we just go on and banish the OP_ANNEX.
> Maybe that solves some of this? I sort of think so. It definitely seems
> like we're not supposed to access it via script, given the quote from above:
>
> *Execute the script, according to the applicable script rules[11], using
> the witness stack elements excluding the script s, the control block c, and
> the annex a if present, as initial stack.*
>
> If we were meant to have it, we would have not nixed it from the stack,
> no? Or would have made the opcode for it as a part of taproot...
>
> But recall that the annex is committed to by the signature.
>
> So it's only a matter of time till we see some sort of Cat and Schnorr
> Tricks III the Annex Edition that lets you use G cleverly to get the annex
> onto the stack again, and then it's like we had OP_ANNEX all along, or
> without CAT, at least something that we can detect that the value has
> changed and cause this satisfier looping issue somehow.
>
> Not to mention if we just got OP_TXHASH
>
>
>
> -----------
>
> Is the annex bad? After writing this I sort of think so?
>
> One solution would be to... just soft-fork it out. Always must be 0. When
> we come up with a use case for something like an annex, we can find a way
> to add it back.  Maybe this means somehow pushing multiple annexes and
> having an annex stack, where only sub-segments are signed for the last
> executed signature? That would solve looping... but would it break some
> aggregation thing? Maybe.
>
>
> Another solution would be to make it so the annex is never committed to
> and unobservable from the script, but that the annex is always something
> that you can run get_annex(stack) to generate the annex. Thus it is a hint
> for validation rules, but not directly readable, and if it is modified you
> figure out the txn was cheaper sometime after you execute the scripts and
> can decrease the value when you relay. But this sounds like something that
> needs to be a p2p only annex, because consensus we may not care (unless
> it's something like preallocating memory for validation?).
>
> -----------------------
>
> Overall my preference is -- perhaps sadly -- looking like we should
> soft-fork it out of our current Checksig (making the policy that it must 0
> a consensus rule) and redesign the annex technique later when we actually
> know what it is for with a new checksig or other mechanism. But It's not a
> hard opinion! It just seems like you can't practically use the annex for
> this worklimit type thing *and* observe it from the stack meaningfully.
>
>
>
> Thanks for coming to my ted-talk,
>
> Jeremy
>
>
> --
> @JeremyRubin <https://twitter.com/JeremyRubin>
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>

[-- Attachment #2: Type: text/html, Size: 17677 bytes --]