(The subject was "[bitcoin-dev] Version 1 witness programs (first draft)", but I'm moving part of that conversation to this thread.)

On Sun, Oct 1, 2017 at 5:32 PM, Johnson Lau <jl2012@xbt.hk> wrote:
3. Do we want to allow static analysis of sigop?
BIP114 and the related proposals are specifically designed to allow static analysis of sigop. I think this was one of the main reason of OP_EVAL not being accepted. This was also the main reason of Ethereum failing to do a DAO hacker softfork, leading to the ETH/ETC split. I’m not sure if we really want to give up this property. Once we do it, we have to support it forever.

I would very much like to retain the ability to do static analysis.  More generally, the idea of interpreting arbitrary data as code, as done in OP_EVAL and in TAILCALL, makes me quite anxious.  This is at the root of many security problems throughout the software industry, and I don't relish giving more fuel to the underhanded Bitcoin Script contestants.
 
On Sun, Oct 1, 2017 at 8:45 PM, Luke Dashjr <luke@dashjr.org> wrote:
> 3. Do we want to allow static analysis of sigop?
> BIP114 and the related proposals are specifically designed to allow static
> analysis of sigop. I think this was one of the main reason of OP_EVAL not
> being accepted. This was also the main reason of Ethereum failing to do a
> DAO hacker softfork, leading to the ETH/ETC split. I’m not sure if we
> really want to give up this property. Once we do it, we have to support it
> forever.

It seems inevitable at this point. Maybe we could add a separate "executable-
witness" array (in the same manner as the current witness was softforked in),
and require tail-call and condition scripts to merely reference these by hash,
but I'm not sure it's worth the effort?

Thinking further, we could avoid adding a separate executable-witness
commitment by either:
A) Define that all the witness elements in v1 are type-tagged (put the minor
   witness version on them all, and redefine minor 0 as a stack item?); or
B) Use an empty element as a delimiter between stack and executable items.

To avoid witness malleability, the executable items can be required to be
sorted in some manner.

The downside of these approaches is that we now need an addition 20 or 32
bytes per script reference... which IMO may possibly be worse than losing
static analysis. I wonder if there's a way to avoid that overhead?

Actually, I have a half-baked idea I've been thinking about along these lines.

The idea is to add a flag to each stack item in the Script interpreter to mark whether the item in the stack is "executable" or "non-executable", not so different from how computers mark pages to implement executable space protection.  By default, all stack items are marked "non-executable".  We then redefine OP_PUSHDATA4 as OP_PUSHCODE within ScriptSigs.  The operational semantics of OP_PUSHCODE would remain the same as OP_PUSHDATA4 except it would set the pushed item's associated flag to "executable".  All data pushed by OP_PUSHCODE would be subject to the sigops limits and any other similar static analysis limits.
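To make the flag concrete, here is a rough Python sketch of the idea, not a proposal for actual interpreter code.  The names StackItem and push are made up for illustration, and the in_script_sig parameter is just my stand-in for "which script is currently being evaluated".

```python
from dataclasses import dataclass

OP_PUSHDATA1 = 0x4c
OP_PUSHDATA4 = 0x4e  # would be reinterpreted as OP_PUSHCODE inside a scriptSig


@dataclass(frozen=True)
class StackItem:
    """Hypothetical stack item: the raw bytes plus the proposed flag."""
    data: bytes
    executable: bool = False  # every item is non-executable by default


def push(stack, data, *, opcode, in_script_sig):
    """Push data; only OP_PUSHDATA4 inside a scriptSig marks it executable."""
    is_code = in_script_sig and opcode == OP_PUSHDATA4
    stack.append(StackItem(data, executable=is_code))


stack = []
push(stack, b"\x51", opcode=OP_PUSHDATA4, in_script_sig=True)   # acts as OP_PUSHCODE
push(stack, b"\x51", opcode=OP_PUSHDATA4, in_script_sig=False)  # ordinary data push
push(stack, b"\x51", opcode=OP_PUSHDATA1, in_script_sig=True)   # ordinary data push
```

Only the first push ends up flagged "executable"; the same bytes pushed any other way stay plain data.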

Segwit v0 doesn't use OP_PUSHDATA codes to create the input stack, so we would have to mark executable input stack items using a new witness v1 format. But, IIUC, TAILCALL isn't going to be compatible with Segwit v0 anyway.

During a TAILCALL, the top item on the stack is required to have the "executable" flag; otherwise TAILCALL is not used (and the script succeeds or fails based on the top item's data value as usual).
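As a sketch of that end-of-script decision (again in Python, with hypothetical names; any other preconditions the TAILCALL proposal places on the stack are deliberately left out, and is_truthy is a simplified rendering of Script's usual boolean cast):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StackItem:
    """Hypothetical stack item: the raw bytes plus the proposed flag."""
    data: bytes
    executable: bool = False


def is_truthy(data: bytes) -> bool:
    """Any nonzero byte means true, except a lone sign bit in the final
    byte (negative zero), which is false."""
    for i, byte in enumerate(data):
        if byte != 0:
            return not (i == len(data) - 1 and byte == 0x80)
    return False


def end_of_script(stack):
    """Return ("tailcall", code) or ("done", success) for a finished script."""
    if not stack:
        return ("done", False)
    top = stack[-1]
    if top.executable:
        return ("tailcall", top.data)       # flagged item: run it as code
    return ("done", is_truthy(top.data))    # unflagged item: judge it as data
```

The point is that the same bytes lead to different outcomes depending only on the flag, never on how the bytes were computed.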

All other operations can treat "executable" items as data, including the merkle branch verification.  None of the Script operations can create "executable" items; in particular, OP_PUSHDATA4 within the ScriptPubKey also would not create "executable" items.  (We can talk about the behaviour of OP_CAT when that time comes).

One last trick is that when "executable" values are duplicated by OP_DUP, OP_IFDUP, or OP_PICK, the newly created copy of the value on top of the stack is marked "non-executable".
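A minimal sketch of the duplication rule, with made-up helper names.  For brevity op_pick takes its index as a Python argument rather than popping it from the stack as real Script does, and OP_IFDUP would be handled the same way as op_dup.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class StackItem:
    """Hypothetical stack item: the raw bytes plus the proposed flag."""
    data: bytes
    executable: bool = False


def op_dup(stack):
    """Copy the top item; the copy never inherits the "executable" flag."""
    stack.append(replace(stack[-1], executable=False))


def op_pick(stack, n):
    """Copy the item n places below the top (n = 0 is the top itself)."""
    stack.append(replace(stack[-1 - n], executable=False))


stack = [StackItem(b"code", executable=True), StackItem(b"data")]
op_pick(stack, 1)  # copies the executable item from the bottom
op_dup(stack)      # copies the copy
executable_count = sum(item.executable for item in stack)
```

The original item keeps its flag, so the number of "executable" items on the stack is the same before and after the copies.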

Because we make the "executable" flag non-copyable, we are now free to allow unbounded uses of TAILCALL (i.e. TAILCALL can be used multiple times in a single input).  Why is this safe?  Because the number of "executable" items decreases by at least one every time TAILCALL is invoked.  Thus the number of OP_PUSHCODE occurrences in the witness puts an upper bound on the number of invocations of TAILCALL allowed.  Using static analysis of the script pubkey and the data pushed by OP_PUSHCODE, we can compute an upper bound on the number of operations (of any type) that can occur during execution.
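Here is roughly what I have in mind for that static analysis, as a Python sketch under simplifying assumptions: every opcode (pushes included) counts as one operation, each OP_PUSHCODE item is executed at most once, and the function names are my own invention.  A real limit would presumably weight sigops and other costly operations separately.

```python
PUSH_LENGTH_WIDTH = {0x4c: 1, 0x4d: 2, 0x4e: 4}  # OP_PUSHDATA1 / 2 / 4


def iter_opcodes(script: bytes):
    """Yield each opcode in a script, skipping over push payloads."""
    i = 0
    while i < len(script):
        op = script[i]
        i += 1
        if 0x01 <= op <= 0x4b:            # direct push of `op` bytes
            i += op
        elif op in PUSH_LENGTH_WIDTH:     # length-prefixed push
            width = PUSH_LENGTH_WIDTH[op]
            size = int.from_bytes(script[i:i + width], "little")
            i += width + size
        if i > len(script):
            raise ValueError("truncated push")
        yield op


def static_bounds(script_pubkey: bytes, pushcode_items: list):
    """Return (max TAILCALL invocations, max operations executed)."""
    max_tailcalls = len(pushcode_items)   # each TAILCALL consumes one flagged item
    max_ops = sum(1 for _ in iter_opcodes(script_pubkey))
    for code in pushcode_items:
        max_ops += sum(1 for _ in iter_opcodes(code))
    return max_tailcalls, max_ops


script_pubkey = bytes([0x76, 0xa9, 0x02, 0xaa, 0xbb, 0x87])
pushcode_items = [bytes([0x51]), bytes([0x4c, 0x01, 0xff, 0xac])]
bounds = static_bounds(script_pubkey, pushcode_items)
```

Both bounds are computed from the transaction's bytes alone, without running any script, which is the property we want to keep.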

Unbounded TAILCALL should let us (in the presence of OP_CHECKSIGFROMSTACK) have unbounded delegation.

Overall, I believe that OP_PUSHCODE:

1. is fully backwards compatible.
2. maintains our ability to perform static analysis with TAILCALL.
3. never lets us interpret computed values as executable code.
4. extends TAILCALL to safely allow multiple TAILCALLs per script.