On Wed, May 22, 2019 at 10:06 PM Pieter Wuille <pieter.wuille@gmail.com> wrote:
On Tue, 21 May 2019 at 10:20, Russell O'Connor <roconnor@blockstream.io> wrote:
>
> Regarding Tapscript, the specification calls for the final value of the stack being a single non-false value:
>
>> The tapscript is executed according to the rules in the following section, with the initial stack as input
>>     II. If the execution results in anything but exactly one element on the stack which evaluates to true with CastToBool(), fail.
>
> Perhaps it is worth taking this opportunity here to remove a minor wart of the Script language and instead require the stack to be exactly empty upon completion.
>
> In addition to removing a potential malleability vector, I expect it would simplify development of Bitcoin Script.  A rule requiring an empty stack means that the conjunction (logical and) of two policies can be implemented by the simple concatenation of Bitcoin Scripts.  This combined with the taproot ability to form the disjunction (logical or) of policies by having multiple Merkle branches, means that the translation of a policy written in disjunctive normal form (the logical ors of logical ands of primitive policies) can be straightforwardly translated to a taproot of tapscript.
>
> That said, I think the developers of miniscript <http://bitcoin.sipa.be/miniscript/miniscript.html> are in a much better position to comment on whether my above intuition is correct given that they've had to implement a host of various calling conventions.  I understand that at least some of this complexity is due to Bitcoin Script's one element stack rule.

IIRC I looked into this a few months ago, and found that the spending
cost (script size + expected witness size) of the optimal script for
every Miniscript policy at most changes by 1 WU (in either direction)
by requiring an empty stack rather than a true value, though in a
(admittedly very arbitrarily biased) distribution, more policies were
improved by it than degraded. This is not taking Taproot into account
(as many of those policies in a Taproot-supporting world should
optimally make use of the internal key and Merkle tree, rather than
turning everything into a monolithic script). I expect that this may
make the impact somewhat larger, but still never more than a 1 WU
gain.

I don't think the spending cost changes justify this change, so the
remaining criteria are complexity ones. In my view, the main benefit
would be to authors of hand-written scripts where the consistency
benefits matter, but this needs to be weighed against the mental cost
of learning the new semantics. For Miniscript itself, this doesn't
make much difference - the top level calling convention would become
'V' instead of 'T', but 'T' would still exist as a calling convention
that may be used internally; it's a few lines change.

So overall this feels like something with marginal costs, but also at
most marginal benefits. Perhaps other people have stronger opinions.

Thanks for the info.  I'm surprised to learn that 'T' would still exist internally.  That does make my proposed ammendment a somewhat more marginal than I expected.  I still think it would be an improvement, but I guess it is acceptable the way it is if that is what other people prefer.
 
> Even if we choose not to implement the empty stack rule, we should at least require that the last element be 0x01 to remove a potential malleability vector and bring it in line with MINIMAL_IF semantics.

This feels like the right thing to do; as we're making MINIMALIF part
of consensus for Tapscript it would make sense to apply the same rule
to the "return" value of the script. There is a downside though,
namely that in some places where you'd use "<n>
OP_CHECKSEQUENCEVERIFY" or "<n> OP_CHECKLOCKTIMEVERIFY" you now need
to add an additional OP_0NOTEQUAL to convert the left-over element n
into an exact 0x01. I also can't come up with any practical benefits
that this would have; if the top stack element in a particular code
path comes directly from the input, it's insecure regardless; if there
isn't, it'll generally be a a boolean (or an intentional non-boolean
true value) computed by the script.

That is a very good argument.  If we were to go with an empty stack we'd probably also want modify to have CSV and CLTV pop their inputs off the stack.  But at that point perhaps we'd want to change their opcode values to avoid confusion with old style script.  I guess I'm getting more convinced to not touch this stuff just and just bear with the somewhat unfortunate legacy behaviour.