* [bitcoin-dev] bitcoin scripting and lisp
From: Anthony Towns @ 2022-03-04 1:04 UTC (permalink / raw)
To: Bitcoin Protocol Discussion
On Sun, Feb 27, 2022 at 04:34:31PM +0000, ZmnSCPxj via bitcoin-dev wrote:
> In reaction to this, AJ Towns mailed me privately about some of his
> thoughts on this insane `OP_EVICT` proposal.
> He observed that we could generalize the `OP_EVICT` opcode by
> decomposing it into smaller parts, including an operation congruent
> to the Scheme/Haskell/Scala `map` operation.
At much the same time Zman was thinking about OP_FOLD and in exactly the
same context, I was wondering what the simplest possible language that
had some sort of map construction was -- I mean simplest in a "practical
engineering" sense; I think Simplicity already has the Euclidean/Peano
"least axioms" sense covered.
The thing that's most appealing to me about bitcoin script as it stands
(beyond "it works") is that it's really pretty simple in an engineering
sense: it's just a "forth" like system, where you put byte strings on a
stack and have a few operators to manipulate them. The alt-stack, and
supporting "IF" and "CODESEPARATOR" add a little additional complexity,
but really not very much.
To level-up from that, instead of putting byte strings on a stack, you
could have some other data structure than a stack -- eg one that allows
nesting. Simple ones that come to mind are lists of (lists of) byte
strings, or a binary tree of byte strings [0]. Both those essentially
give you a lisp-like language -- lisp is obviously all about lists,
and a binary tree is just made of things or pairs of things, and pairs
of things are just another way of saying "car" and "cdr".
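To make the pairs idea concrete, here's a minimal Python sketch (plain
Python, not clvm; the helper names just mirror the lisp ones) of how byte
strings plus pairs give you both lists and a binary tree:

    def cons(a, b): return (a, b)    # make a pair ("c")
    def car(p): return p[0]          # first of a pair ("f")
    def cdr(p): return p[1]          # rest of a pair ("r")

    def make_list(*items):
        out = b""                    # the empty atom terminates a list
        for item in reversed(items):
            out = cons(item, out)
        return out

    xs = make_list(b"a", b"b", b"c")            # (a . (b . (c . nil)))
    assert car(xs) == b"a" and car(cdr(xs)) == b"b"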
A particular advantage of lisp-like approaches is that they treat code
and data exactly the same -- so if we're trying to leave the option open
for a transaction to supply some unexpected code on the witness stack,
then lisp handles that really naturally: you were going to include data
on the stack anyway, and code and data are the same, so you don't have
to do anything special at all. And while I've never really coded in
lisp at all, my understanding is that its biggest problems are all about
doing things efficiently at large scales -- but script's problem space
is for very small scale things, so there's at least reason to hope that
any problems lisp might have won't actually show up for this use case.
So, to me, that seemed like something worth looking into...
After looking into it, I actually think chia lisp [1] gets pretty much all
the major design decisions pretty much right. There are obviously a few
changes needed given the differences in design between chia and bitcoin:
- having secp256k1 signatures (and curve operations), instead of
BLS12-381 ones
- adding tx introspection instead of having bundle-oriented CREATE_COIN,
and CREATE/ASSERT results [10]
and there are a couple of other things that could maybe be improved
upon:
- serialization seems to be a bit verbose -- 100kB of serialized clvm
code from a random block gzips to 60kB; optimising the serialization
for small lists, and perhaps also for small literal numbers might be
a feasible improvement; though it's not clear to me how frequently
serialization size would be the limiting factor for cost versus
execution time or memory usage.
- I don't think execution costing takes into account how much memory
is used at any one time, just how much was allocated in total; so
the equivalent of (OP_DUP OP_DROP OP_DUP OP_DROP ..) only has the
allocations accounted for, with no discount given for the immediate
freeing, so it gets treated as having the same cost as (OP_DUP
OP_DUP .. OP_DROP OP_DROP ..). Doing it that way would be worse
than how bitcoin script is currently costed, but doing better might
mean locking in an evaluation method at the consensus level. Seems
worth looking into, at least; a rough sketch of the difference follows
below.
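As a rough illustration (plain Python, with made-up numbers): total-allocation
costing charges both patterns identically, while a peak-memory measure would
distinguish them.

    def run(ops, item_size=100):
        live = peak = total = 0
        for op in ops:
            if op == "DUP":
                live += item_size            # another copy becomes live
                total += item_size           # total-allocation cost goes up
                peak = max(peak, live)       # track the high-water mark
            elif op == "DROP":
                live -= item_size            # the copy is freed immediately
        return total, peak

    print(run(["DUP", "DROP"] * 1000))            # (100000, 100)
    print(run(["DUP"] * 1000 + ["DROP"] * 1000))  # (100000, 100000)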
But otherwise, it seems a pretty good match.
I think you'd need about 40 opcodes to match bitcoin script and (roughly)
chia lisp, something like:
q - quote
a - apply
x - exception / immediately fail (OP_RETURN style)
i - if/then/else
softfork - upgradability
not, all, any - boolean logic
bitand, bitor, bitxor, bitnot, shift - bitwise logic
= - bitwise equality
>, +, -, *, /, divmod - (signed, bignum) arithmetic
ashift - arithmetic shift (sign extended)
>s - string comparison
strlen, substr, concat - string ops
f, r, c, l - list ops (head, tail, make a list, is this a list?)
sha256 - hashing
numequal - arithmetic equal, equivalent to (= (+ a 0) (+ b 0))
ripemd160, hash160, hash256 - more hashing
bip342-txmsg - given a sighash byte, construct the bip342 message
bip340-verify - given a pubkey, message, and signature bip340 verify it
tx - get various information about the tx
taproot - get merkle path/internalpubkey/program/annex information
ecdsa - same as bip340-verify, except for traditional ecdsa?
secp256k1-muladd - given (a B C) where B,C are points, calculate a*B+C?
That compares to about 60 (non-disabled) opcodes in current script.
Pretty much all the opcodes in the first section are directly from chia
lisp, while all the rest are to complete the "bitcoin" functionality.
The last two are extensions that are more food for thought than a real
proposal.
Using a lisp-style approach seems an improvement in general to me.
For example, rather than the streaming-sha256 approach in Elements,
where you could write:
"a" SHA256INITIALIZE
"b" SHA256UPDATE
"c" SHA256UPDATE
"d" SHA256FINALIZE
to get the sha256 of "abcd" without having to CAT them first (important
if they'd potentially overflow the 520B stack item limit), in chia lisp
you write:
(sha256 "a" "b" "c" "d")
which still has the benefit of streaming the inputs into the function,
but only adds a single opcode, doesn't involve representing the internal
sha256 midstate on the stack, and generally seems easier to understand,
at least to me.
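For comparison, here's roughly what that variadic opcode is doing, sketched
in Python with hashlib (not the actual implementation): the arguments are
streamed into a single hash context rather than being concatenated into one
oversized stack item.

    import hashlib

    def op_sha256(*args: bytes) -> bytes:
        ctx = hashlib.sha256()
        for a in args:
            ctx.update(a)                 # stream each argument in turn
        return ctx.digest()

    assert op_sha256(b"a", b"b", b"c", b"d") == hashlib.sha256(b"abcd").digest()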
As another example, following the traditional functional "tail recursion
instead of for-loops" approach, doing CHECKMULTISIG might become
something like:
    (defun checksig (sig key)
        (bip340-verify key (bip342-txmsg (r sig)) (f sig)))

    (defun checkmultisig (sigs keys k)
        (if (= k 0)
            1
            (if (l sigs)
                (if (checksig (f sigs) (f keys))
                    (checkmultisig (r sigs) (r keys) (- k 1))
                    (checkmultisig sigs (r keys) k)
                )
                0
            )
        )
    )
Here each "sig" is a pair of a 64B bip340 signature and a 1B sighash;
instead of a 65B string combining both, and sigs, keys are lists, and k
is the number of successful signature checks you're requiring for
success.
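For comparison, the same recursion transliterated into plain Python, with
the signature check abstracted out (the lambda in the usage line is just a
toy stand-in for the actual bip340-verify/bip342-txmsg check):

    def checkmultisig(checksig, sigs, keys, k):
        if k == 0:
            return 1                 # enough signatures have verified
        if not sigs:
            return 0                 # ran out of signatures before reaching k
        if checksig(sigs[0], keys[0]):
            return checkmultisig(checksig, sigs[1:], keys[1:], k - 1)
        return checkmultisig(checksig, sigs, keys[1:], k)   # skip this key

    # toy check: a "signature" is valid iff it equals the key
    assert checkmultisig(lambda s, k: s == k, ["B", "C"], ["A", "B", "C"], 2) == 1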
Of course, "defun" and "if" aren't listed as opcodes above; instead you
have a compiler that gives you nice macros like defun and translates them
into correct uses of the "a" opcode, etc. As I understand it, those sort
of macros and translations are pretty well understood across lisp-like
languages, and, of course, they're already implemented for chia lisp.
I think with the "tx" opcode defined similarly to how Rusty suggested it
[2] you could implement OP_CTV-like behaviours in similar way, and also
replace "bip342-txmsg" with your own code to generate SIGHASH_ANYPREVOUT
or SIGHASH_GROUP style messages to sign. (This would mean also being able
to pull information about the utxo being spent to obtain its amount and
scriptpubkey, which are committed to with ANYPREVOUT. If it was also able
to obtain the "is_coinbase" flag, that might allow you a more accurate
covenant-based implementation of drivechains...)
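The core of an OP_CTV-like check is small; a hedged Python sketch, assuming
a hypothetical tx introspection opcode that can hand the script some agreed
serialization of the spending transaction's outputs:

    import hashlib

    def ctv_like(commitment: bytes, serialized_outputs: bytes) -> bool:
        # the scriptPubKey commits to `commitment`; the spend is valid only
        # if the transaction's outputs hash to exactly that value
        return hashlib.sha256(serialized_outputs).digest() == commitment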
Likewise, with the "taproot" opcode defined in a way that lets you extract
out the internal public key and merkle path, I think you could implement
OP_TLUV and OP_EVICT with a similar recursive approach.
There's two ways to think about upgradability here; if someday we want
to add new opcodes to the language -- perhaps something to validate zero
knowledge proofs or calculate sha3 or use a different ECC curve, or some
way to support cross-input signature aggregation, or perhaps it's just
that some snippets are very widely used and we'd like to code them in
C++ directly so they validate quicker and don't use up as much block
weight. One approach is to just define a new version of the language
via the tapleaf version, defining new opcodes however we like.
The other is to use the "softfork" opcode -- chia defines it as:
(softfork cost code)
though I think it would probably be better if it were
(softfork cost version code)
where the idea is that "code" will use the "x" opcode if there's a
problem, and anyone supporting the "version" softfork can verify that
there aren't any problems at a cost of "cost". However, whether you
do or don't support that softfork, as far as the rest of the script is
concerned, the expression will either fail entirely or evaluate as zero;
so anyone who doesn't support the softfork can just replace it with zero
and continue on, treating it as if it had costed "cost" units.
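In pseudo-implementation terms (a Python sketch, all names hypothetical),
that behaviour might look something like:

    class ScriptFailure(Exception):
        pass

    def eval_softfork(cost, version, code, budget, known_versions, run):
        budget["used"] += cost           # everyone pays the declared cost
        if version in known_versions:
            run(code)                    # upgraded nodes execute the body;
                                         # it raises ScriptFailure on "x"
        return b""                       # the expression itself is just zero

    budget = {"used": 0}
    print(eval_softfork(50, 2, b"<code>", budget, known_versions=set(), run=None))
    print(budget)                        # {'used': 50}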
One thing worth noting: "softfork" behaves more like OP_NOP than
tapscript's OP_SUCCESS -- I think it's just not possible in general to
have OP_SUCCESS-like behaviour if you're trying to allow accepting code
from the witness data -- otherwise as soon as you reveal that your script
does accept arbitrary code supplied by the spender, someone could stick
in an OP_SUCCESS code, and remove all the restrictions on spending and
steal your funds.
To me, it seems like chia lisp is a better answer to the problem here
than the Simplicity language. Simplicity is complicated in a few ways:
- it defines over 100 jets (plus another 6 for sha3, and another 45 for
individual libsecp256k1 functions) that need to be implemented
natively to efficiently track consensus [3]
- as far as I know, how to soft-fork in new jets isn't yet well
established. I think the ideal is that you just write everything in
raw simplicity, and either the interpreter has a jet and does things
quickly, or doesn't, and gets the same result much more slowly [4]. But
that approach doesn't seem compatible with maintaining consensus,
when "slowly" can be slower by more than 6 orders of magnitude [5].
- to understand what's going on with a smart contract, you need to
understand both simplicity (to define the program) and the bit machine
(to follow how it's computed), both of which are fairly novel --
and if nobody's directly coding in simplicity, which seems likely,
you're adding a third layer on top of that: you want to understand
what the programmer asked for (the source), what is included in the
transaction (the simplicity code) and how that's executed by consensus
code (via the bit machine).
There's a branch for enabling simplicity on elements/liquid [6]
to see what enabling simplicity might involve in concrete terms; it
just... doesn't seem at all simple in practice to me. I can't see how
you'd reasonably translate that simplicity code into a BIP series that
anyone could understand, or how you'd do a thorough review of all the
changes...
By contrast, chia lisp has fewer opcodes than Simplicity's jets, has
feasible approaches to low-impact soft forks to increase functionality,
can be used with only two levels of abstraction (lisp with macros and
the opcodes-only vm level) that seem not too bad to understand, and
(in my opinion) doesn't seem too hard to implement/maintain reasonably.
On the other hand, Simplicity's big advantage over *everything* else
is in formal verification. But I'm not really seeing why you couldn't
preserve that advantage by writing simplicity definitions for the "lisp"
opcodes [7], so that you can "compile" the lisp programs to simplicity,
and then verify them however you like.
One of the things people sometimes claim about bitcoin as an asset,
is that it's got both the advantage of having been first to market,
but also that if some altcoin comes along with great new ideas, then
those ideas can just be incorporated into bitcoin too, so bitcoin can
preserve its lead even from innovators. Granted, I've only really been
looking at chia lisp for a bit over a week, but it really seems to me
like a case where it might be worth putting that philosophy into practice.
If we were to adopt this, obviously we shouldn't call it "chia lisp"
anymore, since it wouldn't work the same in important ways. But since
the program would be encoded as a binary-tree of car/cdr pairs, maybe
we could call it "binary-tree coded script", or "btc-script" for short...
PS: related tweets: [8]
Cheers,
aj
[0] You could also allow things to be pushed onto the stack that
(recursively) can push things onto the stack -- the language "Joy"
takes this approach. It seems to end up equivalent to doing things
in a list oriented way to me.
[1] https://chialisp.com/docs/ref/clvm
[2] https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-February/019871.html
[3] https://raw.githubusercontent.com/ElementsProject/simplicity/pdf/Simplicity-TR.pdf
[4] https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-November/015244.html
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-October/015227.html
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2017-October/015228.html
[5] https://medium.com/blockstream/simplicity-jets-release-803db10fd589
[6] https://github.com/ElementsProject/elements/compare/simplicity
[7] At least I think so? chia lisp does multibyte math, that is "+"
can accept "arbitrary" length bytestrings, which it interprets as
numbers and adds together; whereas Simplicity requires finite types.
I think you could decide "a program that only costs X can't have any
bytestrings greater than length k*X" and construct a finite type up to
that length, maybe? So long as you're only using it for verification,
maybe that stays feasible? Or perhaps you could arbitrarily limit
the strings to a max of 520 bytes at a consensus level, and the
corresponding Simplicity types to 4160 bits and go from there?
[8] https://twitter.com/brian_trollz/status/1499048316956549123
https://twitter.com/jb55/status/1499045998315724801
[9] Oops, out of order footnotes. Anyway...
[10] [9] The CREATE/ASSERT bundling stuff is interesting; and could be
used to achieve functionality like the "transaction sponsorship"
stuff. It doesn't magically solve the issues with maintaining the
mempool and using that to speed up block acceptance, though, and
the chia chain has apparently suffered from mempool-flooding attacks
recently [11] so I don't think they've solved the broader problem,
and thus I think it still makes more sense to stick with bitcoin's
current model here.
[11] https://thechiaplot.net/2021/11/03/interview-with-the-chia-dust-stormer/
https://github.com/Chia-Network/post-mortem/blob/main/post-mortem.md
* Re: [bitcoin-dev] bitcoin scripting and lisp
From: ZmnSCPxj @ 2022-03-04 23:10 UTC (permalink / raw)
To: Anthony Towns, Bitcoin Protocol Discussion
Good morning aj,
> On Sun, Feb 27, 2022 at 04:34:31PM +0000, ZmnSCPxj via bitcoin-dev wrote:
>
> > In reaction to this, AJ Towns mailed me privately about some of his
> > thoughts on this insane `OP_EVICT` proposal.
> > He observed that we could generalize the `OP_EVICT` opcode by
> > decomposing it into smaller parts, including an operation congruent
> > to the Scheme/Haskell/Scala `map` operation.
>
> At much the same time Zman was thinking about OP_FOLD and in exactly the
> same context, I was wondering what the simplest possible language that
> had some sort of map construction was -- I mean simplest in a "practical
> engineering" sense; I think Simplicity already has the Euclidean/Peano
> "least axioms" sense covered.
>
> The thing that's most appealing to me about bitcoin script as it stands
> (beyond "it works") is that it's really pretty simple in an engineering
> sense: it's just a "forth" like system, where you put byte strings on a
> stack and have a few operators to manipulate them. The alt-stack, and
> supporting "IF" and "CODESEPARATOR" add a little additional complexity,
> but really not very much.
>
> To level-up from that, instead of putting byte strings on a stack, you
> could have some other data structure than a stack -- eg one that allows
> nesting. Simple ones that come to mind are lists of (lists of) byte
> strings, or a binary tree of byte strings [0]. Both those essentially
> give you a lisp-like language -- lisp is obviously all about lists,
> and a binary tree is just made of things or pairs of things, and pairs
> of things are just another way of saying "car" and "cdr".
>
> A particular advantage of lisp-like approaches is that they treat code
> and data exactly the same -- so if we're trying to leave the option open
> for a transaction to supply some unexpected code on the witness stack,
> then lisp handles that really naturally: you were going to include data
> on the stack anyway, and code and data are the same, so you don't have
> to do anything special at all. And while I've never really coded in
> lisp at all, my understanding is that its biggest problems are all about
> doing things efficiently at large scales -- but script's problem space
> is for very small scale things, so there's at least reason to hope that
> any problems lisp might have won't actually show up for this use case.
I heartily endorse LISP --- it has a trivial implementation of `eval` that is easily implementable once you have defined a proper data type in preferred-language-here to represent LISP datums.
Combine it with your idea of committing to a max-number-of-operations (which increases the weight of the transaction) and you may very well have something viable.
(In particular, even though `eval` is traditionally (re-)implemented in LISP itself, the limit on max-number-of-operations means any `eval` implementation within the same language is also forcibly made total.)
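A toy illustration of that point (plain Python, nothing like real clvm): an
evaluator that decrements a shared operation budget on every step is
trivially total, because it either finishes or runs out of budget.

    class OutOfOps(Exception):
        pass

    def eval_node(node, env, budget):
        if budget[0] <= 0:
            raise OutOfOps()
        budget[0] -= 1                   # every evaluation step costs one op
        if isinstance(node, bytes):
            return node                  # atoms evaluate to themselves here
        op, args = node[0], node[1:]
        vals = [eval_node(a, env, budget) for a in args]
        return env[op](*vals)            # apply the operator to its arguments

    env = {b"concat": lambda *xs: b"".join(xs)}
    print(eval_node((b"concat", b"a", (b"concat", b"b", b"c")), env, [10]))  # b'abc'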
Of note is that the supposed "problem at scale" of LISP is, as I understand it, due precisely to its code and data being homoiconic to each other.
This homoiconicity greatly tempts LISP programmers to use macros, i.e. programs that generate other programs from some input syntax.
Homoiconicity means that one can manipulate code just as easily as the data, and thus LISP macros are a trivial extension on the language.
This allows each LISP programmer to just code up a macro to expand common patterns.
However, each LISP programmer then ends up implementing *different*, but *similar* macros from each other.
Unfortunately, programming at scale requires multiple programmers speaking the same language.
Then programming at scale is hampered because each LISP programmer has their own private dialect of LISP (formed from the common LISP language and from their own extensive set of private macros) and intercommunication between them is hindered by the fact that each one speaks their own private dialect.
Some LISP-like languages (e.g. Scheme) have classically targeted a "small" subset of absolutely-necessary operations, and each implementation of the language immediately becomes a new dialect due to having slightly different forms for roughly the same convenience function or macro, and *then* individual programmers build their own private dialect on top.
For Scheme specifically, R7RS has targeted providing a "large" standard as well, as did R6RS (which only *had* a "large" standard), but individual Scheme implementations have not always liked to implement *all* the "large" standard.
Otherwise, every big C program contains a half-assed implementation of half of Common LISP, so ----
> - I don't think execution costing takes into account how much memory
> is used at any one time, just how much was allocated in total; so
> the equivalent of (OP_DUP OP_DROP OP_DUP OP_DROP ..) only has the
> allocations accounted for, with no discount given for the immediate
> freeing, so it gets treated as having the same cost as (OP_DUP
> OP_DUP .. OP_DROP OP_DROP ..). Doing it that way would be a worse
> than how bitcoin script is currently costed, but doing better might
> mean locking in an evaluation method at the consensus level. Seems
> worth looking into, at least.
This may depend on the language that the interpreter is written in.
For example, on a typical GC language, both the N * `OP_DUP OP_DROP` and the N * `OP_DUP` + N * `OP_DROP` will have similar behavior when allocated at the nursery.
Since the GC nursery acts as a large buffer of potential allocations, the amount of work done in both cases would be the same, at least until the number of allocs exceeds the nursery size.
Alternately, the implementation may use immutable byte vectors, in which case `OP_DUP` is just a pointer copy.
Or alternately the implementation may use copy-on-write byte vectors, in which case `OP_DUP` is just a pointer copy plus refcount increment, and `OP_DROP` is just a refcount decrement, and the amount of memory used remains small.
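A quick Python illustration of the copy-is-just-a-reference point (not any
particular interpreter): with immutable byte strings, DUP/DROP pairs never
hold more than one extra reference at a time.

    stack = [b"x" * 1000]
    for _ in range(1000):
        stack.append(stack[-1])   # OP_DUP: another reference, no byte copy
        stack.pop()               # OP_DROP: the extra reference goes away
    assert len(stack) == 1        # the memory high-water mark stayed tiny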
> There's two ways to think about upgradability here; if someday we want
> to add new opcodes to the language -- perhaps something to validate zero
> knowledge proofs or calculate sha3 or use a different ECC curve, or some
> way to support cross-input signature aggregation, or perhaps it's just
> that some snippets are very widely used and we'd like to code them in
> C++ directly so they validate quicker and don't use up as much block
> weight. One approach is to just define a new version of the language
> via the tapleaf version, defining new opcodes however we like.
>
> The other is to use the "softfork" opcode -- chia defines it as:
>
> (softfork cost code)
>
> though I think it would probably be better if it were
>
> (softfork cost version code)
>
> where the idea is that "code" will use the "x" opcode if there's a
> problem, and anyone supporting the "version" softfork can verify that
> there aren't any problems at a cost of "cost". However, whether you
> do or don't support that softfork, as far as the rest of the script is
> concerned, the expression will either fail entirely or evaluate as zero;
> so anyone who doesn't support the softfork can just replace it with zero
> and continue on, treating it as if it had costed "cost" units.
>
> One thing worth noting: "softfork" behaves more like OP_NOP than
> tapscript's OP_SUCCESS -- I think it's just not possible in general to
> have OP_SUCCESS-like behaviour if you're trying to allow accepting code
> from the witness data -- otherwise as soon as you reveal that your script
> does accept arbitrary code supplied by the spender, someone could stick
> in an OP_SUCCESS code, and remove all the restrictions on spending and
> steal your funds.
Oh no `1 RETURN`!
Well, the advantage of chialisp here is that it enables the opcode for a *block* of code, so the opcode *itself* could return arbitrary data, it is just the entire block of code that is restricted to returning `0`.
So it would be something like an `OP_BEGINSOFTFORK .... OP_ENDSOFTFORK` where any reserved opcodes in the middle have arbitrary behavior, the entire block gets a *copy* of the current stack and alt stack, with any changes to the stack / alt stack disappearing after the `OP_ENDSOFTFORK`.
Thus, the entire block as a whole would act as an `OP_NOP`.
(OG Bitcoin SCRIPT FTW!)
I think the `softfork` form would have to be a syntax though and not a procedure, as I think you want `cost` to be statically determined, and very likely also `version`.
Regards,
ZmnSCPxj
* [bitcoin-dev] bitcoin scripting and lisp
From: Bram Cohen @ 2022-03-07 6:26 UTC (permalink / raw)
To: Bitcoin Protocol Discussion
>
> After looking into it, I actually think chia lisp [1] gets pretty much all
> the major design decisions pretty much right. There are obviously a few
> changes needed given the differences in design between chia and bitcoin:
>
> - having secp256k1 signatures (and curve operations), instead of
> BLS12-381 ones
>
> - adding tx introspection instead of having bundle-oriented CREATE_COIN,
> and CREATE/ASSERT results [10]
>
Bitcoin uses the UTXO model, as opposed to Chia's coin set model. These are
close enough that Chia is often explained as using the UTXO model, but that
isn't technically true. Relevant to the above comment is
that in the UTXO model transactions get passed to a scriptpubkey and it
either assert fails or it doesn't, while in the coin set model each puzzle
(scriptpubkey) gets run and either assert fails or returns a list of extra
conditions it has, possibly including timelocks and creating new coins,
paying fees, and other things.
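A toy sketch of that structural difference (Python, with the condition names
purely illustrative): the UTXO-style script's whole interface is pass/fail,
while the coin-set-style puzzle hands back conditions for the chain to
enforce.

    def utxo_style_script(tx) -> bool:
        return True                          # pass or fail, nothing else

    def coin_set_style_puzzle(solution) -> list:
        return [("ASSERT_TIMELOCK", 100),                 # impose a timelock
                ("CREATE_COIN", b"<puzzle hash>", 1000)]  # create a new coin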
If you're doing everything from scratch it's cleaner to go with the coin
set model, but retrofitting onto existing Bitcoin it may be best to leave
the UTXO model intact and compensate by adding a bunch more opcodes which
are special to parsing Bitcoin transactions. The transaction format itself
can be mostly left alone but to enable some of the extra tricks (mostly
implementing capabilities) it's probably a good idea to make new
conventions for how a transaction can have advisory information which
specifies which of the inputs to a transaction is the parent of a specific
output and also info which is used for communication between the UTXOs in a
transaction.
But one could also make lisp-generated UTXOs be based off transactions
which look completely trivial and have all their important information be
stored separately in a new vbytes area. That works but results in a bit of
a dual identity where some coins have both an old style id and a new style
id which gunks up what
>
> - serialization seems to be a bit verbose -- 100kB of serialized clvm
> code from a random block gzips to 60kB; optimising the serialization
> for small lists, and perhaps also for small literal numbers might be
> a feasible improvement; though it's not clear to me how frequently
> serialization size would be the limiting factor for cost versus
> execution time or memory usage.
>
A lot of this is because there's a hook for doing compression at the
consensus layer which isn't being used aggressively yet. That one has the
downside that the combined cost of transactions can add up very
nonlinearly, but when you have constantly repeated bits of large
boilerplate it gets close and there isn't much of an alternative. That said
even with that form of compression maxxed out it's likely that gzip could
still do some compression but that would be better done in the database and
in wire protocol formats rather than changing the format which is hashed at
the consensus layer.
> Pretty much all the opcodes in the first section are directly from chia
> lisp, while all the rest are to complete the "bitcoin" functionality.
> The last two are extensions that are more food for thought than a real
> proposal.
>
Are you thinking of this as a completely alternative script format or an
extension to bitcoin script? They're radically different approaches and
it's hard to see how they mix. Everything in lisp is completely sandboxed,
and that functionality is important to a lot of things, and it's really
normal to be given a reveal of a scriptpubkey and be able to rely on your
parsing of it.
> There's two ways to think about upgradability here; if someday we want
> to add new opcodes to the language -- perhaps something to validate zero
> knowledge proofs or calculate sha3 or use a different ECC curve, or some
> way to support cross-input signature aggregation, or perhaps it's just
> that some snippets are very widely used and we'd like to code them in
> C++ directly so they validate quicker and don't use up as much block
> weight. One approach is to just define a new version of the language
> via the tapleaf version, defining new opcodes however we like.
>
A nice side benefit of sticking with the UTXO model is that the soft fork
hook can be that all unknown opcodes make the entire thing automatically
pass.
>
> The other is to use the "softfork" opcode -- chia defines it as:
>
> (softfork cost code)
>
> though I think it would probably be better if it were
>
> (softfork cost version code)
>
Since softfork has to date never been used, that second parameter is
technically completely ignored and could be anything at all. Most likely a
convention including some kind of version information will be created the
first time it's used. Also Chia shoves total cost into blocks at the
consensus layer out of an abundance of caution although that isn't
technically necessary.
[10] [9] The CREATE/ASSERT bundling stuff is interesting; and could be
> used to achieve functionality like the "transaction sponsorship"
> stuff. It doesn't magically solve the issues with maintaining the
> mempool and using that to speed up block acceptance, though, and
> the chia chain has apparently suffered from mempool-flooding attacks
> recently [11] so I don't think they've solved the broader problem,
>
Chia's approach to transaction fees is essentially identical to Bitcoin's
although a lot fewer things in the ecosystem support fees due to a lack of
having needed it yet. I don't think mempool issues have much to do with
choice of scriptpubkey language, which is mostly about adding in covenants
and capabilities.
That said, Ethereum does have trivial aggregation of unrelated
transactions, at the expense of almost everything else. There are a bunch
of ways automatic aggregation functionality could be added to coin set
mempools by giving them some understanding of the semantics of some
transactions, but that hasn't been implemented yet.
I previously posted some thoughts about this here:
https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-December/019722.html
* Re: [bitcoin-dev] bitcoin scripting and lisp
From: ZmnSCPxj @ 2022-03-07 22:56 UTC (permalink / raw)
To: Bram Cohen, Bitcoin Protocol Discussion
Good morning Bram,
> while in the coin set model each puzzle (scriptpubkey) gets run and either assert fails or returns a list of extra conditions it has, possibly including timelocks and creating new coins, paying fees, and other things.
Does this mean it basically gets recursive covenants?
Or is a condition in this list of conditions written a more restrictive language which itself cannot return a list of conditions?
> > - serialization seems to be a bit verbose -- 100kB of serialized clvm
> > code from a random block gzips to 60kB; optimising the serialization
> > for small lists, and perhaps also for small literal numbers might be
> > a feasible improvement; though it's not clear to me how frequently
> > serialization size would be the limiting factor for cost versus
> > execution time or memory usage.
>
> A lot of this is because there's a hook for doing compression at the consensus layer which isn't being used aggressively yet. That one has the downside that the combined cost of transactions can add up very nonlinearly, but when you have constantly repeated bits of large boilerplate it gets close and there isn't much of an alternative. That said even with that form of compression maxxed out it's likely that gzip could still do some compression but that would be better done in the database and in wire protocol formats rather than changing the format which is hashed at the consensus layer.
How different is this from "jets" as proposed in Simplicity?
> > Pretty much all the opcodes in the first section are directly from chia
> > lisp, while all the rest are to complete the "bitcoin" functionality.
> > The last two are extensions that are more food for thought than a real
> > proposal.
>
> Are you thinking of this as a completely alternative script format or an extension to bitcoin script? They're radically different approaches and it's hard to see how they mix. Everything in lisp is completely sandboxed, and that functionality is important to a lot of things, and it's really normal to be given a reveal of a scriptpubkey and be able to rely on your parsing of it.
I believe AJ is proposing a completely alternative format to OG Bitcoin SCRIPT.
Basically, as I understand it, nothing in the design of Tapscript versions prevents us from completely changing the interpretation of Tapscript bytes, and use a completely different language.
That is, we could designate a new Tapscript version as completely different from OG Bitcoin SCRIPT.
Regards,
ZmnSCPxj
* Re: [bitcoin-dev] bitcoin scripting and lisp
From: Bram Cohen @ 2022-03-09 2:24 UTC (permalink / raw)
To: ZmnSCPxj; +Cc: Bitcoin Protocol Discussion
On Mon, Mar 7, 2022 at 2:56 PM ZmnSCPxj <ZmnSCPxj@protonmail.com> wrote:
>
> > while in the coin set model each puzzle (scriptpubkey) gets run and
> either assert fails or returns a list of extra conditions it has, possibly
> including timelocks and creating new coins, paying fees, and other things.
>
> Does this mean it basically gets recursive covenants?
> Or is a condition in this list of conditions written a more restrictive
> language which itself cannot return a list of conditions?
>
The conditions language is extremely restrictive but does allow for
recursive covenants through the route of specifying output
scriptpubkeys/puzzles, which Bitcoin already sort of in principle supports
except that Bitcoin script's ability to generate and parse transactions
isn't quite up to the task.
> > A lot of this is because there's a hook for doing compression at the
> consensus layer which isn't being used aggressively yet. That one has the
> downside that the combined cost of transactions can add up very
> nonlinearly, but when you have constantly repeated bits of large
> boilerplate it gets close and there isn't much of an alternative. That said
> even with that form of compression maxxed out it's likely that gzip could
> still do some compression but that would be better done in the database and
> in wire protocol formats rather than changing the format which is hashed at
> the consensus layer.
>
> How different is this from "jets" as proposed in Simplicity?
>
My vague impression is that Simplicity jets are meant to be for things like
Sha3 rather than shared library code. It might be that the way to use it
would be to implement CLVM opcodes as a bunch of Simplicity jets. Whether
that would be making the CLVM irrelevant or going through a pointless bit
of theatre to be based on Simplicity under the hood I don't know.
The compression hook is that in each block instead of there being a list of
puzzle reveals and solutions there's a generator written in CLVM which
outputs that list, and it can be passed the generators of old blocks from
which it can pull out code snippets.
* Re: [bitcoin-dev] bitcoin scripting and lisp
From: Anthony Towns @ 2022-03-08 1:27 UTC (permalink / raw)
To: Bitcoin Protocol Discussion; +Cc: Bram Cohen
On Sun, Mar 06, 2022 at 10:26:47PM -0800, Bram Cohen via bitcoin-dev wrote:
> > After looking into it, I actually think chia lisp [1] gets pretty much all
> > the major design decisions pretty much right. There are obviously a few
> > changes needed given the differences in design between chia and bitcoin:
> Bitcoin uses the UTXO model as opposed to Chia's Coin Set model. While
> these are close enough that it's often explained as Chia uses the UTXO
> model but that isn't technically true. Relevant to the above comment is
> that in the UTXO model transactions get passed to a scriptpubkey and it
> either assert fails or it doesn't, while in the coin set model each puzzle
> (scriptpubkey) gets run and either assert fails or returns a list of extra
> conditions it has, possibly including timelocks and creating new coins,
> paying fees, and other things.
One way to match the way bitcoin does things: you could have the "list of
extra conditions" encoded explicitly in the transaction via the annex,
and then check the extra conditions when the script is executed.
> If you're doing everything from scratch it's cleaner to go with the coin
> set model, but retrofitting onto existing Bitcoin it may be best to leave
> the UTXO model intact and compensate by adding a bunch more opcodes which
> are special to parsing Bitcoin transactions. The transaction format itself
> can be mostly left alone but to enable some of the extra tricks (mostly
> implementing capabilities) it's probably a good idea to make new
> conventions for how a transaction can have advisory information which
> specifies which of the inputs to a transaction is the parent of a specific
> output and also info which is used for communication between the UTXOs in a
> transaction.
I think the parent/child coin relationship is only interesting when
"unrelated" spends can assert that the child coin is being created -- ie
things along the lines of the "transaction sponsorship" proposal. My
feeling is that complicates the mempool a bit much, so is best left for
later, if done at all.
(I think the hard part of managing the extra conditions is mostly
in keeping it efficient to manage the mempool and construct the most
profitable blocks/bundles, rather than where the data goes)
> But one could also make lisp-generated UTXOs be based off transactions
> which look completely trivial and have all their important information be
> stored separately in a new vbytes area. That works but results in a bit of
> a dual identity where some coins have both an old style id and a new style
> id which gunks up what
We've already got a txid and a wtxid, adding more ids seems best avoided
if possible...
> > Pretty much all the opcodes in the first section are directly from chia
> > lisp, while all the rest are to complete the "bitcoin" functionality.
> > The last two are extensions that are more food for thought than a real
> > proposal.
> Are you thinking of this as a completely alternative script format or an
> extension to bitcoin script?
As an alternative to tapscript, so when constructing the merkle tree of
scripts for a taproot address, you could have some of those scripts be
in tapscript as it exists today with OP_CHECKSIG etc, and others could
be in lisp. (You could then have an entirely lisp-based sub-merkle-tree
of lisp fragments via sha256tree or similar of course)
> They're radically different approaches and
> it's hard to see how they mix. Everything in lisp is completely sandboxed,
> and that functionality is important to a lot of things, and it's really
> normal to be given a reveal of a scriptpubkey and be able to rely on your
> parsing of it.
The above prevents combining puzzles/solutions from multiple coin spends,
but I don't think that's very attractive in bitcoin's context, the way
it is for chia. I don't think it loses much else?
> > There's two ways to think about upgradability here; if someday we want
> > to add new opcodes to the language -- perhaps something to validate zero
> > knowledge proofs or calculate sha3 or use a different ECC curve, or some
> > way to support cross-input signature aggregation, or perhaps it's just
> > that some snippets are very widely used and we'd like to code them in
> > C++ directly so they validate quicker and don't use up as much block
> > weight. One approach is to just define a new version of the language
> > via the tapleaf version, defining new opcodes however we like.
> A nice side benefit of sticking with the UTXO model is that the soft fork
> hook can be that all unknown opcodes make the entire thing automatically
> pass.
I don't think that works well if you want to allow the spender (the
puzzle solution) to be able to use opcodes introduced in a soft-fork
(eg, for graftroot-like behaviour)?
> Chia's approach to transaction fees is essentially identical to Bitcoin's
> although a lot fewer things in the ecosystem support fees due to a lack of
> having needed it yet. I don't think mempool issues have much to do with
> choice of scriptpubkey language. which is mostly about adding in covenants
> and capabilities.
Having third parties be able to link their spends to yours complicates
mempool behaviour a fair bit (see the discussions on pinning wrt
lightning txs -- and that's only with direct participants being able to
link transactions). But it's very much a second-order effect compared
to having fees being a meaningful thing at all. It took, what, six or
seven years for people to start actually using dynamic fees in bitcoin?
> I previously posted some thoughts about this here:
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-December/019722.html
I'm pretty skeptical about having a database of large script snippets
that will hopefully be reused in the future.
On Mon, Mar 07, 2022 at 10:56:38PM +0000, ZmnSCPxj via bitcoin-dev wrote:
> > while in the coin set model each puzzle (scriptpubkey) gets run and either assert fails or returns a list of extra conditions it has, possibly including timelocks and creating new coins, paying fees, and other things.
> Does this mean it basically gets recursive covenants?
In chia the "scriptPubKey" is the hash of a lisp program, and when you
create a new coin, the "scriptPubKey" of the newly generated coin is
also the output of a lisp program. So writing a quine gets you general
recursive covenants in a pretty straightforward way, as I understand it.
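Very roughly, in Python (hypothetical argument names, with hashes standing
in for the lisp program hashes): the puzzle is handed its own source in the
solution, checks it really is its own source, and then requires the new coin
to commit to that same puzzle, so the covenant carries forward to every
descendant coin.

    import hashlib

    def recursive_covenant(claimed_self_source: bytes,
                           my_puzzle_hash: bytes,
                           new_coin_puzzle_hash: bytes) -> bool:
        return (hashlib.sha256(claimed_self_source).digest() == my_puzzle_hash
                and new_coin_puzzle_hash == my_puzzle_hash)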
> > > - serialization seems to be a bit verbose -- 100kB of serialized clvm
> > > code from a random block gzips to 60kB; optimising the serialization
> > > for small lists, and perhaps also for small literal numbers might be
> > > a feasible improvement; though it's not clear to me how frequently
> > > serialization size would be the limiting factor for cost versus
> > > execution time or memory usage.
> > A lot of this is because there's a hook for doing compression at the consensus layer which isn't being used aggressively yet. That one has the downside that the combined cost of transactions can add up very nonlinearly, but when you have constantly repeated bits of large boilerplate it gets close and there isn't much of an alternative. That said even with that form of compression maxxed out it's likely that gzip could still do some compression but that would be better done in the database and in wire protocol formats rather than changing the format which is hashed at the consensus layer.
> How different is this from "jets" as proposed in Simplicity?
Rather than a "transaction" containing "inputs/outputs", chia has spend
bundles that spend and create coins; and spend bundles can be merged
together, so that a block only has a single spend bundle. That spend
bundle joins all the puzzles (the programs that, when hashed match
the scriptPubKey) and solutions (scriptSigs) for the coins being spent
together.
I /think/ the compression hook would be to allow you to have the puzzles
be (re)generated via another lisp program if that was more efficient
than just listing them out. But I assume it would be turtles, err,
lisp all the way down, no special C functions like with jets.
> > > Pretty much all the opcodes in the first section are directly from chia
> > > lisp, while all the rest are to complete the "bitcoin" functionality.
> > > The last two are extensions that are more food for thought than a real
> > > proposal.
> >
> > Are you thinking of this as a completely alternative script format or an extension to bitcoin script? They're radically different approaches and it's hard to see how they mix. Everything in lisp is completely sandboxed, and that functionality is important to a lot of things, and it's really normal to be given a reveal of a scriptpubkey and be able to rely on your parsing of it.
>
> I believe AJ is proposing a completely alternative format to OG Bitcoin SCRIPT.
> Basically, as I understand it, nothing in the design of Tapscript versions prevents us from completely changing the interpretation of Tapscript bytes, and use a completely different language.
> That is, we could designate a new Tapscript version as completely different from OG Bitcoin SCRIPT.
BIP342 defines tapscript, and it's selected by the taproot leaf version
0xc0; this hypothetical lispy "btcscript" might be selected via taproot
leaf version 0xc2 or similar (elements' tapscript variant is selected
via version 0xc4).
Cheers,
aj
* Re: [bitcoin-dev] bitcoin scripting and lisp
From: ZmnSCPxj @ 2022-03-08 3:06 UTC (permalink / raw)
To: Anthony Towns; +Cc: Bitcoin Protocol Discussion, Bram Cohen
Good morning aj et al.,
> > They're radically different approaches and
> > it's hard to see how they mix. Everything in lisp is completely sandboxed,
> > and that functionality is important to a lot of things, and it's really
> > normal to be given a reveal of a scriptpubkey and be able to rely on your
> > parsing of it.
>
> The above prevents combining puzzles/solutions from multiple coin spends,
> but I don't think that's very attractive in bitcoin's context, the way
> it is for chia. I don't think it loses much else?
But cross-input signature aggregation is a nice-to-have we want for Bitcoin, and, to me, cross-input sigagg is not much different from cross-input puzzle/solution compression.
For example you might have multiple HTLCs, with mostly the same code except for details like who the acceptor and offerer are, exact hash, and timelock, and you could claim multiple HTLCs in a single tx and feed the details separately but the code for the HTLC is common to all of the HTLCs.
You do not even need to come from the same protocol if multiple protocols use the same code for implementing HTLC.
> > > There's two ways to think about upgradability here; if someday we want
> > > to add new opcodes to the language -- perhaps something to validate zero
> > > knowledge proofs or calculate sha3 or use a different ECC curve, or some
> > > way to support cross-input signature aggregation, or perhaps it's just
> > > that some snippets are very widely used and we'd like to code them in
> > > C++ directly so they validate quicker and don't use up as much block
> > > weight. One approach is to just define a new version of the language
> > > via the tapleaf version, defining new opcodes however we like.
> > > A nice side benefit of sticking with the UTXO model is that the soft fork
> > > hook can be that all unknown opcodes make the entire thing automatically
> > > pass.
>
> I don't think that works well if you want to allow the spender (the
> puzzle solution) to be able to use opcodes introduced in a soft-fork
> (eg, for graftroot-like behaviour)?
This does not apply to current Bitcoin since we no longer accept a SCRIPT from the spender, we now have a witness stack.
However, once we introduce opcodes that allow recursive covenants, it seems this is now a potential footgun if the spender can tell the puzzle SCRIPT to load some code that will then be used in the *next* UTXO created, and *then* the spender can claim it.
Hmmm.... Or maybe not?
If the spender can already tell the puzzle SCRIPT to send the funds to a SCRIPT that is controlled by the spender, the spender can already tell the puzzle SCRIPT to forward the funds to a pubkey the spender controls.
So this seems to be more like "do not write broken SCRIPTs"?
> > > > - serialization seems to be a bit verbose -- 100kB of serialized clvm
> > > > code from a random block gzips to 60kB; optimising the serialization
> > > > for small lists, and perhaps also for small literal numbers might be
> > > > a feasible improvement; though it's not clear to me how frequently
> > > > serialization size would be the limiting factor for cost versus
> > > > execution time or memory usage.
> > > > A lot of this is because there's a hook for doing compression at the consensus layer which isn't being used aggressively yet. That one has the downside that the combined cost of transactions can add up very nonlinearly, but when you have constantly repeated bits of large boilerplate it gets close and there isn't much of an alternative. That said even with that form of compression maxxed out it's likely that gzip could still do some compression but that would be better done in the database and in wire protocol formats rather than changing the format which is hashed at the consensus layer.
> > > > How different is this from "jets" as proposed in Simplicity?
>
> Rather than a "transaction" containing "inputs/outputs", chia has spend
> bundles that spend and create coins; and spend bundles can be merged
> together, so that a block only has a single spend bundle. That spend
> bundle joins all the puzzles (the programs that, when hashed match
> the scriptPubKey) and solutions (scriptSigs) for the coins being spent
> together.
>
> I /think/ the compression hook would be to allow you to have the puzzles
> be (re)generated via another lisp program if that was more efficient
> than just listing them out. But I assume it would be turtles, err,
> lisp all the way down, no special C functions like with jets.
Eh, you could use Common LISP or a recent-enough RnRS Scheme to write cryptocurrency node software, so "special C function" seems to overprivilege C...
I suppose the more proper way to think of this is that jets are *equivalent to* some code in the hosted language, and have an *efficient implementation* in the hosting language.
In this view, the current OG Bitcoin SCRIPT is the hosted language, and the C++ Bitcoin Core interpreter for it is in the hosting language C++.
LISP can be both the hosted and hosting language because it is easy to implement `eval` in LISP and you can write macros which transform small code snippets into larger code.
LISP code generating *more* LISP code is pretty much what macros are.
Regards,
ZmnSCPxj
* Re: [bitcoin-dev] bitcoin scripting and lisp
From: Bram Cohen @ 2022-03-09 3:07 UTC (permalink / raw)
To: ZmnSCPxj; +Cc: Bitcoin Protocol Discussion, Anthony Towns
On Mon, Mar 7, 2022 at 7:06 PM ZmnSCPxj <ZmnSCPxj@protonmail.com> wrote:
>
> But cross-input signature aggregation is a nice-to-have we want for
> Bitcoin, and, to me, cross-input sigagg is not much different from
> cross-input puzzle/solution compression.
>
Cross-input signature aggregation has a lot of headaches unless you're
using BLS signatures, in which case you always aggregate everything all the
time because it can be done after the fact noninteractively. In that case
it makes sense to have a special aggregated signature which always comes
with a transaction or block. But it might be a bit much to bundle both lisp
and BLS support into one big glop.
>
> For example you might have multiple HTLCs, with mostly the same code
> except for details like who the acceptor and offerrer are, exact hash, and
> timelock, and you could claim multiple HTLCs in a single tx and feed the
> details separately but the code for the HTLC is common to all of the HTLCs.
> You do not even need to come from the same protocol if multiple protocols
> use the same code for implementing HTLC.
>
HTLCs, at least in Chia, have embarrassingly little code in them. Like, so
little that there's almost nothing to compress.
> This does not apply to current Bitcoin since we no longer accept a SCRIPT
> from the spender, we now have a witness stack.
>
My mental model of Bitcoin is to pretend that segwit was always there and
the separation of different sections of data is a semantic quibble.
> So this seems to be more like "do not write broken SCRIPTs"?
>
In general if people footgun that's their own fault. The resistance to
covenants and capabilities in the past has largely been around what would
happen if you had opt-out covenants which acted as riders and could monkey
around in later spends which were none of their business. But if they're
fully baked into the scriptpubkey then they're opted into by the recipient
and there aren't any weird surprises.
* Re: [bitcoin-dev] bitcoin scripting and lisp
From: ZmnSCPxj @ 2022-03-09 14:30 UTC (permalink / raw)
To: Bram Cohen; +Cc: Bitcoin Protocol Discussion, Anthony Towns
Good morning Bram,
> On Mon, Mar 7, 2022 at 7:06 PM ZmnSCPxj <ZmnSCPxj@protonmail.com> wrote:
>
> > But cross-input signature aggregation is a nice-to-have we want for Bitcoin, and, to me, cross-input sigagg is not much different from cross-input puzzle/solution compression.
>
> Cross-input signature aggregation has a lot of headaches unless you're using BLS signatures, in which case you always aggregate everything all the time because it can be done after the fact noninteractively. In that case it makes sense to have a special aggregated signature which always comes with a transaction or block. But it might be a bit much to bundle both lisp and BLS support into one big glop.
You misunderstand my point.
I am not saying "we should add sigagg and lisp together!"
I am pointing out that:
* We want to save bytes by having multiple inputs of a transaction use the same single signature (i.e. sigagg).
is not much different from:
* We want to save bytes by having multiple inputs of a transaction use the same `scriptPubKey` template.
> > For example you might have multiple HTLCs, with mostly the same code except for details like who the acceptor and offerrer are, exact hash, and timelock, and you could claim multiple HTLCs in a single tx and feed the details separately but the code for the HTLC is common to all of the HTLCs.
> > You do not even need to come from the same protocol if multiple protocols use the same code for implementing HTLC.
>
> HTLCs, at least in Chia, have embarrassingly little code in them. Like, so little that there's almost nothing to compress.
In Bitcoin at least an HTLC has, if you remove the `OP_PUSH`es, by my count, 13 bytes.
If you have a bunch of HTLCs you want to claim, you can reduce your witness data by 13 bytes minus whatever number of bytes you need to indicate this.
That amounts to about 3 vbytes per HTLC, which can be significant enough to be worth it (consider that Taproot moving away from encoded signatures saves only 9 weight units per signature, i.e. about 2 vbytes).
Do note that PTLCs remain more space-efficient though, so forget about HTLCs and just use PTLCs.
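(The weight arithmetic behind those estimates, for reference: witness bytes
weigh 1 weight unit each, and 4 weight units make one vbyte.)

    htlc_template_bytes = 13
    print(htlc_template_bytes / 4)   # 3.25 -> "about 3 vbytes" per HTLC
    print(9 / 4)                     # 2.25 -> "about 2 vbytes" per signature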
>
> > This does not apply to current Bitcoin since we no longer accept a SCRIPT from the spender, we now have a witness stack.
>
> My mental model of Bitcoin is to pretend that segwit was always there and the separation of different sections of data is a semantic quibble.
This is not a semantic quibble --- `witness` contains only the equivalent of `OP_PUSH`es, while `scriptSig` can in theory contain non-`OP_PUSH` opcodes.
xref. `1 RETURN`.
As-is, with SegWit the spender no longer is able to provide any SCRIPT at all, but new opcodes may allow the spender to effectively inject any SCRIPT they want, once again, because `witness` data may now become code.
> But if they're fully baked into the scriptpubkey then they're opted into by the recipient and there aren't any weird surprises.
This is really what I kinda object to.
Yes, "buyer beware", but consider that as the covenant complexity increases, the probability of bugs, intentional or not, sneaking in, increases as well.
And a bug is really "a weird surprise" --- xref TheDAO incident.
This makes me kinda wary of using such covenant features at all, and if stuff like `SIGHASH_ANYPREVOUT` or `OP_CHECKTEMPLATEVERIFY` are not added but must be reimplemented via a covenant feature, I would be saddened, as I now have to contend with the complexity of covenant features and carefully check that `SIGHASH_ANYPREVOUT`/`OP_CHECKTEMPLATEVERIFY` were implemented correctly.
True I also still have to check the C++ source code if they are implemented directly as opcodes, but I can read C++ better than frikkin Bitcoin SCRIPT.
Not to mention that I now have to review both the (more complicated due to more general) covenant feature implementation, *and* the implementation of `SIGHASH_ANYPREVOUT`/`OP_CHECKTEMPLATEVERIFY` in terms of the covenant feature.
Regards,
ZmnSCPxj
* Re: [bitcoin-dev] bitcoin scripting and lisp
2022-03-09 14:30 ` ZmnSCPxj
@ 2022-03-16 6:40 ` Bram Cohen
2022-03-16 15:09 ` ZmnSCPxj
0 siblings, 1 reply; 23+ messages in thread
From: Bram Cohen @ 2022-03-16 6:40 UTC (permalink / raw)
To: ZmnSCPxj; +Cc: Bitcoin Protocol Discussion, Anthony Towns
On Wed, Mar 9, 2022 at 6:30 AM ZmnSCPxj <ZmnSCPxj@protonmail.com> wrote:
> I am pointing out that:
>
> * We want to save bytes by having multiple inputs of a transaction use the
> same single signature (i.e. sigagg).
>
> is not much different from:
>
> * We want to save bytes by having multiple inputs of a transaction use the
> same `scriptPubKey` template.
>
Fair point. In the past Bitcoin has been resistant to such things because
for example reusing pubkeys can save you from having to separately pay for
the reveals of all of them but letting people get credit for that
incentivizes key reuse which isn't such a great thing.
>
> > > For example you might have multiple HTLCs, with mostly the same code
> except for details like who the acceptor and offerrer are, exact hash, and
> timelock, and you could claim multiple HTLCs in a single tx and feed the
> details separately but the code for the HTLC is common to all of the HTLCs.
> > > You do not even need to come from the same protocol if multiple
> protocols use the same code for implementing HTLC.
> >
> > HTLCs, at least in Chia, have embarrassingly little code in them. Like,
> so little that there's almost nothing to compress.
>
> In Bitcoin at least an HTLC has, if you remove the `OP_PUSH`es, by my
> count, 13 bytes.
> If you have a bunch of HTLCs you want to claim, you can reduce your
> witness data by 13 bytes minus whatever number of bytes you need to
> indicate this.
> That amounts to about 3 vbytes per HTLC, which can be significant enough
> to be worth it (consider that Taproot moving away from encoded signatures
> saves only 9 weight units per signature, i.e. about 2 vbytes).
>
Oh I see. That's already extremely small overhead. When you start
optimizing at that level you wind up doing things like pulling all the
HTLCs into the same block to take the overhead of pulling in the template
only once.
>
> Do note that PTLCs remain more space-efficient though, so forget about
> HTLCs and just use PTLCs.
>
It makes a lot of sense to make a payment channel system using PTLCs and
eltoo right off the bat but then you wind up rewriting everything from
scratch.
> > > This does not apply to current Bitcoin since we no longer accept a
> SCRIPT from the spender, we now have a witness stack.
> >
> > My mental model of Bitcoin is to pretend that segwit was always there
> and the separation of different sections of data is a semantic quibble.
>
> This is not a semantic quibble --- `witness` contains only the equivalent
> of `OP_PUSH`es, while `scriptSig` can in theory contain non-`OP_PUSH`
> opcodes.
> xref. `1 RETURN`.
>
It's very normal when you're using lisp for snippets of code to be passed
in as data and then verified and executed. That's enabled by the extreme
adherence to no side effects.
> This makes me kinda wary of using such covenant features at all, and if
> stuff like `SIGHASH_ANYPREVOUT` or `OP_CHECKTEMPLATEVERIFY` are not added
> but must be reimplemented via a covenant feature, I would be saddened, as I
> now have to contend with the complexity of covenant features and carefully
> check that `SIGHASH_ANYPREVOUT`/`OP_CHECKTEMPLATEVERIFY` were implemented
> correctly.
>
Even the 'standard format' transaction which supports taproot and graftroot
is implemented in CLVM. The benefit of this approach is that new
functionality can be implemented and deployed immediately rather than
having to painstakingly go through a soft fork deployment for each thing.
* Re: [bitcoin-dev] bitcoin scripting and lisp
2022-03-16 6:40 ` Bram Cohen
@ 2022-03-16 15:09 ` ZmnSCPxj
0 siblings, 0 replies; 23+ messages in thread
From: ZmnSCPxj @ 2022-03-16 15:09 UTC (permalink / raw)
To: Bram Cohen; +Cc: Bitcoin Protocol Discussion, Anthony Towns
Good morning Bram,
> On Wed, Mar 9, 2022 at 6:30 AM ZmnSCPxj <ZmnSCPxj@protonmail.com> wrote:
>
> > I am pointing out that:
> >
> > * We want to save bytes by having multiple inputs of a transaction use the same single signature (i.e. sigagg).
> >
> > is not much different from:
> >
> > * We want to save bytes by having multiple inputs of a transaction use the same `scriptPubKey` template.
>
> Fair point. In the past Bitcoin has been resistant to such things because for example reusing pubkeys can save you from having to separately pay for the reveals of all of them but letting people get credit for that incentivizes key reuse which isn't such a great thing.
See paragraph below:
> > > > For example you might have multiple HTLCs, with mostly the same code except for details like who the acceptor and offerrer are, exact hash, and timelock, and you could claim multiple HTLCs in a single tx and feed the details separately but the code for the HTLC is common to all of the HTLCs.
> > > > You do not even need to come from the same protocol if multiple protocols use the same code for implementing HTLC.
Note that the acceptor and offerrer are represented by pubkeys here.
So we do not want to encourage key reuse, we want to encourage reuse of *how* the pubkeys are used (but rotate the pubkeys).
In the other thread on Jets in bitcoin-dev I proposed moving data like pubkeys into a separate part of the SCRIPT in order to (1) not encourage key reuse and (2) make it easier to compress the code.
In LISP terms, it would be like requiring that top-level code have a `(let ...)` form around it where the assigned data *must* be constants or `quote`, and disallowing constants and `quote` elsewhere, then any generated LISP code has to execute in the same top-level environment defined by this top-level `let`.
So you can compress the code by using some metaprogramming where LISP generates LISP code but you still need to work within the confines of the available constants.
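To illustrate the shape of that, here is a loose sketch in Python rather than LISP (the names are mine and this is not any proposed opcode or CLVM semantics): the "code" is a shared template, and only the per-input constants would need to travel with each input.

import hashlib

def htlc_template(env, preimage=None, after_timeout=False):
    # Stand-in for the common HTLC code; `env` plays the role of the top-level
    # `let` bindings: constants only (pubkeys, payment hash, timelock), rotated
    # per input, while the code itself stays identical across inputs.
    if after_timeout:
        return ("checksig", env["offerer_pubkey"], "checklocktime", env["timeout"])
    assert hashlib.sha256(preimage).digest() == env["payment_hash"]
    return ("checksig", env["acceptor_pubkey"])

# Two HTLCs share the template and differ only in their constants.
env_a = {"offerer_pubkey": "pk_A1", "acceptor_pubkey": "pk_A2",
         "payment_hash": hashlib.sha256(b"preimage-a").digest(), "timeout": 800000}
env_b = {"offerer_pubkey": "pk_B1", "acceptor_pubkey": "pk_B2",
         "payment_hash": hashlib.sha256(b"preimage-b").digest(), "timeout": 800100}

print(htlc_template(env_a, preimage=b"preimage-a"))
print(htlc_template(env_b, after_timeout=True))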
> > > HTLCs, at least in Chia, have embarrassingly little code in them. Like, so little that there's almost nothing to compress.
> >
> > In Bitcoin at least an HTLC has, if you remove the `OP_PUSH`es, by my count, 13 bytes.
> > If you have a bunch of HTLCs you want to claim, you can reduce your witness data by 13 bytes minus whatever number of bytes you need to indicate this.
> > That amounts to about 3 vbytes per HTLC, which can be significant enough to be worth it (consider that Taproot moving away from encoded signatures saves only 9 weight units per signature, i.e. about 2 vbytes).
>
> Oh I see. That's already extremely small overhead. When you start optimizing at that level you wind up doing things like pulling all the HTLCs into the same block to take the overhead of pulling in the template only once.
>
>
> > Do note that PTLCs remain more space-efficient though, so forget about HTLCs and just use PTLCs.
>
> It makes a lot of sense to make a payment channel system using PTLCs and eltoo right off the bat but then you wind up rewriting everything from scratch.
Bunch of #reckless devs implemented Lightning with just HTLCs so that is that, *shrug*, gotta wonder what those people were thinking, not waiting for PTLCs.
>
>
> > > > This does not apply to current Bitcoin since we no longer accept a SCRIPT from the spender, we now have a witness stack.
> > >
> > > My mental model of Bitcoin is to pretend that segwit was always there and the separation of different sections of data is a semantic quibble.
> >
> > This is not a semantic quibble --- `witness` contains only the equivalent of `OP_PUSH`es, while `scriptSig` can in theory contain non-`OP_PUSH` opcodes.
> > xref. `1 RETURN`.
>
> It's very normal when you're using lisp for snippets of code to be passed in as data and then verified and executed. That's enabled by the extreme adherence to no side effects.
Quining still allows Turing-completeness and infinite loops, which *is* still a side effect, though as I understand it ChiaLISP uses the "Turing-complete but with a max number of ops" kind of totality.
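To make that concrete, here is a toy sketch of what I mean by that kind of totality (the budget number and the mini-evaluator are made up, not CLVM's actual cost model):

class OutOfOps(Exception):
    pass

def run(program, budget):
    # `program` is just nested tuples here, a stand-in for real code; every node
    # visited spends budget, so even a self-replicating (quined) program halts.
    steps = 0
    todo = [program]
    while todo:
        steps += 1
        if steps > budget:
            raise OutOfOps("exceeded %d ops" % budget)
        node = todo.pop()
        if isinstance(node, tuple):
            todo.extend(node)
    return steps

print(run(("add", ("mul", 2, 3), 4), budget=100))    # small program: fine
try:
    run(("loop",) * 10000, budget=100)               # too much work: rejected
except OutOfOps as e:
    print("rejected:", e)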
> > This makes me kinda wary of using such covenant features at all, and if stuff like `SIGHASH_ANYPREVOUT` or `OP_CHECKTEMPLATEVERIFY` are not added but must be reimplemented via a covenant feature, I would be saddened, as I now have to contend with the complexity of covenant features and carefully check that `SIGHASH_ANYPREVOUT`/`OP_CHECKTEMPLATEVERIFY` were implemented correctly.
>
> Even the 'standard format' transaction which supports taproot and graftroot is implemented in CLVM. The benefit of this approach is that new functionality can be implemented and deployed immediately rather than having to painstakingly go through a soft fork deployment for each thing.
Wow, just wow.
Regards,
ZmnSCPxj
* Re: [bitcoin-dev] bitcoin scripting and lisp
2022-03-08 3:06 ` ZmnSCPxj
2022-03-09 3:07 ` Bram Cohen
@ 2022-03-11 4:46 ` Anthony Towns
2022-03-16 6:52 ` Bram Cohen
2022-03-16 14:54 ` ZmnSCPxj
1 sibling, 2 replies; 23+ messages in thread
From: Anthony Towns @ 2022-03-11 4:46 UTC (permalink / raw)
To: ZmnSCPxj, Bitcoin Protocol Discussion; +Cc: Bram Cohen
On Tue, Mar 08, 2022 at 03:06:43AM +0000, ZmnSCPxj via bitcoin-dev wrote:
> > > They're radically different approaches and
> > > it's hard to see how they mix. Everything in lisp is completely sandboxed,
> > > and that functionality is important to a lot of things, and it's really
> > > normal to be given a reveal of a scriptpubkey and be able to rely on your
> > > parsing of it.
> > The above prevents combining puzzles/solutions from multiple coin spends,
> > but I don't think that's very attractive in bitcoin's context, the way
> > it is for chia. I don't think it loses much else?
> But cross-input signature aggregation is a nice-to-have we want for Bitcoin, and, to me, cross-input sigagg is not much different from cross-input puzzle/solution compression.
Signature aggregation has a lot more maths and crypto involved than
reversible compression of puzzles/solutions. I was more meaning
cross-transaction relationships rather than cross-input ones though.
> > I /think/ the compression hook would be to allow you to have the puzzles
> > be (re)generated via another lisp program if that was more efficient
> > than just listing them out. But I assume it would be turtles, err,
> > lisp all the way down, no special C functions like with jets.
> Eh, you could use Common LISP or a recent-enough RnRS Scheme to write a cryptocurrency node software, so "special C function" seems to overprivilege C...
Jets are "special" in so far as they are costed differently at the
consensus level than the equivalent pure/jetless simplicity code that
they replace. Whether they're written in C or something else isn't the
important part.
By comparison, generating lisp code with lisp code in chia doesn't get
special treatment.
(You *could* also use jets in a way that doesn't impact consensus just
to make your node software more efficient in the normal case -- perhaps
via a JIT compiler that sees common expressions in the blockchain and
optimises them eg)
On Wed, Mar 09, 2022 at 02:30:34PM +0000, ZmnSCPxj via bitcoin-dev wrote:
> Do note that PTLCs remain more space-efficient though, so forget about HTLCs and just use PTLCs.
Note that PTLCs aren't really Chia-friendly, both because chia doesn't
have secp256k1 operations in the first place, but also because you can't
do a scriptless-script because the information you need to extract
is lost when signatures are non-interactively aggregated via BLS --
so that adds an expensive extra ECC operation rather than reusing an
op you're already paying for (scriptless script PTLCs) or just adding
a cheap hash operation (HTLCs).
(Pretty sure Chia could do (= PTLC (pubkey_for_exp PREIMAGE)) for
preimage reveal of BLS PTLCs, but that wouldn't be compatible with
bitcoin secp256k1 PTLCs. You could sha256 the PTLC to save a few bytes,
but I think given how much a sha256 opcode costs in Chia, that that
would actually be more expensive?)
None of that applies to a bitcoin implementation that doesn't switch to
BLS signatures though.
> > But if they're fully baked into the scriptpubkey then they're opted into by the recipient and there aren't any weird surprises.
> This is really what I kinda object to.
> Yes, "buyer beware", but consider that as the covenant complexity increases, the probability of bugs, intentional or not, sneaking in, increases as well.
> And a bug is really "a weird surprise" --- xref TheDAO incident.
Which is better: a bug in the complicated script code specified for
implementing eltoo in a BOLT; or a bug in the BIP/implementation of a
new sighash feature designed to make it easy to implement eltoo, that's
been soft-forked into consensus?
Seems to me, that it's always better to have the bug be at the wallet
level, since that can be fixed by upgrading individual wallet software.
> This makes me kinda wary of using such covenant features at all, and if stuff like `SIGHASH_ANYPREVOUT` or `OP_CHECKTEMPLATEVERIFY` are not added but must be reimplemented via a covenant feature, I would be saddened, as I now have to contend with the complexity of covenant features and carefully check that `SIGHASH_ANYPREVOUT`/`OP_CHECKTEMPLATEVERIFY` were implemented correctly.
> True I also still have to check the C++ source code if they are implemented directly as opcodes, but I can read C++ better than frikkin Bitcoin SCRIPT.
If OP_CHECKTEMPLATEVERIFY (etc) is implemented as a consensus update, you
probably want to review the C++ code even if you're not going to use it,
just to make sure consensus doesn't end up broken as a result. Whereas if
it's only used by other people's wallets, you might be able to ignore it
entirely (at least until it becomes so common that any bugs might allow
a significant fraction of BTC to be stolen/lost and indirectly cause a
systemic risk).
> Not to mention that I now have to review both the (more complicated due to more general) covenant feature implementation, *and* the implementation of `SIGHASH_ANYPREVOUT`/`OP_CHECKTEMPLATEVERIFY` in terms of the covenant feature.
I'm not sure that a "covenant language implementation" would necessarily
be "that" complicated. And if so, having a DSL for covenants could,
at least in theory, make for a much simpler implementation of
ANYPREVOUT/CTV/TLUV/EVICT/etc than doing it directly in C++, which
might mean those things are less likely to have "weird surprises" rather
than more.
Cheers,
aj
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [bitcoin-dev] bitcoin scripting and lisp
2022-03-11 4:46 ` Anthony Towns
@ 2022-03-16 6:52 ` Bram Cohen
2022-03-16 14:54 ` ZmnSCPxj
1 sibling, 0 replies; 23+ messages in thread
From: Bram Cohen @ 2022-03-16 6:52 UTC (permalink / raw)
To: Anthony Towns; +Cc: Bitcoin Protocol Discussion
On Thu, Mar 10, 2022 at 8:46 PM Anthony Towns <aj@erisian.com.au> wrote:
> Note that PTLCs aren't really Chia-friendly, both because chia doesn't
> have secp256k1 operations in the first place, but also because you can't
> do a scriptless-script because the information you need to extract
> is lost when signatures are non-interactively aggregated via BLS --
> so that adds an expensive extra ECC operation rather than reusing an
> op you're already paying for (scriptless script PTLCs) or just adding
> a cheap hash operation (HTLCs).
>
The CLVM currently supports BLS12-381 group 1 point operations, which it
uses to support taproot and which I think is enough to support PTLCs, but
obviously isn't compatible with secp. In the future there will likely be a
soft fork to include a complete set of BLS12-381 operations, mostly to
support ZK implementations.
* Re: [bitcoin-dev] bitcoin scripting and lisp
2022-03-11 4:46 ` Anthony Towns
2022-03-16 6:52 ` Bram Cohen
@ 2022-03-16 14:54 ` ZmnSCPxj
2022-03-19 17:34 ` Bram Cohen
2022-03-22 23:37 ` Anthony Towns
1 sibling, 2 replies; 23+ messages in thread
From: ZmnSCPxj @ 2022-03-16 14:54 UTC (permalink / raw)
To: Anthony Towns; +Cc: Bitcoin Protocol Discussion, Bram Cohen
Good morning aj et al.,
> On Tue, Mar 08, 2022 at 03:06:43AM +0000, ZmnSCPxj via bitcoin-dev wrote:
>
> > > > They're radically different approaches and
> > > > it's hard to see how they mix. Everything in lisp is completely sandboxed,
> > > > and that functionality is important to a lot of things, and it's really
> > > > normal to be given a reveal of a scriptpubkey and be able to rely on your
> > > > parsing of it.
> > > > The above prevents combining puzzles/solutions from multiple coin spends,
> > > > but I don't think that's very attractive in bitcoin's context, the way
> > > > it is for chia. I don't think it loses much else?
> > > > But cross-input signature aggregation is a nice-to-have we want for Bitcoin, and, to me, cross-input sigagg is not much different from cross-input puzzle/solution compression.
>
> Signature aggregation has a lot more maths and crypto involved than
> reversible compression of puzzles/solutions. I was more meaning
> cross-transaction relationships rather than cross-input ones though.
My point is that in the past we were willing to discuss the complicated crypto math around cross-input sigagg in order to save bytes, so it seems to me that cross-input compression of puzzles/solutions at least merits a discussion, since it would require a lot less heavy crypto math, and *also* save bytes.
> > > I /think/ the compression hook would be to allow you to have the puzzles
> > > be (re)generated via another lisp program if that was more efficient
> > > than just listing them out. But I assume it would be turtles, err,
> > > lisp all the way down, no special C functions like with jets.
> > > Eh, you could use Common LISP or a recent-enough RnRS Scheme to write a cryptocurrency node software, so "special C function" seems to overprivilege C...
>
> Jets are "special" in so far as they are costed differently at the
> consensus level than the equivalent pure/jetless simplicity code that
> they replace. Whether they're written in C or something else isn't the
> important part.
>
> By comparison, generating lisp code with lisp code in chia doesn't get
> special treatment.
Hmm, what exactly do you mean here?
If I have a shorter piece of code that expands to a larger piece of code because metaprogramming, is it considered the same cost as the larger piece of code (even if not all parts of the larger piece of code are executed, e.g. branches)?
Or is the cost simply proportional to the number of operations actually executed?
I think there are two costs here:
* Cost of bytes to transmit over the network.
* Cost of CPU load.
Over here in Bitcoin we have been mostly conflating the two, to the point that Taproot even eliminates unexecuted branches from being transmitted over the network so that bytes transmitted is approximately equal to opcodes executed.
It seems to me that lisp-generating-lisp compression would reduce the cost of bytes transmitted, but increase the CPU load (first the metaprogram runs, and *then* the produced program runs).
> (You could also use jets in a way that doesn't impact consensus just
> to make your node software more efficient in the normal case -- perhaps
> via a JIT compiler that sees common expressions in the blockchain and
> optimises them eg)
I believe that is relevant in the other thread about Jets that I and Billy forked off from `OP_FOLD`?
Over in that thread, we seem to have largely split jets into two types:
* Consensus-critical jets which need a softfork but reduce the weight of the jetted code (and which are invisible to pre-softfork nodes).
* Non-consensus-critical jets which only need relay change and reduces bytes sent, but keeps the weight of the jetted code.
It seems to me that lisp-generating-lisp compression would roughly fall into the "non-consensus-critical jets" category.
> On Wed, Mar 09, 2022 at 02:30:34PM +0000, ZmnSCPxj via bitcoin-dev wrote:
>
> > Do note that PTLCs remain more space-efficient though, so forget about HTLCs and just use PTLCs.
>
> Note that PTLCs aren't really Chia-friendly, both because chia doesn't
> have secp256k1 operations in the first place, but also because you can't
> do a scriptless-script because the information you need to extract
> is lost when signatures are non-interactively aggregated via BLS --
> so that adds an expensive extra ECC operation rather than reusing an
> op you're already paying for (scriptless script PTLCs) or just adding
> a cheap hash operation (HTLCs).
>
> (Pretty sure Chia could do (= PTLC (pubkey_for_exp PREIMAGE)) for
> preimage reveal of BLS PTLCs, but that wouldn't be compatible with
> bitcoin secp256k1 PTLCs. You could sha256 the PTLC to save a few bytes,
> but I think given how much a sha256 opcode costs in Chia, that that
> would actually be more expensive?)
>
> None of that applies to a bitcoin implementation that doesn't switch to
> BLS signatures though.
Not being a mathist, I have absolutely no idea, but: at least as I understood from the original mimblewimble.txt from Voldemort, BLS signatures had an additional assumption, which I *think* means "theoretically less secure than SECP256K1 Schnorr / ECDSA".
Is my understanding correct?
And if so, how theoretical would that be?
PTLC signatures have the very nice property of being indistinguishable from non-PTLC signatures to anyone without the adaptor, and I think privacy-by-default should be what we encourage.
> > > But if they're fully baked into the scriptpubkey then they're opted into by the recipient and there aren't any weird surprises.
> > > This is really what I kinda object to.
> > > Yes, "buyer beware", but consider that as the covenant complexity increases, the probability of bugs, intentional or not, sneaking in, increases as well.
> > > And a bug is really "a weird surprise" --- xref TheDAO incident.
>
> Which is better: a bug in the complicated script code specified for
> implementing eltoo in a BOLT; or a bug in the BIP/implementation of a
> new sighash feature designed to make it easy to implement eltoo, that's
> been soft-forked into consensus?
>
> Seems to me, that it's always better to have the bug be at the wallet
> level, since that can be fixed by upgrading individual wallet software.
Good point.
Though I should note that BIP-118 was originally proposed with a 5-line patch, so ---
> I'm not sure that a "covenant language implementation" would necessarily
> be "that" complicated. And if so, having a DSL for covenants could,
> at least in theory, make for a much simpler implementation of
> ANYPREVOUT/CTV/TLUV/EVICT/etc than doing it directly in C++, which
> might mean those things are less likely to have "weird surprises" rather
> than more.
<rant>
DSLs?
Domain-specific languages?
Do you know how many people hate autoconf?
That is because autoconf is secretly an embedded DSL in a really obscure language called `m4`.
Some of the `autoconf` weirdnesses are due precisely to having to hack `m4` to make it look nicer, like that weird rule to use double `[[` and `]]` quotes around sections of program source code.
Yes, it means we can have a nice `autoconf-archive`, but the actual code inside that archive?
People *making* `autoconf` macros have to learn both `m4` and the existing `autoconf` macro ecosystem.
Then there is BluespecSV.
Bluespec used to be an embedded DSL inside Haskell.
Nobody wanted it because they had to learn *two* languages, Haskell, and Bluespec.
Eventually they created BluespecSV, which was a language with completely separate grammar and tokens from Haskell, instead of embedded in it, and Bluespec was finally *actually* used in production.
But the damage was done: people who do digital hardware design tend to *bristle* when they hear the word "Haskell", because of all the horrible embedded DSLs in Haskell (Bluespec was just one, but I have heard of a few others which never managed to jump from being more than a lab toy, including a cute one where the on-FPGA layout of the circuit was part of the construction of the circuit description).
Embedded DSLs are cute, but they require learning two languages, not a single new one.
Just say no to embedded DSLs!
</rant>
Ah, much better.
This seems to me to be not much different from adding a separate compiler, which translates from the surface language to the underlying opcode/lisp language, with similar risks: now you have *another* bit of indirection to audit.
It feels like building a perpetual-motion machine, where we keep adding more stuff in the hope of reducing the complexity.
Regards,
ZmnSCPxj
* Re: [bitcoin-dev] bitcoin scripting and lisp
2022-03-16 14:54 ` ZmnSCPxj
@ 2022-03-19 17:34 ` Bram Cohen
2022-03-22 23:37 ` Anthony Towns
1 sibling, 0 replies; 23+ messages in thread
From: Bram Cohen @ 2022-03-19 17:34 UTC (permalink / raw)
To: ZmnSCPxj; +Cc: Bitcoin Protocol Discussion, Anthony Towns
On Wed, Mar 16, 2022 at 7:54 AM ZmnSCPxj <ZmnSCPxj@protonmail.com> wrote:
> My point is that in the past we were willing to discuss the complicated
> crypto math around cross-input sigagg in order to save bytes, so it seems
> to me that cross-input compression of puzzles/solutions at least merits a
> discussion, since it would require a lot less heavy crypto math, and *also*
> save bytes.
>
When using BLS signatures all that math is much simpler. You use a single
aggregated signature and always aggregate everything all the time.
> I think there are two costs here:
>
> * Cost of bytes to transmit over the network.
> * Cost of CPU load.
>
There are three potential costs: CPU, bytes, and making outputs. In Chia
it's balanced so that the costs to a standard transaction in all three
buckets are roughly the same. In Bitcoin the three are implicitly tied to
each other by design which makes vbytes work okayish for Bitcoin Script as
it exists today.
> It seems to me that lisp-generating-lisp compression would reduce the cost
> of bytes transmitted, but increase the CPU load (first the metaprogram
> runs, and *then* the produced program runs).
>
Nah, CPU costs are dominated by signatures. Simple operations like applying
some parameters to a template don't add much.
> Not being a mathist, I have absolutely no idea, but: at least as I
> understood from the original mimblewimble.txt from Voldemort, BLS
> signatures had an additional assumption, which I *think* means
> "theoretically less secure than SECP256K1 Schnorr / ECDSA".
> Is my understanding correct?
> And if so, how theoretical would that be?
>
It includes an extra cryptographic assumption, but it's extremely
theoretical, having more to do with guessing what size of group provides
comparable security in number of bits than with whether the whole approach
is in question. BLS12-381 is fairly conservative.
>
> PTLC signatures have the very nice property of being indistinguishable
> from non-PTLC signatures to anyone without the adaptor, and I think
> privacy-by-default should be what we encourage.
>
You do lose out on that when you aggregate.
> > I'm not sure that a "covenant language implementation" would necessarily
> > be "that" complicated. And if so, having a DSL for covenants could,
> > at least in theory, make for a much simpler implementation of
> > ANYPREVOUT/CTV/TLUV/EVICT/etc than doing it directly in C++, which
> > might mean those things are less likely to have "weird surprises" rather
> > than more.
>
> <rant>
> DSLs?
> Domain-specific languages?
>
Bitcoin Script is already a domain specific language, and the point of
adding in a lisp-family language would be to make it so that covenants and
capabilities can be implemented in the same language as is used for regular
coin scripting. The idea is to get off the treadmill of soft forking in
language features every time new functionality is wanted and make it
possible to implement all that on chain.
* Re: [bitcoin-dev] bitcoin scripting and lisp
2022-03-16 14:54 ` ZmnSCPxj
2022-03-19 17:34 ` Bram Cohen
@ 2022-03-22 23:37 ` Anthony Towns
1 sibling, 0 replies; 23+ messages in thread
From: Anthony Towns @ 2022-03-22 23:37 UTC (permalink / raw)
To: ZmnSCPxj, Bitcoin Protocol Discussion; +Cc: Bram Cohen
On Wed, Mar 16, 2022 at 02:54:05PM +0000, ZmnSCPxj via bitcoin-dev wrote:
> My point is that in the past we were willing to discuss the complicated crypto math around cross-input sigagg in order to save bytes, so it seems to me that cross-input compression of puzzles/solutions at least merits a discussion, since it would require a lot less heavy crypto math, and *also* save bytes.
Maybe it would be; but it's not something I was intending to bring up in
this thread.
Chia allows any coin spend to reference any output created in the
same block, and potentially any other input in the same block, and
automatically aggregates all signatures in a block; that's all pretty
neat, but trying to do all that in bitcoin in step one doesn't seem smart.
> > > > I /think/ the compression hook would be to allow you to have the puzzles
> > > > be (re)generated via another lisp program if that was more efficient
> > > > than just listing them out. But I assume it would be turtles, err,
> > > > lisp all the way down, no special C functions like with jets.
> > > > Eh, you could use Common LISP or a recent-enough RnRS Scheme to write a cryptocurrency node software, so "special C function" seems to overprivilege C...
> > Jets are "special" in so far as they are costed differently at the
> > consensus level than the equivalent pure/jetless simplicity code that
> > they replace. Whether they're written in C or something else isn't the
> > important part.
> > By comparison, generating lisp code with lisp code in chia doesn't get
> > special treatment.
> Hmm, what exactly do you mean here?
This is going a bit into the weeds...
> If I have a shorter piece of code that expands to a larger piece of code because metaprogramming, is it considered the same cost as the larger piece of code (even if not all parts of the larger piece of code are executed, e.g. branches)?
Chia looks at the problem differently to bitcoin. In bitcoin each
transaction includes a set of inputs, and each of those inputs contains
both a reference to a utxo which has a scriptPubKey, and a solution for
the scriptPubKey called the scriptSig. In chia, each block contains a
list of coins (~utxos) that are being spent, each of which has a hash
of its puzzle (~scriptPubKey) which must be solved; each block then
contains a lisp program that will produce all the transaction info,
namely coin (~utxo id), puzzle reveal (~witness program) and solution
(~witness stack); then to verify the block, you need to check the coins
exist, the puzzle reveals all match the corresponding coin's puzzle,
that the puzzle+solution executes successfully, and that the assertions
that get returned by all the puzzle+solutions are all consistent.
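Roughly, in made-up Python (the names here are stand-ins for illustration, not chia's actual API), that verification looks like:

import hashlib
from dataclasses import dataclass

@dataclass
class Coin:
    puzzle_hash: bytes

def tree_hash(puzzle):                    # stand-in for clvm tree hashing
    return hashlib.sha256(repr(puzzle).encode()).digest()

def run_generator(generator):             # the block's lisp program runs here;
    return generator()                    # simplest case: it returns a quoted list

def run_puzzle(puzzle, solution):         # puzzle+solution -> list of assertions
    return puzzle(solution)

def conditions_consistent(conditions):    # eg the aggregate signature checks out, etc
    return all(ok for ok, _ in conditions)

def verify_block(generator, coin_set):
    for coin_id, puzzle_reveal, solution in run_generator(generator):
        coin = coin_set[coin_id]                             # the coin (~utxo) must exist
        assert tree_hash(puzzle_reveal) == coin.puzzle_hash  # reveal matches its commitment
        assert conditions_consistent(run_puzzle(puzzle_reveal, solution))
    return True

# tiny example: one coin whose puzzle demands the solution 42
puzzle = lambda solution: [(solution == 42, "AGG_SIG_ME ...")]
coins = {"coin-1": Coin(puzzle_hash=tree_hash(puzzle))}
print(verify_block(lambda: [("coin-1", puzzle, 42)], coins))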
> Or is the cost simply proportional to the number of operations actually executed?
AIUI, the cost is the sum of the size of the program, as well as how
much compute and memory is used to run the program.
In comparison, the cost for an input with tapscript is the size of that
input; memory usage has a fixed maximum (1000 elements in the
stack/altstack, and 520 bytes per element); and compute resources are
limited according to the size of the input.
> It seems to me that lisp-generating-lisp compression would reduce the cost of bytes transmitted, but increase the CPU load (first the metaprogram runs, and *then* the produced program runs).
In chia, you're always running the metaprogram, it may just be that that
program is the equivalent of:
stuff = lambda: [("hello", "world"), ("hello", "Z-man")]
which doesn't seem much better than just saying:
stuff = [("hello", "world"), ("hello", "Z-man")]
The advantage is that you could construct a block template optimiser
that rewrites the program to:
def stuff():
    h = "hello"
    return [(h, "world"), (h, "Z-man")]
which for large values of "hello" may be worthwhile (and the standard
puzzle in chia is large enough at that that might well be worthwhile at
~227 bytes, since it implements taproot/graftroot logic from scratch).
> Over in that thread, we seem to have largely split jets into two types:
> * Consensus-critical jets which need a softfork but reduce the weight of the jetted code (and which are invisible to pre-softfork nodes).
> * Non-consensus-critical jets which only need relay change and reduces bytes sent, but keeps the weight of the jetted code.
> It seems to me that lisp-generating-lisp compression would roughly fall into the "non-consensus-critical jets" category.
It could do; but the way it's used in chia is consensus-critical.
I'm not 100% sure how costing works in chia, but I believe a block
template optimiser as above might allow miners to fit more transactions
in their block and therefore collect more transaction fees. That makes
the block packing problem harder though, since it means your transaction
is "cheaper" if it's more similar to other transactions in the block. I
don't think it's relevant today since fees seem to mostly be less than 1%
of the block reward...
The ability to reference prior blocks might mitigate that; but that
depends on how those back references are costed, which is all way beyond
my knowledge.
> > On Wed, Mar 09, 2022 at 02:30:34PM +0000, ZmnSCPxj via bitcoin-dev wrote:
> Not being a mathist, I have absolutely no idea, but: at least as I understood from the original mimblewimble.txt from Voldemort, BLS signatures had an additional assumption, which I *think* means "theoretically less secure than SECP256K1 Schnorr / ECDSA".
> Is my understanding correct?
> And if so, how theoretical would that be?
Like everything else in crypto, it's completely theoretical until it
starts becoming practical?
> PTLC signatures have the very nice property of being indistinguishable from non-PTLC signatures to anyone without the adaptor, and I think privacy-by-default should be what we encourage.
In bitcoin, you have a ~64B signature in every input, and hiding
a 32B secret in each of those is quite feasible if they're schnorr
signatures. When the block is published, 1000 different people can look
at 1000 different signatures, and discover the 1000 different secrets
they wanted to know.
In chia, every signature in the block is aggregated, so there is only a
single ~96B signature in each block, and there's no way to hide 32kB
worth of secret information in there. I'm not sure of the maths, but I
think your options in chia and their costs would be roughly:
* normal tx with just agg signature, no lightning secrets = 1,200,000
* aggregated signature + hash preimage = 1,200,300 (HTLC)
* aggregated signature + point d.log = 2,526,946 (PTLC visible)
* manual disaggregated signature = 2,772,020 (PTLC hidden)
But your lightning preimage reveal doesn't look like a normal chia
transaction in any case.
(Because chia's BLS12-381 curve differs from bitcoin's secp256k1,
it's not even possible to reveal a secp256k1 PTLC preimage on chia, so
you couldn't share a single PTLC-based lightning networks even if you
solved the exchange rate problem. Well, I guess you could theoretically
implement secp256k1 maths from scratch in chia lisp...)
> <rant>
> DSLs?
> Domain-specific languages?
> Do you know how many people hate autoconf?
Uh, that seems like the sort of thing you type up then delete before
sending...
> This seems to me to be not much different from adding a separate
> compiler, which translates from the surface language to the underlying
> opcode/lisp language,
No, what I meant was the lisp/opcode language is the DSL.
Though that said, there is a difference between chia lisp with macros
and clvm code; p2_delegated_puzzle with macros looks like:
(mod
    (public_key delegated_puzzle delegated_puzzle_solution)
    (include condition_codes.clvm)
    ;; hash a tree
    ;; This is used to calculate a puzzle hash given a puzzle program.
    (defun sha256tree1
        (TREE)
        (if (l TREE)
            (sha256 2 (sha256tree1 (f TREE)) (sha256tree1 (r TREE)))
            (sha256 1 TREE)
        )
    )
    (c (list AGG_SIG_ME public_key (sha256tree1 delegated_puzzle))
       (a delegated_puzzle delegated_puzzle_solution))
)
but as clvm code, it looks like:
(a (q 4 (c 4 (c 5 (c (a 6 (c 2 (c 11 ()))) ()))) (a 11 23)) (c (q 50 2 (i (l 5) (q 11 (q . 2) (a 6 (c 2 (c 9 ()))) (a 6 (c 2 (c 13 ())))) (q 11 (q . 1) 5)) 1) 1))
I don't think you want to include code comments in the blockchain though,
so at some level I guess "compiling" is unavoidable.
Cheers,
aj
* Re: [bitcoin-dev] bitcoin scripting and lisp
2022-03-08 1:27 ` Anthony Towns
2022-03-08 3:06 ` ZmnSCPxj
@ 2022-03-09 2:54 ` Bram Cohen
2022-03-10 6:47 ` Anthony Towns
1 sibling, 1 reply; 23+ messages in thread
From: Bram Cohen @ 2022-03-09 2:54 UTC (permalink / raw)
To: Anthony Towns; +Cc: Bitcoin Protocol Discussion
On Mon, Mar 7, 2022 at 5:27 PM Anthony Towns <aj@erisian.com.au> wrote:
> One way to match the way bitcoin do things, you could have the "list of
> extra conditions" encoded explicitly in the transaction via the annex,
> and then check the extra conditions when the script is executed.
>
The conditions are already basically what's in transactions. I think the
only thing missing is the assertion about one's own id, which could be
added in by, in addition to passing the scriptpubkey the transaction it's
part of, also passing in the index of the input which it itself is.
>
> > If you're doing everything from scratch it's cleaner to go with the coin
> > set model, but retrofitting onto existing Bitcoin it may be best to leave
> > the UTXO model intact and compensate by adding a bunch more opcodes which
> > are special to parsing Bitcoin transactions. The transaction format
> itself
> > can be mostly left alone but to enable some of the extra tricks (mostly
> > implementing capabilities) it's probably a good idea to make new
> > conventions for how a transaction can have advisory information which
> > specifies which of the inputs to a transaction is the parent of a
> specific
> > output and also info which is used for communication between the UTXOs
> in a
> > transaction.
>
> I think the parent/child coin relationship is only interesting when
> "unrelated" spends can assert that the child coin is being created -- ie
> things along the lines of the "transaction sponsorship" proposal. My
> feeling is that complicates the mempool a bit much, so is best left for
> later, if done at all.
>
The parent/child relationship is mostly about implementing capabilities.
There's this fundamental trick, sort of the ollie of UTXO programming,
where to make a coin have a capability you have a wrapper around an inner
puzzle for it where the wrapper asserts 'my parent must either be the
unique originator of this capability or also have this same wrapper' and it
enforces that by being given a reveal of its parent and told/asserting its
own id which it can derive from that parent. The main benefit of the coin
set approach over UTXO is that it reduces the amount of stuff to be
revealed and string mangling involved in the parentage check.
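In very rough Python (the coin-id and hashing conventions here are simplified stand-ins, not Chia's actual singleton/CAT code), the wrapper's check looks something like:

import hashlib

def coin_id(parent_id, puzzle_hash):
    # simplified: a coin's id commits to its parent and its own puzzle
    return hashlib.sha256(parent_id + puzzle_hash).digest()

def wrapper_hash(inner_puzzle_hash, capability_id):
    # puzzle hash of "this capability's wrapper around some inner puzzle"
    return hashlib.sha256(b"wrap" + capability_id + inner_puzzle_hash).digest()

def check_lineage(my_id, my_puzzle_hash, capability_id,
                  parent_parent_id, parent_puzzle_hash, parent_inner_hash):
    parent_id = coin_id(parent_parent_id, parent_puzzle_hash)
    # 1. my own id must really derive from the revealed parent
    assert coin_id(parent_id, my_puzzle_hash) == my_id
    # 2. the parent is either the capability's unique originator, or it carried
    #    this same wrapper around an inner puzzle of its own
    assert (parent_id == capability_id or
            parent_puzzle_hash == wrapper_hash(parent_inner_hash, capability_id))
    return True

# originator -> wrapped child -> wrapped grandchild
originator_id = b"\x11" * 32
child_inner = hashlib.sha256(b"child inner puzzle").digest()
child_ph = wrapper_hash(child_inner, originator_id)
child_id = coin_id(originator_id, child_ph)
grandchild_ph = wrapper_hash(hashlib.sha256(b"grandchild inner").digest(), originator_id)
grandchild_id = coin_id(child_id, grandchild_ph)

print(check_lineage(grandchild_id, grandchild_ph, originator_id,
                    originator_id, child_ph, child_inner))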
>
> (I think the hard part of managing the extra conditions is mostly
> in keeping it efficient to manage the mempool and construct the most
> profitable blocks/bundles, rather than where the data goes)
>
Not sure what you mean by this. Conditions map fairly closely to what's
in Bitcoin transactions and are designed to be monotonic, so the costs
and fees are known up front. The only way two transactions can conflict
with each other is if they both try to spend the same coin.
> > They're radically different approaches and
> > it's hard to see how they mix. Everything in lisp is completely
> sandboxed,
> > and that functionality is important to a lot of things, and it's really
> > normal to be given a reveal of a scriptpubkey and be able to rely on your
> > parsing of it.
>
> The above prevents combining puzzles/solutions from multiple coin spends,
> but I don't think that's very attractive in bitcoin's context, the way
> it is for chia. I don't think it loses much else?
>
Making something lisp-based be a completely alternative script type would
also be my preferred approach.
> > A nice side benefit of sticking with the UTXO model is that the soft fork
> > hook can be that all unknown opcodes make the entire thing automatically
> > pass.
>
> I don't think that works well if you want to allow the spender (the
> puzzle solution) to be able to use opcodes introduced in a soft-fork
> (eg, for graftroot-like behaviour)?
>
This is already the approach to soft forking in Bitcoin script and I don't
see anything wrong with it. You shouldn't write scripts using previously
unrecognized opcodes until after they've been soft forked into having real
functionality, and if you do that and accidentally write an anyonecanspend
that's your own fault.
> Having third parties be able to link their spends to yours complicates
> mempool behaviour a fair bit (see the discussions on pinning wrt
> lightning txs -- and that's only with direct participants being able to
> link transactions).
Bitcoin already has that with spending of transaction outputs, and Chia's
mempool doesn't currently let transactions depend on other transactions in
the mempool. If you do have that sort of dependency, you should have to
smush both transactions together to make a single larger transaction and
make it have enough of a fee to replace the smaller one.
> I'm pretty skeptical about having a database of large script snippets
> that will hopefully be reused in the future.
>
That database is just the list of old blocks which can be dredged up to
have code pulled out of them.
> In chia the "scriptPubKey" is the hash of a lisp program, and when you
> create a new coin, the "scriptPubKey" of the newly generated coin is
> also the output of a lisp program. So writing a quine gets you general
> recursive covenants in a pretty straight forward way, as I understand it.
>
Usually you don't quite write a quine because you can be passed in your own
code and assert your own id derived from it, but that's the basic idea. You
need to validate your own id anyway when you have a capability.
> Rather than a "transaction" containing "inputs/outputs", chia has spend
> bundles that spend and create coins; and spend bundles can be merged
> together, so that a block only has a single spend bundle. That spend
> bundle joins all the puzzles (the programs that, when hashed match
> the scriptPubKey) and solutions (scriptSigs) for the coins being spent
> together.
>
I often refer to spend bundles as 'transactions'. Hope that isn't too
confusing. They serve the same function in the mempool.
>
> I /think/ the compression hook would be to allow you to have the puzzles
> be (re)generated via another lisp program if that was more efficient
> than just listing them out.
It literally has a lisp program called the generator which returns the list
of puzzle reveals and transactions. The simplest version of that program is
to return a quoted list.
* Re: [bitcoin-dev] bitcoin scripting and lisp
2022-03-09 2:54 ` Bram Cohen
@ 2022-03-10 6:47 ` Anthony Towns
2022-03-16 6:45 ` Bram Cohen
0 siblings, 1 reply; 23+ messages in thread
From: Anthony Towns @ 2022-03-10 6:47 UTC (permalink / raw)
To: Bram Cohen, Bitcoin Protocol Discussion
On Tue, Mar 08, 2022 at 06:54:56PM -0800, Bram Cohen via bitcoin-dev wrote:
> On Mon, Mar 7, 2022 at 5:27 PM Anthony Towns <aj@erisian.com.au> wrote:
> > One way to match the way bitcoin do things, you could have the "list of
> > extra conditions" encoded explicitly in the transaction via the annex,
> > and then check the extra conditions when the script is executed.
> The conditions are already basically what's in transactions. I think the
> only thing missing is the assertion about one's own id, which could be
> added in by, in addition to passing the scriptpubkey the transaction it's
> part of, also passing in the index of the input which it itself is.
To redo the singleton pattern in bitcoin's context, I think you'd have
to pass in both the full tx you're spending (to be able to get the
txid of its parent) and the full tx of its parent (to be able to get
the scriptPubKey that your utxo spent) which seems klunky but at least
possible (you'd be able to drop the witness data at least; without that
every tx would be including the entire history of the singleton).
> > > A nice side benefit of sticking with the UTXO model is that the soft fork
> > > hook can be that all unknown opcodes make the entire thing automatically
> > > pass.
> > I don't think that works well if you want to allow the spender (the
> > puzzle solution) to be able to use opcodes introduced in a soft-fork
> > (eg, for graftroot-like behaviour)?
> This is already the approach to soft forking in Bitcoin script and I don't
> see anything wrong with it.
It's fine in Bitcoin script, because the scriptPubKey already commits to
all the opcodes that can possibly be used for any particular output. With
a lisp approach, however, you could pass in additional code fragments
to execute. For example, where you currently say:
script: [pubkey] CHECKSIG
witness: [64B signature][0x83]
(where 0x83 is SINGLE|ANYONECANPAY) you might translate that to:
script: (checksig pubkey (bip342-txmsg 3) 2)
witness: signature 0x83
where "3" grabs the sighash byte, and "2" grabs the signature. But you
could also translate it to:
script: (checksig pubkey (sha256 3 (a 3)) 2)
witness: signature (bip342-txmsg 0x83)
where "a 3" takes "(bip342-txmsg 0x83)" then evaluates it, and (sha256
3 (a 3)) makes sure you've signed off on both how the message was
constructed as well as what the message was. The advantage there is that
the spender can then create their own signature hashes however they like;
even ones that hadn't been thought of when the output was created.
But what if we later softfork in a bip118-txmsg for quick and easy
ANYPREVOUT style-signatures, and want to use that instead of custom
lisp code? You can't just stick (softfork C (bip118-txmsg 0xc3)) into
the witness, because it will evaluate to nil and you won't be signing
anything. But you *could* change the script to something like:
script: (softfork C (q checksigverify pubkey (a 3) 2))
witness: signature (bip118-txmsg 0xc3)
But what happens if the witness instead has:
script: (softfork C (q checksigverify pubkey (a 3) 2))
witness: fake-signature (fakeopcode 0xff)
If softfork is just doing a best effort for whatever opcodes it knows
about, and otherwise succeeding, then it has to succeed, and your
script/output has become anyone-can-spend.
On the other hand, if you could tell the softfork op that you only wanted
ops up-to-and-including the 118 softfork, then it could reject fakeopcode
and fail the script, which I think gives the desirable behaviour.
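As a toy sketch of that difference (made-up version numbers and opcode table, not an actual proposal):

KNOWN_OPS = {"checksigverify": 100, "bip342-txmsg": 100, "bip118-txmsg": 118}

def softfork_best_effort(expr):
    op = expr[0]
    if op not in KNOWN_OPS:
        return True      # unknown op succeeds, so fakeopcode makes it anyone-can-spend
    return run_known_op(expr)

def softfork_versioned(expr, max_version):
    op = expr[0]
    if op not in KNOWN_OPS or KNOWN_OPS[op] > max_version:
        raise ValueError("unknown or too-new opcode %r: script fails" % op)
    return run_known_op(expr)

def run_known_op(expr):
    return True          # stand-in for actually executing the known opcode

print(softfork_best_effort(("fakeopcode", 0xff)))                   # True: bad
try:
    softfork_versioned(("fakeopcode", 0xff), max_version=118)       # fails: good
except ValueError as e:
    print("rejected:", e)
print(softfork_versioned(("bip118-txmsg", 0xc3), max_version=118))  # True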
Cheers,
aj
* Re: [bitcoin-dev] bitcoin scripting and lisp
2022-03-10 6:47 ` Anthony Towns
@ 2022-03-16 6:45 ` Bram Cohen
0 siblings, 0 replies; 23+ messages in thread
From: Bram Cohen @ 2022-03-16 6:45 UTC (permalink / raw)
To: Anthony Towns; +Cc: Bitcoin Protocol Discussion
On Wed, Mar 9, 2022 at 10:47 PM Anthony Towns <aj@erisian.com.au> wrote:
>
> To redo the singleton pattern in bitcoin's context, I think you'd have
> to pass in both the full tx you're spending (to be able to get the
> txid of its parent) and the full tx of its parent (to be able to get
> the scriptPubKey that your utxo spent) which seems klunky but at least
> possible (you'd be able to drop the witness data at least; without that
> every tx would be including the entire history of the singleton).
>
Yes that's the idea. Since the parent transaction is in the blockchain it
could be pulled in automatically without having to charge vbytes for it.
> If softfork is just doing a best effort for whatever opcodes it knows
> about, and otherwise succeeding, then it has to succeed, and your
> script/output has become anyone-can-spend.
>
That can be alleviated by having things that call untrusted code wrap it
in a guard, which can be the soft fork opcode itself.
>
> On the other hand, if you could tell the softfork op that you only wanted
> ops up-to-and-including the 118 softfork, then it could reject fakeopcode
> and fail the script, which I think gives the desirable behaviour.
>
A simple approach to versioning like that may be more expedient. Soft
forking in CLVM isn't implemented yet.