Date: Wed, 23 Mar 2022 00:20:16 +0000
To: Anthony Towns
From: ZmnSCPxj
Message-ID: <6z4zgwg-r_EKOmZKCC1KyCmSjkZBbzHOKXHiMQf6th4r_PHDbMuCqSQ366hz6LRhdX25YI6IElcr9bFOVsu78UUns-ZNIt-YPgMqEwyg9ZM=@protonmail.com>
In-Reply-To: <20220322231104.GA11179@erisian.com.au>
References: <20220322231104.GA11179@erisian.com.au>
Cc: Bitcoin Protocol Discussion
Subject: Re: [bitcoin-dev] Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks

Good morning aj,

> On Tue, Mar 22, 2022 at 05:37:03AM +0000, ZmnSCPxj via bitcoin-dev wrote:
>
> > Subject: Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks
>
> (Have you considered applying a jit or some other compression algorithm
> to your emails?)
>
> > Microcode For Bitcoin SCRIPT
> >
> > =============================
> >
> > I propose:
> >
> > - Define a generic, low-level language (the "RISC language").
>
> This is pretty much what Simplicity does, if you optimise the low-level
> language to minimise the number of primitives and maximise the ability
> to apply tooling to reason about it, which seem like good things for a
> RISC language to optimise.
>
> > - Define a mapping from a specific, high-level language to
> >   the above language (the microcode).
> >
> > - Allow users to sacrifice Bitcoins to define a new microcode.
>
> I think you're defining "the microcode" as the "mapping" here.

Yes.

>
> This is pretty similar to the suggestion Bram Cohen was making a couple
> of months ago:
>
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-December/019722.html
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-January/019773.html
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-January/019803.html
>
> I believe this is done in chia via the block being able to
> include-by-reference prior blocks' transaction generators:
>
> ] transactions_generator_ref_list: List[uint32]: A list of block heights of previous generators referenced by this block's generator.
>
> - https://docs.chia.net/docs/05block-validation/block_format
>
> (That approach comes at the cost of not being able to do full validation
> if you're running a pruning node. The alternative is to effectively
> introduce a parallel "utxo" set -- where you're mapping the "sacrificed"
> BTC as the nValue and instead of just mapping it to a scriptPubKey for
> a later spend, you're permanently storing the definition of the new
> CISC opcode)

Yes, the latter is basically what microcode is.

> > We can then support a "RISC" language that is composed of
> > general instructions, such as arithmetic, SECP256K1 scalar
> > and point math, bytevector concatenation, sha256 midstates,
> > bytevector bit manipulation, transaction introspection, and
> > so on.
>
> A language that includes instructions for each operation we can think
> of isn't very "RISC"... More importantly it gets straight back to the
> "we've got a new zk system / ECC curve / ... that we want to include,
> let's do a softfork" problem you were trying to avoid in the first place.

`libsecp256k1` can run on purely RISC machines like ARM, so saying that a "RISC" set of opcodes cannot implement some arbitrary ECC curve, when the instruction set does not directly support that ECC curve, seems incorrect.
Any new zk system / ECC curve would have to be implementable in C++, so if you have the micro-operations that would be needed for it, such as XORing two multi-byte vectors together, multiplying multi-byte precision numbers, etc., then any new zk system or ECC curve would be implementable in microcode.
For that matter, you could re-write `libsecp256k1` there.

> > Then, the user creates a new transaction where one of
> > the outputs contains, say, 1.0 Bitcoins (exact required
> > value TBD),
>
> Likely, the "fair" price would be the cost of introducing however many
> additional bytes to the utxo set that it would take to represent your
> microcode, and the cost it would take to run jit(your microcode script)
> if that were a validation function. Both seem pretty hard to manage.
>
> "Ideally", I think you'd want to be able to say "this old microcode
> no longer has any value, let's forget it, and instead replace it with
> this new microcode that is much better" -- that way nodes don't have to
> keep around old useless data, and you've reduced the cost of introducing
> new functionality.

Yes, but that invites "I accidentally the smart contract" behavior.

> Additionally, I think it has something of a tragedy-of-the-commons
> problem: whoever creates the microcode pays the cost, but then anyone
> can use it and gain the benefit. That might even end up creating
> centralisation pressure: if you design a highly decentralised L2 system,
> it ends up expensive because people can't coordinate to pay for the
> new microcode that would make it cheaper; but if you design a highly
> centralised L2 system, you can just pay for the microcode yourself and
> make it even cheaper.

The same "tragedy of the commons" applies to FOSS.
"whoever creates the FOSS pays the cost, but then anyone can use it and gain the benefit"
This seems like an argument against releasing FOSS node software.
Remember, microcode is software too, and copying software does not have a tragedy of the commons --- the main point of a tragedy of the commons is that the commons is *degraded* by use but nobody has an incentive to maintain it against the degradation.
But using software does not degrade the software: if I give you a copy of my software then I do not lose my software, which is why FOSS works.

In order to make a highly-decentralized L2, you need to cooperate with total strangers, possibly completely anonymously, in handling your money.
I imagine that the level of cooperation needed in, say, the Lightning Network would be far above what is necessary to gather funds from multiple people who want a particular microcode to happen, until enough funds have been gathered to make the microcode happen.

For example, create a fresh address for an amount you, personally, are willing to contribute in order to make the microcode happen.
(If you are willing to spend the time and energy arguing on bitcoin-dev, then you are willing to contribute, even if others get the benefit in addition to yourself, and that time and energy has a corresponding Bitcoin value)
Then spend it using `SIGHASH_ANYONECANPAY | SIGHASH_SINGLE`, with the microcode introduction output as the single output you are signing.
Gather enough such signatures from a community around a decentralized L2, and you can achieve the necessary total funds for the microcode to happen.

> This approach isn't very composable -- if there's a clever opcode
> defined in one microcode spec, and another one in some other microcode,
> the only way to use both of them in the same transaction is to burn 1
> BTC to define a new microcode that includes both of them.

Yes, that is indeed a problem.

> > We want to be able to execute the defined microcode
> > faster than expanding an `OP_`-code SCRIPT to a
> > `UOP_`-code SCRIPT and having an interpreter loop
> > over the `UOP_`-code SCRIPT.
> > We can use LLVM.
>
> We've not long ago gone to the effort of removing openssl as a consensus
> critical dependency; and likewise previously removed bdb. Introducing a
> huge new dependency to the definition of consensus seems like an enormous
> step backwards.
>
> This would also mean we'd be stuck at the performance of whatever version
> of llvm we initially adopted, as any performance improvements introduced
> in later llvm versions would be a hard fork.

Yes, LLVM is indeed the weak link in this idea.
We could use NaCl instead, which probably has fewer issues /s.

> > On the other hand, LLVM bugs are compiler bugs and
> > the same bugs can hit the static compiler `cc`, too,
>
> "Well, you could hit Achilles in the heel, so really, what's the point
> of trying to be invulnerable anywhere else?"

Yes, LLVM is indeed the weak point here.

We could just concatenate some C++ code together when a new microcode is introduced, compile it statically, store the resulting binary somewhere, and invoke it at the appropriate time to run validation.
At least LLVM would be isolated into its own process in that case.

> > Then we put a pointer to this compiled function to a
> > 256-long array of functions, where the array index is
> > the `OP_` code.
>
> That's a 256-long array of functions for each microcode, which increases
> the "microcode-utxo" database storage size substantially.
>
> Presuming there are different jit targets (x86 vs arm?) it seems
> difficulty to come up with a consistent interpretation of the cost for
> these opcodes.
>
> I'm skeptical that a jit would be sufficient for increasing the
> performance of an implementation just based on basic arithmetic opcodes
> if we're talking about something like sha512 or bls12-381 or similar.

Static compilation seems to work well enough --- and JIT vs static is a spectrum, not either/or.
The difference is really how much optimization you are willing to use.
If microcodes are costly enough that they happen rarely, then using optimizations that are often used only in static compilation seems a reasonable tradeoff.

> > Bugs in existing microcodes can be fixed by basing a
> > new microcode from the existing microcode, and
> > redefining the buggy implementation.
> > Existing Tapscripts need to be re-spent to point to
> > the new bugfixed microcode, but if you used the
> > point-spend branch as an N-of-N of all participants
> > you have an upgrade mechanism for free.
>
> It's not free if you have to do an on-chain spend...
>
> The "1 BTC" cost to fix the bug, and the extra storage in every node's
> "utxo" set because they now have to keep both the buggy and fixed versions
> around permanently sure isn't free either.

Heh, poor word choice.

What I meant is that we do not need a separate upgrade mechanism; the design work here is "free".
*Using* the upgrade mechanism is costly and hence not "free".

> If you're re-jitting every
> microcode on startup, that could get pretty painful too.

When LLVM is used in a static compiler, it writes the resulting code to disk; I imagine the same mechanism can be used.

> If you're proposing introducing byte vector manipulation and OP_CAT and
> similar, which enables recursive covenants, then it might be good to
> explain how this proposal addresses the concerns raised at the end of
> https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-March/020092.html

It does not; I am currently exploring and generating ideas, not particularly tying myself to one idea or another.

Regards,
ZmnSCPxj
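
The quoted proposal in this thread describes a microcode as a mapping from `OP_` codes to `UOP_` sequences, with each compiled entry stored in a 256-long array indexed by the `OP_` code. The following is only a minimal Python sketch of that dispatch idea, not the proposed implementation. The `UOP_` names, the example opcode bytes, and every function name here are hypothetical (the thread does not define any of them), and "compilation" is stood in for by fusing Python callables rather than by LLVM or a static compiler.

```python
from typing import Callable, Dict, List, Sequence

Stack = List[int]
MicroOp = Callable[[Stack], None]


# Hypothetical micro-operations; the thread names no concrete UOP_ codes.
def uop_push1(stack: Stack) -> None:
    stack.append(1)


def uop_dup(stack: Stack) -> None:
    stack.append(stack[-1])


def uop_add(stack: Stack) -> None:
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)


UOPS: Dict[str, MicroOp] = {
    "UOP_PUSH1": uop_push1,
    "UOP_DUP": uop_dup,
    "UOP_ADD": uop_add,
}


def op_invalid(stack: Stack) -> None:
    # Any OP_ code the microcode leaves undefined fails the script.
    raise ValueError("opcode not defined by this microcode")


def compile_op(uop_names: Sequence[str]) -> MicroOp:
    """Fuse a UOP_ sequence into one callable (stand-in for real compilation)."""
    ops = [UOPS[name] for name in uop_names]

    def fused(stack: Stack) -> None:
        for op in ops:
            op(stack)

    return fused


def build_table(microcode: Dict[int, Sequence[str]]) -> List[MicroOp]:
    """Build the 256-entry table whose index is the one-byte OP_ code."""
    table: List[MicroOp] = [op_invalid] * 256
    for opcode, uop_names in microcode.items():
        if not 0 <= opcode <= 255:
            raise ValueError("opcode must fit in one byte")
        table[opcode] = compile_op(uop_names)
    return table


def run_script(table: List[MicroOp], script: bytes, stack: Stack) -> Stack:
    """Execute a script by indexing the table, with no per-step UOP_ expansion."""
    for opcode in script:
        table[opcode](stack)
    return stack


# Hypothetical microcode: 0x51 pushes 1, 0x93 adds, 0xB0 doubles the top item.
EXAMPLE_MICROCODE: Dict[int, Sequence[str]] = {
    0x51: ["UOP_PUSH1"],
    0x93: ["UOP_ADD"],
    0xB0: ["UOP_DUP", "UOP_ADD"],
}
```

Under these assumptions, `run_script(build_table(EXAMPLE_MICROCODE), bytes([0x51, 0x51, 0x93, 0xB0]), [])` pushes 1 twice, adds them, then doubles the result. The sketch leaves out the per-microcode storage cost and the cross-platform cost accounting that aj raises above.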