* [bitcoin-dev] Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks

From: ZmnSCPxj @ 2022-03-22 5:37 UTC
To: bitcoin-dev

Good morning list,

It is entirely possible that I have gotten into the deep end and am now drowning in insanity, but here goes....

Subject: Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks

Introduction
============

Recent (early 2022) discussions on the bitcoin-dev mailing list have largely focused on new constructs that enable new functionality.

One general idea can be summarized this way:

* We should provide a very general language.
* Then later, once we have learned how to use this language, we can softfork in new opcodes that compress sections of programs written in this general language.

There are two arguments against this style:

1. One of the most powerful arguments of the "general" side of the "general v specific" debate is that softforks are painful because people are going to keep reiterating the activation parameters debate in a memoryless process, so we want to keep the number of softforks low.
   * So, we should just provide a very general language and never softfork in any other change ever again.
2. One of the most powerful arguments of the "specific" side of the "general v specific" debate is that softforks are painful because people are going to keep reiterating the activation parameters debate in a memoryless process, so we want to keep the number of softforks low.
   * So, we should just skip over the initial very general language and individually activate small, specific constructs, reducing the needed softforks by one.

By taking a page from microprocessor design, it seems to me that we can use the same general idea (a general base language where we later "bless" some sequence of operations) while avoiding some of the arguments against it.

Digression: Microcodes In CISC Microprocessors
----------------------------------------------

In the 1980s and 1990s, two competing microprocessor design paradigms arose:

* Complex Instruction Set Computing (CISC)
  - Few registers, many addressing/indexing modes, variable instruction length, many obscure instructions.
* Reduced Instruction Set Computing (RISC)
  - Many registers, usually only immediate and indexed addressing modes, fixed instruction length, few instructions.

In CISC, the microprocessor provides very application-specific instructions, often with a small number of registers with specific uses. The instruction set was complicated, and often required multiple specific circuits for each application-specific instruction. Instructions had varying sizes and varying numbers of cycles.

In RISC, the microprocessor provides fewer instructions, and programmers (or compilers) are supposed to generate the code for all application-specific needs. The processor provided large register banks which could be used very generically and interchangeably. Instructions had the same size and every instruction took a fixed number of cycles.

In CISC you usually had shorter code which could be written by human programmers in assembly language or machine language.
In RISC, you generally had longer code, often difficult for human programmers to write, and you *needed* a compiler to generate it (unless you were very careful, or insane enough that you could scroll over multiple pages of instructions without becoming more insane), or else you might forget about stuff like jump slots.

For the most part, RISC lost, since most modern processors today are x86 or x86-64: an instruction set with varying instruction sizes, varying numbers of cycles per instruction, and complex instructions with application-specific uses.

Or at least, it *looks like* RISC lost. In the 90s, Intel was struggling since their big beefy CISC designs were becoming too complicated. Bugs got past testing and into mass-produced silicon. RISC processors were beating the pants off 386s in terms of raw number of computations per second.

RISC processors had the major advantage that they were inherently simpler, due to having fewer specific circuits and filling up their silicon with general-purpose registers (which are large but very simple circuits) to compensate. This meant that processor designers could fit more of the design in their merely human meat brains, and were less likely to make mistakes. The fixed number of cycles per instruction made it trivial to create a fixed-length pipeline for instruction processing, and practical RISC processors could deliver one instruction per clock cycle. Worse (for Intel), the simplicity of RISC meant that smaller and less experienced teams could produce viable competitors to the Intel x86s.

So what Intel did was to use a RISC processor internally, and add a special Instruction Decoder unit. The Instruction Decoder would take the CISC instruction stream accepted by classic Intel x86 processors, and emit RISC instructions for the internal RISC processor. CISC instructions might be variable length and take a variable number of cycles, but the emitted RISC instructions were individually fixed length and fixed number of cycles. A CISC instruction might be equivalent to a single RISC instruction, or several.

With this technique, Intel could deliver performance approaching their RISC-only competition, while retaining back-compatibility with existing software written for their classic CISC processors.

At its core, the Instruction Decoder was a table-driven parser. This lookup table could be stored in on-chip flash memory. This had the advantage that the on-chip flash memory could be updated in case of bugs in the implementation of CISC instructions. This on-chip flash memory was then termed "microcode".

Important advantages of this "microcode" technique were:

* Back-compatibility with existing instruction sets.
* An easier and more scalable underlying design, due to the ability to use RISC techniques while still supporting CISC instruction sets.
* The possibility of fixing bugs in implementations of complex CISC instructions by uploading new microcode.

(Obviously I have elided a bunch of stuff, but the above rough sketch should be sufficient as introduction.)

Bitcoin Consensus Layer As Hardware
-----------------------------------

While Bitcoin fullnode implementations are software, because of the need for consensus, this software is not actually very "soft". One can consider that, just as it would take a long time for new hardware to be designed with a changed instruction set, it is similarly taking a long time to change Bitcoin to support changed feature sets.

Thus, we should really consider the Bitcoin consensus layer, and its SCRIPT, as hardware that other Bitcoin software and layers run on top of.
This opens up the thought of using techniques that were useful in hardware design. Such as microcode: a translation layer from "old" instruction sets to "new" instruction sets, with the ability to modify this mapping.

Microcode For Bitcoin SCRIPT
============================

I propose:

* Define a generic, low-level language (the "RISC language").
* Define a mapping from a specific, high-level language to the above language (the microcode).
* Allow users to sacrifice Bitcoins to define a new microcode.
* Have users indicate the microcode they wish to use to interpret their Tapscripts.

As a concrete example, let us consider the current Bitcoin SCRIPT as the "CISC" language.

We can then support a "RISC" language that is composed of general instructions, such as arithmetic, SECP256K1 scalar and point math, bytevector concatenation, sha256 midstates, bytevector bit manipulation, transaction introspection, and so on. This "RISC" language would also be stack-based. As the "RISC" language would have more possible opcodes, we may need to use 2-byte opcodes for the "RISC" language instead of 1-byte opcodes. Let us call this "RISC" language the micro-opcode language.

Then, the "microcode" simply maps each existing Bitcoin SCRIPT `OP_` code to one or more `UOP_` micro-opcodes.

An interesting fact is that stack-based languages have automatic referential transparency; that is, if I define some new word in a stack-based language and use that word, I can replace verbatim the text of the new word in that place without issue. Compare this to a language like C, where macro authors have to be very careful about inadvertent variable capture, wrapping `do { ... } while(0)` to avoid problems with `if` and multiple statements, multiple execution, and so on.

Thus, a sequence of `OP_` opcodes can be mapped to a sequence of equivalent `UOP_` micro-opcodes without changing the interpretation of the source language, an important property when considering such a "compiled" language.

We start with a default microcode which is equivalent to the current Bitcoin language. When users want to define a new microcode to implement new `OP_` codes or change existing `OP_` codes, they can refer to a "base" microcode, and only have to provide the new mappings.

A microcode is fundamentally just a mapping from an `OP_` code to a variable-length sequence of `UOP_` micro-opcodes.

```Haskell
import qualified Data.Map as Map

-- type Opcode
-- type UOpcode
newtype Microcode = Microcode (Map.Map Opcode [UOpcode])
```

Semantically, the SCRIPT interpreter processes `UOP_` micro-opcodes.

```Haskell
-- instance Monad Interpreter -- can `fail`.
interpreter :: Transaction -> TxInput -> [UOpcode] -> Interpreter ()
```

Example
-------

Suppose a user wants to re-enable `OP_CAT`, and nothing else.

That user creates a microcode, referring to the current default Bitcoin SCRIPT microcode as the "base". The base microcode defines `OP_CAT` as the sequence `UOP_FAIL`, i.e. a micro-opcode that always fails. The new microcode instead redefines `OP_CAT` as the micro-opcode sequence `UOP_CAT`.

Microcodes have a standard way of being represented as a byte sequence. The user serializes their new microcode as a byte sequence.

Then, the user creates a new transaction where one of the outputs contains, say, 1.0 Bitcoins (exact required value TBD), and has a `scriptPubKey` of `OP_TRUE OP_RETURN <serialized_microcode>`. This output is a "microcode introduction output", which is provably unspendable, thus burning the Bitcoins.
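A minimal sketch of this example microcode, assuming a base-plus-overrides shape (`MicrocodeDef`, `defaultMicrocodeHash`, and the constructors here are illustrative only; the actual serialization format is TBD):

```Haskell
import qualified Data.Map as Map

-- type Opcode   -- as above; an OP_CAT :: Opcode constructor is assumed
-- type UOpcode  -- as above; a UOP_CAT :: UOpcode constructor is assumed
-- type Hash256  -- hash of a serialized microcode

-- Illustrative shape only, not a specified format.
data MicrocodeDef = MicrocodeDef
  { mcdBase      :: Hash256                   -- the "base" microcode
  , mcdOverrides :: Map.Map Opcode [UOpcode]  -- only the changed entries
  }

-- Re-enable OP_CAT and nothing else: one override on the default base.
catMicrocode :: MicrocodeDef
catMicrocode = MicrocodeDef
  { mcdBase      = defaultMicrocodeHash  -- assumed well-known constant
  , mcdOverrides = Map.fromList [(OP_CAT, [UOP_CAT])]
  }
```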
(It need not be a single user; multiple users can coordinate by signing a single transaction that commits their funds to the microcode introduction.)

Once the above transaction has been deeply confirmed, the user can take the hash of the microcode serialization. Then the user can use a SCRIPT with `OP_CAT` enabled, by using a Tapscript with, say, version `0xce`, and with the SCRIPT having the microcode hash as its first bytes, followed by the `OP_` codes.

Fullnodes will then process recognized microcode introduction outputs and store mappings from their hashes to the microcodes in a new microcodes index. Fullnodes can then process version-`0xce` Tapscripts by checking whether the microcodes index has the indicated microcode hash.

Semantically, fullnodes take the SCRIPT and, for each `OP_` code in it, expand it to a sequence of `UOP_` micro-opcodes, then concatenate each such sequence (a sketch of this expansion appears at the end of this section). The SCRIPT interpreter then operates over a sequence of `UOP_` micro-opcodes.

Optimizing Microcodes
---------------------

Suppose there is some new microcode that users have published onchain.

We want to be able to execute the defined microcode faster than expanding an `OP_`-code SCRIPT to a `UOP_`-code SCRIPT and having an interpreter loop over the `UOP_`-code SCRIPT.

We can use LLVM.

WARNING: LLVM might not be appropriate for network-facing security-sensitive applications. In particular, LLVM bugs, especially nondeterminism bugs, can lead to consensus divergence and disastrous chainsplits! On the other hand, LLVM bugs are compiler bugs, and the same bugs can hit the static compiler `cc` too, since the same LLVM code runs in both JIT and static compilation, so this risk already exists for Bitcoin. (I.e. we already rely on LLVM not being buggy enough to trigger Bitcoin consensus divergence, else we would have written the Bitcoin Core SCRIPT interpreter in assembly.)

Each `UOP_` code has an equivalent tree of LLVM code. For each `Opcode` in the microcode, we take its sequence of `UOpcode`s and expand them to this tree, concatenating the equivalent trees for each `UOpcode` in the sequence. Then we ask LLVM to JIT-compile this code to a new function, running LLVM-provided optimizers. Then we put a pointer to this compiled function into a 256-entry array of functions, where the array index is the `OP_` code.

The SCRIPT interpreter then simply iterates over the `OP_`-code SCRIPT and calls each of the JIT-compiled functions. This reduces much of the overhead of the `UOP_` layer and makes it approach the current performance of the existing `OP_` interpreter.

For the default Bitcoin SCRIPT, the opcodes array contains pointers to statically-compiled functions. A microcode that is based on the default Bitcoin SCRIPT copies this opcodes array, then overwrites the changed entries.

Future versions of Bitcoin Core can "bless" particular microcodes by providing statically-compiled functions for those microcodes. This leads to even better performance (there is no need to recompile ancient onchain microcodes each time Bitcoin Core starts) without any consensus divergence. It is a pure optimization and does not imply a tightening of rules, and is thus not a softfork.

(To reduce the chance of network faults being used to poke into `W|X` memory (since `W|X` memory is needed in order to actually JIT-compile), we can isolate the SCRIPT interpreter into its own process, separate from the network-facing code. This does imply additional overhead in serializing transactions we want to ask the SCRIPT interpreter to validate.)
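A minimal sketch of the expansion step described above, reusing the `Microcode` type from earlier (the `UOP_FAIL` default for unmapped opcodes is an assumption, mirroring the base microcode's treatment of disabled opcodes):

```Haskell
import qualified Data.Map as Map

-- Expand an OP_-code SCRIPT into the flat UOP_ sequence the interpreter
-- actually runs. Opcodes missing from the map expand to UOP_FAIL here,
-- matching how the base microcode treats disabled opcodes.
expand :: Microcode -> [Opcode] -> [UOpcode]
expand (Microcode m) = concatMap (\op -> Map.findWithDefault [UOP_FAIL] op m)
```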
Comparison To Jets
------------------

This technique allows users to define "jets", i.e. sequences of low-level general operations that users have determined are common enough that they should just be implemented as faster code executed directly by the underlying hardware processor, rather than via a software interpreter. Basically, each redefined `OP_` code is a jet of a sequence of `UOP_` micro-opcodes. We implement this by dynamically JIT-compiling the proposed jets, as described above.

SCRIPTs using jetted code remain smaller, as the jet definition is done in a previous transaction and does not require copy-pasta (Do Not Repeat Yourself!). At the same time, jettification is not tied to developers, thus removing the need to keep softforking new features --- we need only define a sufficiently general language, and then we can implement pretty much anything worth implementing (and a bunch of other things that should not be implemented, but hey, users gonna use...).

Bugs in existing microcodes can be fixed by basing a new microcode on the existing microcode, and redefining the buggy implementation (see the sketch at the end of this message). Existing Tapscripts need to be re-spent to point to the new bugfixed microcode, but if you used the point-spend branch as an N-of-N of all participants, you have an upgrade mechanism for free.

In order to ensure that the JIT-compilation of new microcodes is not triggered trivially, we require that users petitioning for the jettification of some operations (i.e. introducing a new microcode) must sacrifice Bitcoins.

Burning Bitcoins is better than increasing the weight of microcode introduction outputs; all fullnodes are affected by the need to JIT-compile the new microcode, so they benefit from the reduction in supply, thus getting compensated for the work of JIT-compiling the new microcode. Other mechanisms for making microcode introduction outputs expensive are also possible.

Nothing really requires that we use a stack-based language for this; any sufficiently functional language should allow referential transparency.
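A minimal sketch of the base-plus-overrides resolution that the bugfix flow above relies on (`rebase` and its left-biased-union behaviour are illustrative, not a specified algorithm):

```Haskell
import qualified Data.Map as Map

-- Map.union is left-biased, so entries in the new mapping shadow the
-- base's buggy ones; every other opcode is inherited unchanged.
rebase :: Map.Map Opcode [UOpcode]  -- overrides (e.g. the bugfix)
       -> Microcode                 -- resolved base microcode
       -> Microcode
rebase overrides (Microcode base) = Microcode (Map.union overrides base)
```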
* Re: [bitcoin-dev] Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks

From: Russell O'Connor @ 2022-03-22 15:08 UTC
To: ZmnSCPxj, Bitcoin Protocol Discussion

Setting aside my thoughts that something like Simplicity would make a better platform than Bitcoin Script (due to expressions operating on a narrower interface than the entire stack (I'm looking at you, OP_DEPTH)), there is an issue with namespace management.

If I understand correctly, your implication was that once opcodes are redefined by an OP_RETURN transaction, subsequent transactions using that opcode refer to the new microcode. But then we have a race condition between people submitting transactions expecting the outputs to refer to the old code, and having their code redefined by the time they do get confirmed (or worse, having them reorged).

I've partially addressed this issue in my Simplicity design, where the commitment of a Simplicity program in a scriptPubKey covers the hash of the specification of the jets used, which commits unambiguously to the semantics (rightly or wrongly). But the issue resurfaces at redemption time, where I (currently) have a consensus-critical map of codes to jets that is used to decode the witness data into a Simplicity program. If one were to allow this map of codes to jets to be replaced (rather than just extended), then it would cause redemption to fail, because the hash of the new jets would no longer match the hash of the jets appearing in the input's scriptPubKey commitment. While this is still not good and I don't recommend it, it is probably better than letting the semantics of your programs be changed out from under you.

This comment is not meant as an endorsement of this idea, which is a little bit out there, at least as far as Bitcoin is concerned. :)

My long-term plans are to move this consensus-critical map of codes out of the consensus layer and into the p2p layer, where peers can negotiate their own encodings between each other. But that plan is also a little bit out there, and it still doesn't solve the issue of how to weight reused jets, where weight is still consensus-critical.

On Tue, Mar 22, 2022 at 1:37 AM ZmnSCPxj via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:

> [... full quote of the original message trimmed ...]
* Re: [bitcoin-dev] Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks

From: ZmnSCPxj @ 2022-03-22 16:22 UTC
To: Russell O'Connor; +Cc: Bitcoin Protocol Discussion

Good morning Russell,

> Setting aside my thoughts that something like Simplicity would make a better platform than Bitcoin Script (due to expressions operating on a narrower interface than the entire stack (I'm looking at you, OP_DEPTH)), there is an issue with namespace management.
>
> If I understand correctly, your implication was that once opcodes are redefined by an OP_RETURN transaction, subsequent transactions using that opcode refer to the new microcode. But then we have a race condition between people submitting transactions expecting the outputs to refer to the old code, and having their code redefined by the time they do get confirmed (or worse, having them reorged).

No, use of specific microcodes is opt-in: you have to use a specific `0xce` Tapscript version, ***and*** refer to the microcode you want to use via the hash of the microcode.

The only race condition is reorging out a newly-defined microcode. This can be avoided by waiting for deep confirmation of a newly-defined microcode before actually using it. But once the microcode introduction outpoint of a particular microcode has been deeply confirmed, your Tapscript can refer to the microcode, and its meaning does not change.

Fullnodes may need to maintain multiple microcodes, which is why creating new microcodes is expensive; they not only require JIT compilation, they also require that fullnodes keep an index that cannot have items deleted.

The advantage of the microcode scheme is that the size of the SCRIPT can be used as a proxy for CPU load --- just as it is done for current Bitcoin SCRIPT. As long as the number of `UOP_` micro-opcodes that an `OP_` code can expand to is bounded, and we avoid looping constructs, the CPU load is also bounded and the size of the SCRIPT approximates the amount of processing needed; thus microcode does not require a softfork to modify weight calculations in the future.

Regards,
ZmnSCPxj
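A minimal sketch of the bounded-expansion rule described in the message above, checked at microcode-introduction time (the limit constant is purely illustrative, not a proposed value):

```Haskell
import qualified Data.Map as Map

-- Illustrative limit; the real value would be a consensus parameter
-- chosen so SCRIPT size remains a sound proxy for CPU cost.
maxExpansion :: Int
maxExpansion = 200

-- Reject microcodes whose per-opcode expansions exceed the bound.
validMicrocode :: Microcode -> Bool
validMicrocode (Microcode m) = all ((<= maxExpansion) . length) (Map.elems m)
```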
* Re: [bitcoin-dev] Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks

From: Russell O'Connor @ 2022-03-22 16:28 UTC
To: ZmnSCPxj; +Cc: Bitcoin Protocol Discussion

Thanks for the clarification.

You don't think referring to the microcode via its hash, effectively using a 32-byte encoding of opcodes, is still rather long-winded?

On Tue, Mar 22, 2022 at 12:23 PM ZmnSCPxj <ZmnSCPxj@protonmail.com> wrote:

> [... full quote of the previous message trimmed ...]
* Re: [bitcoin-dev] Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks

From: ZmnSCPxj @ 2022-03-22 16:39 UTC
To: Russell O'Connor; +Cc: Bitcoin Protocol Discussion

Good morning Russell,

> Thanks for the clarification.
>
> You don't think referring to the microcode via its hash, effectively using a 32-byte encoding of opcodes, is still rather long-winded?

A microcode is a *mapping* of `OP_` codes to variable-length sequences of `UOP_` micro-opcodes. So a microcode hash refers to an entire language of redefined `OP_` codes, not each individual opcode in the language.

If it costs 1 Bitcoin to create a new microcode, then there are only 21 million possible microcodes, and I think about 50 bits of hash is sufficient to specify those with low probability of collision. We could use a 20-byte RIPEMD160-of-SHA256 instead, for 160 bits; that should be more than sufficient, with enough margin. Though perhaps it is then easier to deliberately attack...

Also, if you have a common SCRIPT whose non-`OP_PUSH` opcodes total more than, say, 32 + 1 bytes (or 20 + 1 if using RIPEMD160), and you can fit their equivalent `UOP_` codes into the maximum limit for a *single* opcode, you can save bytes by redefining some random `OP_` code into the sequence of all those `UOP_` codes. You would have a hash reference to the microcode, and a single byte for the actual "SCRIPT", which is just a jet of the entire SCRIPT.

Users of multiple *different* such SCRIPTs can band together to define a single microcode, mapping their SCRIPTs to different `OP_` codes and sharing the cost of defining the new microcode that shortens all their SCRIPTs.

Regards,
ZmnSCPxj
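A back-of-the-envelope birthday-bound check of the hash-length figures in the message above (using the standard n^2 / 2^(b+1) approximation; illustrative only):

```Haskell
-- Approximate probability of any collision among n random b-bit hashes.
collisionProb :: Double -> Double -> Double
collisionProb n b = n * n / 2 ** (b + 1)

-- collisionProb 21e6 50  ~= 0.2      -- 50 bits, 21 million microcodes
-- collisionProb 21e6 160 ~= 1.5e-34  -- 160 bits: negligible
```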
* Re: [bitcoin-dev] Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks

From: ZmnSCPxj @ 2022-03-22 16:47 UTC
To: Russell O'Connor; +Cc: Bitcoin Protocol Discussion

Good morning again Russell,

> > Thanks for the clarification.
> > You don't think referring to the microcode via its hash, effectively using a 32-byte encoding of opcodes, is still rather long-winded?

For that matter, since an entire microcode represents a language (based on the current OG Bitcoin SCRIPT language), with a little more coordination we could entirely replace Tapscript versions --- every Tapscript version is a slot for a microcode, and the current Tapscript language is just the one in slot `0xc0`. Filled slots cannot be changed, but new microcodes can use some currently-empty Tapscript version slot, and have it properly defined in a microcode introduction outpoint.

Then indication of a microcode would take only one byte, which is already needed currently anyway. That does limit us to only 255 new microcodes, so the cost of one microcode would have to be a good bit higher.

Again, remember: microcodes represent an entire language that is an extension of OG Bitcoin SCRIPT, not individual operations in that language.

Regards,
ZmnSCPxj
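A minimal sketch of the version-slot idea above (all names and types are illustrative):

```Haskell
import qualified Data.Map as Map
import Data.Word (Word8)

-- type Hash256  -- hash of a microcode's serialization

-- Tapscript leaf version -> microcode defining that slot's language.
-- Filled slots never change; a microcode introduction outpoint claims
-- a currently-empty slot.
type VersionSlots = Map.Map Word8 Hash256

lookupLanguage :: VersionSlots -> Word8 -> Maybe Hash256
lookupLanguage = flip Map.lookup
```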
* Re: [bitcoin-dev] Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks

From: Anthony Towns @ 2022-03-22 23:11 UTC
To: ZmnSCPxj, Bitcoin Protocol Discussion

On Tue, Mar 22, 2022 at 05:37:03AM +0000, ZmnSCPxj via bitcoin-dev wrote:
> Subject: Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks

(Have you considered applying a jit or some other compression algorithm to your emails?)

> Microcode For Bitcoin SCRIPT
> ============================
> I propose:
> * Define a generic, low-level language (the "RISC language").

This is pretty much what Simplicity does, if you optimise the low-level language to minimise the number of primitives and maximise the ability to apply tooling to reason about it, which seem like good things for a RISC language to optimise.

> * Define a mapping from a specific, high-level language to
>   the above language (the microcode).
> * Allow users to sacrifice Bitcoins to define a new microcode.

I think you're defining "the microcode" as the "mapping" here.

This is pretty similar to the suggestion Bram Cohen was making a couple of months ago:

  https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-December/019722.html
  https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-January/019773.html
  https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-January/019803.html

I believe this is done in chia via the block being able to include-by-reference prior blocks' transaction generators:

] transactions_generator_ref_list: List[uint32]: A list of block heights of previous generators referenced by this block's generator.

  - https://docs.chia.net/docs/05block-validation/block_format

(That approach comes at the cost of not being able to do full validation if you're running a pruning node. The alternative is to effectively introduce a parallel "utxo" set -- where you're mapping the "sacrificed" BTC as the nValue and instead of just mapping it to a scriptPubKey for a later spend, you're permanently storing the definition of the new CISC opcode.)

> We can then support a "RISC" language that is composed of
> general instructions, such as arithmetic, SECP256K1 scalar
> and point math, bytevector concatenation, sha256 midstates,
> bytevector bit manipulation, transaction introspection, and
> so on.

A language that includes instructions for each operation we can think of isn't very "RISC"... More importantly, it gets straight back to the "we've got a new zk system / ECC curve / ... that we want to include, let's do a softfork" problem you were trying to avoid in the first place.

> Then, the user creates a new transaction where one of
> the outputs contains, say, 1.0 Bitcoins (exact required
> value TBD),

Likely, the "fair" price would be the cost of introducing however many additional bytes to the utxo set that it would take to represent your microcode, and the cost it would take to run jit(your microcode script) if that were a validation function. Both seem pretty hard to manage.

"Ideally", I think you'd want to be able to say "this old microcode no longer has any value, let's forget it, and instead replace it with this new microcode that is much better" -- that way nodes don't have to keep around old useless data, and you've reduced the cost of introducing new functionality.
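A minimal sketch of the parallel-"utxo"-set alternative described a few paragraphs above (types and field names are illustrative):

```Haskell
import qualified Data.Map as Map
import Data.Word (Word64)

-- type Hash256  -- hash of the microcode serialization
-- Microcode as defined in the original proposal.

-- Permanent index entry: the sacrificed value plus the stored
-- definition, keyed by microcode hash rather than by outpoint.
data MicrocodeEntry = MicrocodeEntry
  { sacrificedSats :: Word64
  , definition     :: Microcode
  }

type MicrocodeIndex = Map.Map Hash256 MicrocodeEntry
```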
Additionally, I think it has something of a tragedy-of-the-commons problem: whoever creates the microcode pays the cost, but then anyone can use it and gain the benefit. That might even end up creating centralisation pressure: if you design a highly decentralised L2 system, it ends up expensive because people can't coordinate to pay for the new microcode that would make it cheaper; but if you design a highly centralised L2 system, you can just pay for the microcode yourself and make it even cheaper.

This approach isn't very composable -- if there's a clever opcode defined in one microcode spec, and another one in some other microcode, the only way to use both of them in the same transaction is to burn 1 BTC to define a new microcode that includes both of them.

> We want to be able to execute the defined microcode
> faster than expanding an `OP_`-code SCRIPT to a
> `UOP_`-code SCRIPT and having an interpreter loop
> over the `UOP_`-code SCRIPT.
>
> We can use LLVM.

We've not long ago gone to the effort of removing openssl as a consensus-critical dependency; and likewise previously removed bdb. Introducing a huge new dependency into the definition of consensus seems like an enormous step backwards.

This would also mean we'd be stuck at the performance of whatever version of llvm we initially adopted, as any performance improvements introduced in later llvm versions would be a hard fork.

> On the other hand, LLVM bugs are compiler bugs and
> the same bugs can hit the static compiler `cc`, too,

"Well, you could hit Achilles in the heel, so really, what's the point of trying to be invulnerable anywhere else?"

> Then we put a pointer to this compiled function to a
> 256-long array of functions, where the array index is
> the `OP_` code.

That's a 256-long array of functions for each microcode, which increases the "microcode-utxo" database storage size substantially.

Presuming there are different jit targets (x86 vs arm?), it seems difficult to come up with a consistent interpretation of the cost for these opcodes.

I'm skeptical that a jit would be sufficient for increasing the performance of an implementation just based on basic arithmetic opcodes, if we're talking about something like sha512 or bls12-381 or similar.

> Bugs in existing microcodes can be fixed by basing a
> new microcode from the existing microcode, and
> redefining the buggy implementation.
> Existing Tapscripts need to be re-spent to point to
> the new bugfixed microcode, but if you used the
> point-spend branch as an N-of-N of all participants
> you have an upgrade mechanism for free.

It's not free if you have to do an on-chain spend... The "1 BTC" cost to fix the bug, and the extra storage in every node's "utxo" set because they now have to keep both the buggy and fixed versions around permanently, sure isn't free either. If you're re-jitting every microcode on startup, that could get pretty painful too.

If you're proposing introducing byte vector manipulation and OP_CAT and similar, which enable recursive covenants, then it might be good to explain how this proposal addresses the concerns raised at the end of https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-March/020092.html

Cheers,
aj
* Re: [bitcoin-dev] Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks

From: ZmnSCPxj @ 2022-03-23 0:20 UTC
To: Anthony Towns; +Cc: Bitcoin Protocol Discussion

Good morning aj,

> On Tue, Mar 22, 2022 at 05:37:03AM +0000, ZmnSCPxj via bitcoin-dev wrote:
>
> > Subject: Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks
>
> (Have you considered applying a jit or some other compression algorithm to your emails?)
>
> > Microcode For Bitcoin SCRIPT
> > ============================
> > I propose:
> > - Define a generic, low-level language (the "RISC language").
>
> This is pretty much what Simplicity does, if you optimise the low-level language to minimise the number of primitives and maximise the ability to apply tooling to reason about it, which seem like good things for a RISC language to optimise.
>
> > - Define a mapping from a specific, high-level language to
> >   the above language (the microcode).
> > - Allow users to sacrifice Bitcoins to define a new microcode.
>
> I think you're defining "the microcode" as the "mapping" here.

Yes.

> This is pretty similar to the suggestion Bram Cohen was making a couple of months ago:
>
>   https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2021-December/019722.html
>   https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-January/019773.html
>   https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-January/019803.html
>
> I believe this is done in chia via the block being able to include-by-reference prior blocks' transaction generators:
>
> ] transactions_generator_ref_list: List[uint32]: A list of block heights of previous generators referenced by this block's generator.
>
>   - https://docs.chia.net/docs/05block-validation/block_format
>
> (That approach comes at the cost of not being able to do full validation if you're running a pruning node. The alternative is to effectively introduce a parallel "utxo" set -- where you're mapping the "sacrificed" BTC as the nValue and instead of just mapping it to a scriptPubKey for a later spend, you're permanently storing the definition of the new CISC opcode.)

Yes, the latter is basically what microcode is.

> > We can then support a "RISC" language that is composed of
> > general instructions, such as arithmetic, SECP256K1 scalar
> > and point math, bytevector concatenation, sha256 midstates,
> > bytevector bit manipulation, transaction introspection, and
> > so on.
>
> A language that includes instructions for each operation we can think of isn't very "RISC"... More importantly, it gets straight back to the "we've got a new zk system / ECC curve / ... that we want to include, let's do a softfork" problem you were trying to avoid in the first place.

`libsecp256k1` can run on purely RISC machines like ARM, so saying that a "RISC" set of opcodes cannot implement some arbitrary ECC curve, when the instruction set does not directly support that ECC curve, seems incorrect.

Any new zk system / ECC curve would have to be implementable in C++, so if you have micro-operations that would be needed for it, such as XORing two multi-byte vectors together, multiplying multi-precision numbers, etc., then any new zk system or ECC curve would be implementable in microcode. For that matter, you could re-write `libsecp256k1` there.
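For instance, a minimal sketch of the semantics such a byte-vector-XOR micro-operation could have (the length-mismatch failure convention is an assumption, matching `UOP_FAIL`-style behaviour):

```Haskell
import qualified Data.ByteString as BS
import Data.Bits (xor)

-- XOR two byte vectors of equal length; mismatched lengths fail the
-- script, analogous to UOP_FAIL.
uopXor :: BS.ByteString -> BS.ByteString -> Maybe BS.ByteString
uopXor a b
  | BS.length a == BS.length b = Just (BS.pack (BS.zipWith xor a b))
  | otherwise                  = Nothing
```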
> > Then, the user creates a new transaction where one of
> > the outputs contains, say, 1.0 Bitcoins (exact required
> > value TBD),
>
> Likely, the "fair" price would be the cost of introducing however many additional bytes to the utxo set that it would take to represent your microcode, and the cost it would take to run jit(your microcode script) if that were a validation function. Both seem pretty hard to manage.
>
> "Ideally", I think you'd want to be able to say "this old microcode no longer has any value, let's forget it, and instead replace it with this new microcode that is much better" -- that way nodes don't have to keep around old useless data, and you've reduced the cost of introducing new functionality.

Yes, but that invites "I accidentally the smart contract" behavior.

> Additionally, I think it has something of a tragedy-of-the-commons problem: whoever creates the microcode pays the cost, but then anyone can use it and gain the benefit. That might even end up creating centralisation pressure: if you design a highly decentralised L2 system, it ends up expensive because people can't coordinate to pay for the new microcode that would make it cheaper; but if you design a highly centralised L2 system, you can just pay for the microcode yourself and make it even cheaper.

The same "tragedy of the commons" applies to FOSS: "whoever creates the FOSS pays the cost, but then anyone can use it and gain the benefit". This seems like an argument against releasing FOSS node software. Remember, microcode is software too, and copying software does not suffer a tragedy of the commons --- the main point of a tragedy of the commons is that the commons is *degraded* by use while nobody has an incentive to maintain it against that degradation. But using software does not degrade the software; if I give you a copy of my software, I do not lose my software, which is why FOSS works.

In order to make a highly-decentralized L2, you need to cooperate with total strangers, possibly completely anonymously, in handling your money. I imagine that the level of cooperation needed in, say, the Lightning Network is far above what is necessary to gather funds from multiple people who want a particular microcode to happen, until enough funds have been gathered to make the microcode happen.

For example, create a fresh address for an amount you, personally, are willing to contribute in order to make the microcode happen. (If you are willing to spend the time and energy arguing on bitcoin-dev, then you are willing to contribute, even if others benefit in addition to yourself; that time and energy has a corresponding Bitcoin value.) Then spend it using `SIGHASH_ANYONECANPAY | SIGHASH_SINGLE`, with the microcode introduction output as the single output you are signing. Gather enough such signatures from a community around a decentralized L2, and you can reach the total funds needed for the microcode to happen.

> This approach isn't very composable -- if there's a clever opcode defined in one microcode spec, and another one in some other microcode, the only way to use both of them in the same transaction is to burn 1 BTC to define a new microcode that includes both of them.

Yes, that is indeed a problem.

> > We want to be able to execute the defined microcode
> > faster than expanding an `OP_`-code SCRIPT to a
> > `UOP_`-code SCRIPT and having an interpreter loop
> > over the `UOP_`-code SCRIPT.
> >
> > We can use LLVM.
> We've not long ago gone to the effort of removing openssl as a consensus-critical dependency; and likewise previously removed bdb. Introducing a huge new dependency into the definition of consensus seems like an enormous step backwards.
>
> This would also mean we'd be stuck at the performance of whatever version of llvm we initially adopted, as any performance improvements introduced in later llvm versions would be a hard fork.

Yes, LLVM is indeed the weak link in this idea. We could use NaCl instead, which probably has fewer issues /s.

> > On the other hand, LLVM bugs are compiler bugs and
> > the same bugs can hit the static compiler `cc`, too,
>
> "Well, you could hit Achilles in the heel, so really, what's the point of trying to be invulnerable anywhere else?"

Yes, LLVM is indeed the weak point here. We could just concatenate some C++ code together when a new microcode is introduced, compile it statically, store the resulting binary somewhere, and invoke it at the appropriate time to run validation. At least LLVM would be isolated into its own process in that case.

> > Then we put a pointer to this compiled function to a
> > 256-long array of functions, where the array index is
> > the `OP_` code.
>
> That's a 256-long array of functions for each microcode, which increases the "microcode-utxo" database storage size substantially.
>
> Presuming there are different jit targets (x86 vs arm?), it seems difficult to come up with a consistent interpretation of the cost for these opcodes.
>
> I'm skeptical that a jit would be sufficient for increasing the performance of an implementation just based on basic arithmetic opcodes, if we're talking about something like sha512 or bls12-381 or similar.

Static compilation seems to work well enough --- and JIT vs static is a spectrum, not either/or; the difference is really how much optimization you are willing to spend time on. If microcodes are costly enough that they happen rarely, then using optimizations usually reserved for static compilation seems a reasonable tradeoff.

> > Bugs in existing microcodes can be fixed by basing a
> > new microcode from the existing microcode, and
> > redefining the buggy implementation.
> > Existing Tapscripts need to be re-spent to point to
> > the new bugfixed microcode, but if you used the
> > point-spend branch as an N-of-N of all participants
> > you have an upgrade mechanism for free.
>
> It's not free if you have to do an on-chain spend...
>
> The "1 BTC" cost to fix the bug, and the extra storage in every node's "utxo" set because they now have to keep both the buggy and fixed versions around permanently, sure isn't free either.

Heh, poor word choice. What I meant is that we do not need a separate upgrade mechanism; the design work here is "free". *Using* the upgrade mechanism is costly, and hence not "free".

> If you're re-jitting every microcode on startup, that could get pretty painful too.

When LLVM is used in a static compiler, it writes the resulting code on-disk; I imagine the same mechanism can be used here.

> If you're proposing introducing byte vector manipulation and OP_CAT and similar, which enable recursive covenants, then it might be good to explain how this proposal addresses the concerns raised at the end of https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-March/020092.html

It does not; I am currently exploring and generating ideas, not particularly tying myself to one idea or another.
Regards,
ZmnSCPxj
Thread overview: 8 messages

2022-03-22  5:37 [bitcoin-dev] Beyond Jets: Microcode: Consensus-Critical Jets Without Softforks ZmnSCPxj
2022-03-22 15:08 ` Russell O'Connor
2022-03-22 16:22   ` ZmnSCPxj
2022-03-22 16:28     ` Russell O'Connor
2022-03-22 16:39       ` ZmnSCPxj
2022-03-22 16:47         ` ZmnSCPxj
2022-03-22 23:11 ` Anthony Towns
2022-03-23  0:20   ` ZmnSCPxj