From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 8171095D for ; Mon, 17 Apr 2017 13:25:57 +0000 (UTC) X-Greylist: whitelisted by SQLgrey-1.7.6 Received: from mail-qt0-f173.google.com (mail-qt0-f173.google.com [209.85.216.173]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 7738F14E for ; Mon, 17 Apr 2017 13:25:56 +0000 (UTC) Received: by mail-qt0-f173.google.com with SMTP id c45so100221993qtb.1 for ; Mon, 17 Apr 2017 06:25:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:from:date:message-id:subject:to; bh=h3ppE0L2is3zXAI8Q5S34tl3Yuju8Eqvjp2MknO3UOs=; b=uXHdAst7lVmBBQATTuML2CJnk/PnizCAQIIAoijQAvbeODXaeOJApWC2TrvJt9tcSu 7mqWzF5flIOiDuTQnbsU6i6B+vMy4wn1lXgb2KOz5NsiztrqDiGQKtGWnCHGOkq/o9NO JpybRklWAS38ZZIobZ8TSUXyHkkZKQlCmL6mnSWW60HvCYcS0QQceu+neG2Bo+FWQG+D cuoEddsAp86V8HRYWPJKWv5OI0ObgGt2iC37HTqExioK+nhnF1Z1PF9xZdQgDyZlvgep hsWz0LMaO1jchB+idLEmZSSjjnkpT061FfeTrT/t3kVMs6z2+Ff9VW+GWuQG6flQrta/ FrdA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=h3ppE0L2is3zXAI8Q5S34tl3Yuju8Eqvjp2MknO3UOs=; b=bioSaoaFxMSeDukQ6aWzqn565V/VWfw87NZOZN7JzOy23n9TnJNBVQLWbyU5xGgmoU mEgfnWxBO8zYPvdv1d9BNJBbVCfhlKzLQxLiU3OICJR9aOE6KCRU+s7u0N3gggKeOvRw DVp78ixpKmdR8FGpV+Y8xiX7u2rKAtayeS9wBQTWO7o1Iwi1QU1qqbd953WJbDPJ8udF MCmJtRDs0f+x6mC2F0IVxSaHBUoLIiS6UZxRl/RYV/trsSgHeQwG3gU70Cac1Y6H7QOS rzbl+CaZT7CvLrS2ewOib3FViJzFCVISzPtD3X0VZjO6Beu/ghRSVg/xHWWLsWbKeAcZ OD0A== X-Gm-Message-State: AN3rC/7unHFW3OEjGJL+JqXL0XU7HhbYAEwhAzfRscGQnNkpXg0PkxCv 0CxGnEWn9Zfex47WbjHcgp9uj+PyZ4x5Bd0= X-Received: by 10.200.3.15 with SMTP id q15mr9761337qtg.221.1492435555247; Mon, 17 Apr 2017 06:25:55 -0700 (PDT) MIME-Version: 1.0 Received: by 10.12.176.174 with HTTP; Mon, 17 Apr 2017 06:25:14 -0700 (PDT) From: Sergio Demian Lerner Date: Mon, 17 Apr 2017 10:25:14 -0300 Message-ID: To: bitcoin-dev Content-Type: multipart/alternative; boundary=f4030435c97c1ee9db054d5cbb84 X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, HTML_MESSAGE, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Subject: [bitcoin-dev] Suggestions to improve opcodes with O(N) complexity X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Apr 2017 13:25:57 -0000 --f4030435c97c1ee9db054d5cbb84 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable I came across O(N) behavior of two scripting opcodes, OP_IF and OP_ROLL. By exploiting edge cases for each of these two sub-optimal algorithms, I manage to simulate a Segwit block that takes up to 5.6 seconds to verify on a Ubuntu VM running on a single Core i5 processor. The simulation is based on a single thread executing EvalScript(), the Bitcoin script execution function. The tests were not performed processing actual blocks. These results should not make anyone worry, because there are worse problems in Bitcoin block verification, and because Bitcoin employs several worker threads for verifying scripts in parallel. For example, a Segwit block can request 80000 signature verifications when all transactions are P2WSH. It is said that Bitcoin Core (in a modern multi-core machine, using its multi-threading verification capabilities) can verify 8000 ECDSA signatures per second. Therefore a malicious miner can create a Segwit block that requires approximately 10 seconds to be verified. Since the examples presented in this post consume less than 10 seconds, I don=E2=80=99t consid= er my findings as vulnerabilities. However, if the block size is to be increased in the future, these problems should be considered prior increasing the block size. The scripts presented here as examples do not leave the value stack empty, but the Bitcoin protocol does not require it. Bitcoin only requires the top value to be true to accept the script. OP_IF abuse Every time a Bitcoin script executes the OP_IF opcode, a boolean value indicating if the condition was true, false or the conditional was skipped (also represented as false) is pushed into the vfExec stack. Every time an opcode is executed, the number of false values in the vfExec stack is counted using the following line: bool fExec =3D !count(vfExec.begin(), vfExec.end(), false); If the count is non-zero, all subsequent instructions except OP_ELSE and OP_ENDIF are skipped. It is clear that the longer the conditional stack is, the more it takes to count the false elements. The following scriptPub or ScriptSig exploits this problem: 0 OP_IF { 100 times } 0 { 9798 times } OP_ENDIF { 100 times } 1 The vfExec vector is filled with 100 elements, and then each element is scanned 9799 times, totaling more than 979K items scanned. This took 2.5 seconds in my test VM (for a block filled with these scriptSigs). To re-write this logic with a O(1) algorithm, one simply has to count the number of true conditions in one variable (trueCount), and the number of false or skipped conditions following all true conditions in another (ignoreCount). Detecting if code needs to be executed or not requires just testing if ignoreCount is zero. The handling of OP_IF / OP_NOTIF / OP_ELSE should be like the following pseudo-code: fExec =3D (ignoreCount=3D=3D0); ... case OP_IF: case OP_NOTIF: { if (fExec) { ....compute fValue... if (fValue) trueCount++; else ignoreCount++; } else ignoreCount++; } break; case OP_ELSE: { if ((trueCount=3D=3D0) && (ignoreCount=3D=3D0)) return set_error(serror, SCRIPT_ERR_UNBALANCED_CONDITIONAL); if (ignoreCount=3D=3D0) { trueCount--; ignoreCount++; } else if (ignoreCount=3D=3D1) { trueCount++; ignoreCount--; } } break; case OP_ENDIF: { if ((trueCount=3D=3D0) && (ignoreCount=3D=3D0)) return set_error(serror, SCRIPT_ERR_UNBALANCED_CONDITIONAL); if (ignoreCount>0) ignoreCount--; else trueCount--; } break; You may have noticed the strange behavior of Bitcoin=E2=80=99s ELSE stateme= nt. Bitcoin allows one to switch between true and false conditions several times. For example, the following script is valid and leaves the value 2 on the stack: 1 OP_IF OP_ELSE OP_ELSE 2 OP_ENDIF OP_ROLL The second problem lies in the OP_ROLL opcode. This opcode removes a value at a given index from the value stack, and pushes it on top. As the Bitcoin Core stack stores a list of char std::vector by value (not by reference), and because the stack is itself a std::vector (not a linked list), then removing the first elements requires moving all elements one position in memory. The value stack can store a maximum of 1000 elements. The following script fills the stack and then moves each stack element 200 times, so the number of moved elements is 200K. This took almost 5.6 seconds in my test VM (for a block filled with these scriptSigs). 1 {999 times} 998 OP_ROLL { 200 times } I tried other scripts, such as filling the stack with values of size 520 using DUP3, and then performing rolls, but all of them led to a block that took less time, if the block is to be filled with the scripts. One solution to this problem is use a linked list data structure instead of a std::vector, to allow O(1) removal of items, but it still requires O(N) for element lookup. A balanced tree where each internal node is augmented with the number of children underneath can be used to provide efficient indexed access and efficient element removal. However, the overhead of such data structure may kill its benefits. So it may be the case that optimizing OP_ROLL will never really be required. But these minor issues have to be taken into account if the scripting system is modified in any way. There are many subtle interactions. For instance, it may seem that Segwit allows a transaction having a stack containing 2 million items to verify correctly, by having the witness stack filled with 2M zero values, and by executing an empty witness script. However this is prevented by the cleanstack check. --f4030435c97c1ee9db054d5cbb84 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable

= I came across O(N) behavior of two scripting opcodes, OP_IF and OP_ROLL. By= exploiting edge cases for each of these two sub-optimal algorithms, I mana= ge to simulate a Segwit block that takes up to 5.6 seconds to verify on a U= buntu VM running on a single Core i5 processor. The simulation is based on = a single thread executing EvalScript(), the Bitcoin script execution functi= on. The tests were not performed processing actual blocks. These results sh= ould not make anyone worry, because there are worse problems in Bitcoin blo= ck verification, and because Bitcoin employs several worker threads for ver= ifying scripts in parallel. For example, a Segwit block can request 80000 s= ignature verifications when all transactions are P2WSH. It is said that Bit= coin Core (in a modern multi-core machine, using its multi-threading verifi= cation capabilities) can verify 8000 ECDSA signatures per second. Therefore= a malicious miner can create a Segwit block that requires approximately 10= seconds to be verified. Since the examples presented in this post consume = less than 10 seconds, I don=E2=80=99t consider my findings as vulnerabiliti= es. However, if the block size is to be increased in the future, these prob= lems should be considered prior increasing the block size. The scripts pres= ented here as examples do not leave the value stack empty, but the Bitcoin = protocol does not require it. Bitcoin only requires the top value to be tru= e to accept the script.

OP_IF abuse

Every time a Bitcoin scri= pt executes the OP_IF opcode, a boolean value indicating if the condition w= as true, false or the conditional was skipped (also represented as false) i= s pushed into the vfExec stack.=C2=A0 Every time an opcode is executed, the= number of =C2=A0false values in the vfExec stack is counted using the foll= owing line:

bool fExec =3D !count(vfExec.begin(), vfExec.end(), fals= e);

If the count is non-zero, all subsequent instructions except OP_= ELSE and OP_ENDIF are skipped. It is clear that the longer the conditional = stack is, the more it takes to count the false elements.

The followi= ng scriptPub or ScriptSig exploits this problem:

0
OP_IF { 100 ti= mes }

0 { 9798 times }

OP_ENDIF { 100 times }
1

The= vfExec vector is filled with 100 elements, and then each element is scanne= d 9799 times, totaling more than 979K items scanned. This took 2.5 seconds = in my test VM (for a block filled with these scriptSigs).

To re-writ= e this logic with a O(1) algorithm, one simply has to count the number of t= rue conditions in one variable (trueCount), and the number of false or skip= ped conditions following all true conditions in another (ignoreCount). Dete= cting if code needs to be executed or not requires just testing if ignoreCo= unt is zero.

The handling of OP_IF / OP_NOTIF / OP_ELSE should be li= ke the following pseudo-code:

fExec =3D (ignoreCount=3D=3D0);
...=
case OP_IF:
case OP_NOTIF:
=C2=A0{
=C2=A0 =C2=A0if (fExec)
= =C2=A0 =C2=A0 {
=C2=A0 =C2=A0 =C2=A0....compute fValue...
=C2=A0 =C2= =A0 =C2=A0if (fValue) trueCount++; else ignoreCount++;
=C2=A0 =C2=A0 } e= lse
=C2=A0 =C2=A0 ignoreCount++;
=C2=A0} break;
=C2=A0case OP_ELS= E:
=C2=A0{
=C2=A0if ((trueCount=3D=3D0) && (ignoreCount=3D=3D= 0))
=C2=A0 =C2=A0 =C2=A0return set_error(serror, SCRIPT_ERR_UNBALANCED_C= ONDITIONAL);
=C2=A0if (ignoreCount=3D=3D0) { trueCount--; ignoreCount++;= } else
=C2=A0if (ignoreCount=3D=3D1) { trueCount++; ignoreCount--; }=C2=A0} break;
case OP_ENDIF:
=C2=A0{
=C2=A0 =C2=A0 if ((trueCoun= t=3D=3D0) && (ignoreCount=3D=3D0))
=C2=A0 =C2=A0 =C2=A0 =C2=A0 r= eturn set_error(serror, SCRIPT_ERR_UNBALANCED_CONDITIONAL);
=C2=A0 =C2= =A0 if (ignoreCount>0) ignoreCount--; else trueCount--;
=C2=A0}
= =C2=A0break;

You may have noticed the strange behavior of Bitcoin=E2= =80=99s ELSE statement. Bitcoin allows one to switch between true and false= conditions several times. For example, the following script is valid and l= eaves the value 2 on the stack:

1 OP_IF OP_ELSE OP_ELSE 2 OP_ENDIF
OP_ROLL

The second problem lies in the OP_ROLL opcode. This op= code removes a value at a given index from the value stack, and pushes it o= n top. As the Bitcoin Core stack stores a list of char std::vector by value= (not by reference), and because the stack is itself a std::vector (not a l= inked list), then removing the first elements requires moving all elements = one position in memory. The value stack can store a maximum of 1000 element= s. The following script fills the stack and then moves each stack element 2= 00 times, so the number of moved elements is 200K. This took almost 5.6 sec= onds in my test VM (for a block filled with these scriptSigs).

1 =C2= =A0{999 times}
998 OP_ROLL { 200 times }

I tried other scripts, s= uch as filling the stack with values of size 520 using DUP3, and then perfo= rming rolls, but all of them led to a block that took less time, if the blo= ck is to be filled with the scripts.

One solution to this problem is= use a linked list data structure instead of a std::vector, to allow O(1) r= emoval of items, but it still requires O(N) for element lookup. A balanced = tree where each internal node is augmented with the number of children unde= rneath can be used to provide efficient indexed access and efficient elemen= t removal. However, the overhead of such data structure may kill its benefi= ts.

So it may be the case that optimizing OP_ROLL will never really = be required.=C2=A0

But these minor issues have to be taken into account if the scripting = system is modified in any way. There are many subtle interactions. For inst= ance, it may seem that Segwit allows a transaction having a stack containin= g 2 million items to verify correctly, by having the witness stack filled w= ith 2M zero values, and by executing an empty witness script. However this = is prevented by the cleanstack check.=C2=A0


--f4030435c97c1ee9db054d5cbb84--