From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org [172.17.192.35]) by mail.linuxfoundation.org (Postfix) with ESMTPS id 36D6B9C for ; Thu, 6 Apr 2017 23:37:52 +0000 (UTC) X-Greylist: whitelisted by SQLgrey-1.7.6 Received: from mail-pg0-f50.google.com (mail-pg0-f50.google.com [74.125.83.50]) by smtp1.linuxfoundation.org (Postfix) with ESMTPS id A76F0F0 for ; Thu, 6 Apr 2017 23:37:51 +0000 (UTC) Received: by mail-pg0-f50.google.com with SMTP id x125so49169766pgb.0 for ; Thu, 06 Apr 2017 16:37:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=voskuil-org.20150623.gappssmtp.com; s=20150623; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-transfer-encoding; bh=K7+abdoWkXIK/dNMjmIKBHDsphZy6aVRWxmKKgvPxQM=; b=xm6mXEVTrW7Rz/dqmmsYvxBi23B8xuzzAmoEJ8xtpIOp4MU58KQx15Zi+jPknpcErm dfGEOv2gitqzHKo0L9aUZD+Dn6jZ6/sizJCmCKT04vafVxj5TiP9rrK5hfVg2C9l3Lv8 2MmWP35ZpLiVENB3Lxk38qoOIooHWd/1+G/QPlPst+746V6IiiIpHyepUIjPmm8te/X+ Wsbahw0UrO7IB4k0yJfZLVUUkAnRii7qkW7vvS3eynh/hr4B8X9KpjUT2sgY0bLmy1Eh 0hwxgsb2KwdmUFgwi/udMdoXXSaUua80bMk0fpjGGMu+2a11pxM9daCxgSguKUsCTKaq Ctqw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding; bh=K7+abdoWkXIK/dNMjmIKBHDsphZy6aVRWxmKKgvPxQM=; b=Z47oMnYnNm7d/GbiOWjWQ+T30aDJ5TeNgfpyOwdsTynp/+6/Q7FwU0jJb8bwk5TlTz Dp7Q0iPInyKnRbvDG59dccwl8q1Jy+WDoFtEHU4l1NrO5KAhClrAJd5uu+XT6VGaI1Ah TiZNbuLgGfU5zqPmuCeawxpVPE54mdZMyFcCfcegQNYIbWYHC1BSa6c/SlAT7/Loh+Oe u8fB1O6NsX97cO+/cBG0AnZbbb6/3qiWG+Un9BHk2uxBzXH3Zrvxmwzv6pmQIIH3XdES IpG/xVXycW1shJZvYHaW4yG7QhV9FGeNkICjYSd1L81xm1dNYfDsJxOqb17JOzSWSlm0 izDw== X-Gm-Message-State: AFeK/H3khrqbbwAKvlLUo8dkjtCs7lBXQ6k0t9K7+7BMrzTMneNWHdG6CKjqh0nWpICOBA== X-Received: by 10.99.37.1 with SMTP id l1mr8627579pgl.86.1491521871206; Thu, 06 Apr 2017 16:37:51 -0700 (PDT) Received: from ?IPv6:2601:600:9000:d69e:f4e5:6be7:b661:fcc3? ([2601:600:9000:d69e:f4e5:6be7:b661:fcc3]) by smtp.gmail.com with ESMTPSA id x21sm5617257pfa.71.2017.04.06.16.37.50 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 06 Apr 2017 16:37:50 -0700 (PDT) To: Tomas , Bitcoin Protocol Discussion , Libbitcoin Development References: <1491516747.3791700.936828232.69F82904@webmail.messagingengine.com> From: Eric Voskuil X-Enigmail-Draft-Status: N1110 Message-ID: Date: Thu, 6 Apr 2017 16:38:23 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1 MIME-Version: 1.0 In-Reply-To: <1491516747.3791700.936828232.69F82904@webmail.messagingengine.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org X-Mailman-Approved-At: Thu, 06 Apr 2017 23:46:13 +0000 Subject: Re: [bitcoin-dev] Using a storage engine without UTXO-index X-BeenThere: bitcoin-dev@lists.linuxfoundation.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: Bitcoin Protocol Discussion List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 06 Apr 2017 23:37:52 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 On 04/06/2017 03:12 PM, Tomas via bitcoin-dev wrote: Hi Tomas, > I have been working on a bitcoin implementation that uses a > different approach to indexing for verifying the order of > transactions. Instead of using an index of unspent outputs, double > spends are verified by using a spend-tree where spends are scanned > against spent outputs instead of unspent outputs. This is the approach that genjix used in libbitcoin version2. With the exception of de-linking (not deleted) in the case of reorgs, the entire store is append only, implemented in a small set of memory mapped files. The downsides to the approach are: (1) higher than necessary storage space requirement due to storing the indexing data required for correlate the spends, and (2) higher than necessary validation complexity and cost in terms of computing the spent-ness (including spender height) of an output. His implementation used a hash table, so performance-wise it did quite well and would theoretically outperform a tree, O(1) vs. O(log2(N)). > This allows for much better concurrency, as not only blocks, but > also individual inputs can be verified fully in parallel. I was successful in parallelizing input validation (across the inputs of an unconfirmed tx and across the set of all inputs in a block) using the v2 store. However, it is not the case that the spends approach is necessary for concurrency. To resolve the above two problems the version3 store does not use a spends table/index. Nor does it store any table of UTXOs. Yet validation is highly parallelized. Instead of additional indexes it uses the tx hash table, augmented with 32 bits per output for spender height. So there is a O(1) cost of finding the tx and a O(N) cost of finding the spender height where N is the number of outputs in the tx. But because the number of outputs in a tx is bounded (by block size) this is constant time in the number of transactions. This works out much faster than the spends table, and without the storage cost or complexity disadvantages. It also scales with available hardware, as the memory mapped files become in-memory hash tables. For low memory machines we found it was important to implement an opaque UTXO cache to limit paging, but for higher end systems zero cache is optimal. > I am sharing this not only to ask for your feedback, but also to > call for a clear separation of protocol and implementations: As > this solution, reversing the costs of outputs and inputs, seems to > have excellent performance characteristics (as shown in the test > results), updates to the protocol addressing the UTXO growth, might > not be worth considering *protocol improvements* and it might be > best to address these concerns as implementation details. I don't follow this part, maybe you could clarify. A spends index grows with the size of the spend set (forever) as it cannot be pruned, which certainly exceeds the size of the UTXO set (unless nothing is spent). The advantage is that you don't have to keep rewriting the store when you use a spends set (because the store can be append only). Feel free to message me if you'd like to discuss in more detail, or to continue on the libbitcoin mailing list (copied). e -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAEBCAAGBQJY5tFpAAoJEDzYwH8LXOFOcMgH/2mw5iOvUYNwvZ2z0KKTSUOA Pd8d5mKoWvd94QxhQ+RyTbkEkMhHl75+zcBgRsfUTtZlBIe/Z0+OgVIN6ibEw+WD w7k3HqgQi9gLgydEelxTAX+z3dJ24n4kCCdKAmZbBuK+Yr/7AViugbEqYemKepku pRWZZS74MUvrYesc0xPn4Ao3DTzMjjY0K2mkuqV8jlwdfZjlAQX9pTx+iSCuMhkd HJ8w7s8QnjVnUeOlLe29mZwaFJPyOTLJMqgDE6s2sXacAy5QQbVCatygvDQ8A/wC ktBnKPFb2lGX3bGKu/KwABegBy/hyec+NP0wFR+0MVivCwTK1+SjeHu5MNOSVlM= =tfVj -----END PGP SIGNATURE-----