public inbox for bitcoindev@googlegroups.com
From: Pieter Wuille <pieter.wuille@gmail.com>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: Bitcoin Dev <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] Rolling UTXO set hashes
Date: Tue, 23 May 2017 13:43:45 -0700
Message-ID: <CAPg+sBgW7paof_rLunoDYJL_WXn7mYuHanuka8a_x0oE2LjwSQ@mail.gmail.com>
In-Reply-To: <8760gs2n7v.fsf@rustcorp.com.au>

On Mon, May 22, 2017 at 9:47 PM, Rusty Russell <rusty@rustcorp.com.au> wrote:
> Gregory Maxwell via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> writes:
>> On Tue, May 16, 2017 at 6:17 PM, Pieter Wuille <pieter.wuille@gmail.com> wrote:
>>> just the first - and one that has very low costs and no normative
>>> datastructures at all.
>>
>> The serialization of the txout itself is normative, but very minimal.
>
> I do prefer the (2) approach, BTW, as it reuses existing primitives, but
> I know "simpler" means a different thing to mathier brains :)

Oh, I didn't mean it that way at all. (1) is simpler to get decent
performance out of. Implementing (1) using any language that has big
integer support or can link against GMP is likely going to be faster
than the fastest possible implementation of (2).
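
As a rough illustration of why big-integer support is enough here: a
minimal sketch, assuming (1) refers to the MuHash-style construction
(multiplying per-element numbers modulo a large prime). The modulus
2**3072 - 1103717, the use of SHAKE256 to derive the per-element
number, and the class and function names are all assumptions made for
this example, not something stated in this message.

```python
import hashlib

# Assumption: a 3072-bit prime modulus; this message does not name one.
P = 2**3072 - 1103717

def to_num(element: bytes) -> int:
    """Map an element to a 3072-bit integer (SHAKE256 used as a stand-in)."""
    # 384 bytes = 3072 bits. A result of 0 mod P is negligibly unlikely.
    return int.from_bytes(hashlib.shake_256(element).digest(384), "little") % P

class MuHashSketch:
    """Rolling multiset hash: the product of element numbers modulo P."""

    def __init__(self) -> None:
        self.acc = 1  # the empty set hashes to the multiplicative identity

    def add(self, element: bytes) -> None:
        self.acc = self.acc * to_num(element) % P

    def remove(self, element: bytes) -> None:
        # Removing multiplies by the modular inverse (Python 3.8+).
        self.acc = self.acc * pow(to_num(element), -1, P) % P

    def digest(self) -> bytes:
        # Compress the 3072-bit accumulator to a short final hash.
        return hashlib.sha256(self.acc.to_bytes(384, "little")).digest()
```

Adding and removing are each one modular multiplication (plus one
inversion on removal), and the result does not depend on the order in
which elements were added.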

> Since it wasn't explicit in the proposal, I think the txout information
> placed in the hash here is worth discussing.
>
> I prefer a simple txid||outnumber[1], because it allows simple validation
> without knowing the UTXO set itself; even a lightweight node can assert
> that UTXOhash for block N+1 is valid if the UTXOhash for block N is
> valid (and vice versa!) given block N+1.  And miners can't really use
> that even if they were to try not validating against UTXO (!) because
> they need to know input amounts for fees (which are becoming
> significant).
>
> If I want to hand you the complete validatable UTXO set, I need to hand
> you all the txs with any unspent output, and some bitfield to indicate
> which ones are unspent.

That seems to completely defeat the purpose... if I want to give you a
UTXO set, and prove its correctness wrt the hash you know... I need to
remember the full transactions those outputs came from?

> OTOH, if you serialize more (eg. ...||amount||scriptPubKey ?), then the UTXO
> set size needed to validate the utxohash is a little smaller: you need
> to send the txid, but not the tx nVersion, nLocktime or inputs.  But in a
> SegWit world, that's actually *bigger* AFAICT.

That's an interesting idea, but I believe you're forgetting:
* The size of txin prevout/nsequence, which is typically larger than
  txouts (even when excluding scriptSig/witness data).
* The size of spent txouts for transactions with unspent outputs left.
* The fact that you can deduplicate the txids for transactions that
  have multiple unspent outputs in the UTXO set serialization, even if
  that txid is repeated in the rolling hash computation.
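
To make the last point concrete, here is a small sketch comparing a
flat UTXO list with one grouped by txid. The record layout (32-byte
txid, 4-byte output index, 4-byte per-transaction output count) and
the function names are hypothetical, chosen only for this size
estimate.

```python
from collections import defaultdict

def flat_size(utxos):
    """Bytes needed when every entry repeats its txid (hypothetical layout)."""
    # Each entry: txid (32) + vout (4) + payload bytes.
    return sum(32 + 4 + len(payload) for _txid, _vout, payload in utxos)

def grouped_size(utxos):
    """Bytes needed when each txid is stored once for all of its outputs."""
    by_txid = defaultdict(list)
    for txid, vout, payload in utxos:
        by_txid[txid].append((vout, payload))
    # Per transaction: txid (32) + output count (4),
    # then per output: vout (4) + payload bytes.
    return sum(
        32 + 4 + sum(4 + len(payload) for _vout, payload in outs)
        for outs in by_txid.values()
    )
```

The rolling hash still commits to the txid once per output; only the
transferred serialization shares it.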

The construction I was considering and benchmarking uses the 256-bit
truncated SHA512(256bit txid || 32bit voutindex || 1bit coinbase ||
31bit height || CTxOut output) either as a secp256k1 X coordinate, or
as the key to seed a ChaCha20 PRNG whose output is the 3072-bit MuHash
number. The reason for using SHA512 is that it can process most UTXOs
in a single transformation (as opposed to SHA256, which will almost
always need 2). The reason for using ChaCha20 is that it's incredibly
fast at producing a lot of data once a key is known. An alternative is
using SHAKE256 for the whole construction (as it both takes an
arbitrary amount of data and produces an arbitrary-length hash), but
it's a bit slower.
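
The first step of that construction, and the single-transformation
argument, can be sketched as follows. This is only an illustration
under stated assumptions: the bit packing of coinbase and height into
one 32-bit word, little-endian integer encoding, truncation by keeping
the first 32 bytes of the SHA512 digest, and all function names are
guesses made for the example. The expansion of the key to 3072 bits
with ChaCha20 is left out.

```python
import hashlib
import struct

def compact_size(n: int) -> bytes:
    """Bitcoin's variable-length integer prefix, used for script lengths."""
    if n < 253:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)

def serialize_utxo(txid, vout, coinbase, height, amount, script_pubkey):
    """txid || voutindex || (coinbase, height) || CTxOut, as bytes."""
    assert len(txid) == 32 and 0 <= height < 2**31
    # Assumption: coinbase flag in the lowest bit, height in the other 31.
    packed = (height << 1) | int(coinbase)
    # CTxOut: 8-byte amount, then the length-prefixed scriptPubKey.
    txout = struct.pack("<q", amount) + compact_size(len(script_pubkey)) + script_pubkey
    return txid + struct.pack("<II", vout, packed) + txout

def utxo_key(serialized: bytes) -> bytes:
    """256-bit key: SHA512 truncated to its first 32 bytes (assumption)."""
    return hashlib.sha512(serialized).digest()[:32]

def compressions(msg_len: int, block: int, length_field: int) -> int:
    """Compression-function calls for a message, including padding."""
    # Padding adds one 0x80 byte plus the length field, rounded up to a block.
    return (msg_len + 1 + length_field + block - 1) // block

# A typical pay-to-pubkey-hash scriptPubKey is 25 bytes long.
p2pkh = bytes.fromhex("76a914") + bytes(20) + bytes.fromhex("88ac")
ser = serialize_utxo(bytes(32), 0, False, 467000, 50000, p2pkh)
sha512_calls = compressions(len(ser), 128, 16)  # SHA512: 128-byte blocks
sha256_calls = compressions(len(ser), 64, 8)    # SHA256: 64-byte blocks
```

With this layout a P2PKH entry serializes to 74 bytes, which fits in
one SHA512 block (up to 111 bytes of message) but exceeds the 55 bytes
that fit in a single SHA256 block.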

> Thanks,
> Rusty.
>
> [1] I think you could actually use txid^outnumber, and if that's not a
>     curve point SHA256() again, etc.  But I don't think that saves any
>     real time, and may cause other issues.

That just seems scary to me...

-- 
Pieter


Thread overview: 10+ messages
2017-05-15 20:01 [bitcoin-dev] Rolling UTXO set hashes Pieter Wuille
2017-05-15 20:53 ` Peter R
2017-05-15 23:04 ` ZmnSCPxj
2017-05-15 23:59   ` Gregory Maxwell
2017-05-16  0:15     ` ZmnSCPxj
2017-05-16 11:01     ` Peter Todd
2017-05-16 18:17       ` Pieter Wuille
2017-05-16 18:20         ` Gregory Maxwell
2017-05-23  4:47           ` Rusty Russell
2017-05-23 20:43             ` Pieter Wuille [this message]
