Re: [bitcoin-dev] Optimized Header Sync

public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed

From: Jim Posen <jim.posen@gmail.com>
To: Riccardo Casatta <riccardo.casatta@gmail.com>
Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] Optimized Header Sync
Date: Mon, 2 Apr 2018 22:34:39 -0700	[thread overview]
Message-ID: <CADZtCShrTKqR26RMnUAwK32_7XKNvsCWgfYvQ3Bud2J6r54AKg@mail.gmail.com> (raw)
In-Reply-To: <CADabwBDiT2zNPHZ2=OyvCVrSY3Km2oyTnhRCHyMNsjW2vLMmOg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4563 bytes --]

Thank you for your feedback AJ and Riccardo.

Nice observation about using nBits from every 2016th block as a short
specifier of chain work. You can get some savings from the 4 byte nBits
encoding over VLQ for total chain work as in my spec.

I tried it out on the current chain. At block height 516,387, there are 258
total checkpoints in the response payload with an interval of 2016. The
size of the checkpts message is:

- 9,304 bytes using hash + nBits
- 10,934 bytes using hash + chain work delta encoded as VLQ
- 11,030 bytes using hash + chain work total encoded as VLQ

The saving from using deltas instead of the total seems negligible to me
especially considering the additional computation it requires. Going from
total chain work as VLQ to nBits is a 16% savings in the size of a checkpts
message. According to some rather rough benchmarks, it takes ~3us to
generate the message with nBits versus ~105us to generate each message with
VLQ chain work (including block index lookups and serialization time).

The downside, however, is that the new P2P message would be tightly coupled
to a specific parameter in Bitcoin's consensus protocol, and one that is
changed in many alt chains. Also, it would require that checkpoints can
only be fetched at intervals of 2016, instead of intervals chosen by the
clients. Being able to specify the interval is a very nice property for
longer chains, where a client may select really large intervals, then
bisect that range even further to request a smaller PoW sample (eg. start
by fetching every 10,000th, then every 100th).

Personally, I strongly think using total chain work instead of nBits is the
right tradeoff and is worth the extra 1KB. I'm curious to hear others'
opinions. Note that the checkpoints message is only fetched once per peer
per download from genesis. Subsequent catchups only fetch checkpoints from
the locator fork point. I also don't find the caching argument compelling
-- the time to generate checkpts response messages is fast enough anyway.

I also finally got around to pulling numbers on the space savings from the
nVersion omission. As a reminder of how this works, three bits in the
encoding indicator represent a value 1-7 of the distance in block height
since another block with the same version. Looking at the current Bitcoin
main chain, this is a table of the occurrences of these values:

Height distance # of Blocks
1 469537
2 22301
3 8833
4 4368
5 2633
6 1630
7 1114
                     8+ 5967
You can read this as "469,537 blocks have the same version as their
parent", "22,301 have the same version as their parent's parent", etc.
Given the information in this table, we may consider only allocating 2 bits
in the encoding header rather than 3.

On Fri, Mar 30, 2018 at 1:06 AM, Riccardo Casatta <
riccardo.casatta@gmail.com> wrote:

> Yes, I think the checkpoints and the compressed headers streams should be
> handled in chunks of 2016 headers and queried by chunk number instead of
> height, falling back to current method if the chunk is not full yet.
>
> This is cache friendly and allows to avoid bit 0 and bit 1 in the bitfield
> (because they are always 1 after the first header in the chunk of 2016).
>
> 2018-03-30 8:14 GMT+02:00 Anthony Towns <aj@erisian.com.au>:
>
>> On Thu, Mar 29, 2018 at 05:50:30PM -0700, Jim Posen via bitcoin-dev wrote:
>> > Taken a step further though, I'm really interested in treating the
>> checkpoints
>> > as commitments to chain work [...]
>>
>> In that case, shouldn't the checkpoints just be every 2016 blocks and
>> include the corresponding bits value for that set of blocks?
>>
>> That way every node commits to (approximately) how much work their entire
>> chain has by sending something like 10kB of data (currently), and you
>> could verify the deltas in each node's chain's target by downloading the
>> 2016 headers between those checkpoints (~80kB with the proposed compact
>> encoding?) and checking the timestamps and proof of work match both the
>> old target and the new target from adjacent checkpoints.
>>
>> (That probably still works fine even if there's a hardfork that allows
>> difficulty to adjust more frequently: a bits value at block n*2016 will
>> still enforce *some* lower limit on how much work blocks n*2016+{1..2016}
>> will have to contribute; so will still allow you to estimate how much work
>> will have been done, it may just be less precise than the estimate you
>> could
>> generate now)
>>
>> Cheers,
>> aj
>>
>>
>
>
> --
> Riccardo Casatta - @RCasatta <https://twitter.com/RCasatta>
>

[-- Attachment #2: Type: text/html, Size: 9269 bytes --]

     prev parent reply	other threads:[~2018-04-03  5:34 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-27 23:31 [bitcoin-dev] Optimized Header Sync Jim Posen
2018-03-29  8:17 ` Riccardo Casatta
2018-03-30  0:50   ` Jim Posen
2018-03-30  6:14     ` Anthony Towns
2018-03-30  8:06       ` Riccardo Casatta
2018-04-03  5:34         ` Jim Posen [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CADZtCShrTKqR26RMnUAwK32_7XKNvsCWgfYvQ3Bud2J6r54AKg@mail.gmail.com \
    --to=jim.posen@gmail.com \
    --cc=bitcoin-dev@lists.linuxfoundation.org \
    --cc=riccardo.casatta@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox