public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed
From: Jim Posen <jim.posen@gmail.com>
To: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Subject: [bitcoin-dev] Parallel header download during sync
Date: Fri, 15 Dec 2017 15:55:16 -0800	[thread overview]
Message-ID: <CADZtCSjR8Cd30Ag__+Rr2FV8rdbYY4_bUgGKeEgpoyTjxzM=dw@mail.gmail.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 2911 bytes --]

One of the ideas that Greg Maxwell brought up in the "'Compressed' headers
stream" thread is the possibility of a header sync mechanism that allowed
parallel download from multiple peers. With the current getheaders/headers
semantics, headers must be downloaded sequentially from genesis. In my
testing, I saw that syncing headers directly from a colocated node took <5s
whereas syncing normally from network peers takes ~5 min for me, which goes
to show that 5s is an upper bound on the time to process all headers if
they are locally available. So if we can introduce new p2p messages for
header sync, what would they look like? Here's one idea.

A new getheadersv2 request would include a start height for the range of
headers requested and a commitment to the last block in the chain that you
want to download. Then you find N peers that are all on the same chain,
partition the range of headers from 0 to the chain height minus some
reasonable reorg safety buffer (~6 blocks), and send download requests in
parallel. So how do we know that the peers are on the same chain and that
their headers served connect into this chain?

When you connect to outbound peers and are in IBD, you will query them for
a Merkle Mountain Range commitment to all headers up to a height X (which
is 6ish blocks before their start height from the version message). Then
you choose the commitment that the majority of the queried peers sent (or
some other heuristic), and these become your download peers. Every
getheadersv2 request includes the start height, X, and the chain
commitment. The headersv2 response messages include all of the headers
followed by a merkle branch linking the last header into the chain
commitment. Headers are processed in order as they arrive and if any of the
headers are invalid, you can ban/disconnect all peers that committed to it,
drop the buffer of later headers and start over.

That's the basic idea. Here are other details:

- This would require an additional 32-byte MMR commitment for each header
in memory.
- When a node receives a headersv2 request and constructs a merkle proof
for the last header, it checks against the sent commitment. In the case of
a really deep reorg, that check would fail, and the node can instead
respond with an updated commitment hash for that height.
- Another packet is needed, getheaderchain or something, that a syncing
peer first sends along with a header locator and an end height. The peer
responds with headerchain, which includes the last common header from the
locator along with the chain commitment at that height and a merkle branch
proving inclusion of that header in the chain.
- Nodes would cache chain commitments for the last ~20 blocks (somewhat
arbitrary), and refuse to serve chain commitments for heights before that.

Thoughts? This is a pretty recycled idea, so please point me at prior
proposals that are similar as well.

-jimpo

[-- Attachment #2: Type: text/html, Size: 3136 bytes --]

                 reply	other threads:[~2017-12-15 23:55 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CADZtCSjR8Cd30Ag__+Rr2FV8rdbYY4_bUgGKeEgpoyTjxzM=dw@mail.gmail.com' \
    --to=jim.posen@gmail.com \
    --cc=bitcoin-dev@lists.linuxfoundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox