From: Erik Aronesty <erik@q32.com>
To: Danny Thorpe <danny.thorpe@gmail.com>
Cc: Bitcoin Dev <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] Small Nodes: A Better Alternative to Pruned Nodes
Date: Thu, 20 Apr 2017 11:50:24 -0400 [thread overview]
Message-ID: <CAJowKg++8GD3gE15pdwe0Bj-L0A6MAzG0_uTSLASaRT9yVb1aQ@mail.gmail.com> (raw)
In-Reply-To: <CAJN5wHW=p+q+DT9R=uheLxOjKBX=xcB+fOZR2KACgJO9SdXypw@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 8628 bytes --]
Try to find 1TB dedicated server hosting ...
If you want to set up an ecommerce site somewhere besides your living room,
storage costs are still a concern.
On Mon, Apr 17, 2017 at 3:11 AM, Danny Thorpe via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:
> 1TB HDD is now available for under $40 USD. How is the 100GB storage
> requirement preventing anyone from setting up full nodes?
>
> On Apr 16, 2017 11:55 PM, "David Vorick via bitcoin-dev" <
> bitcoin-dev@lists.linuxfoundation.org> wrote:
>
>> *Rationale:*
>>
>> A node that stores the full blockchain (I will use the term archival
>> node) requires over 100GB of disk space, which I believe is one of the most
>> significant barriers to more people running full nodes. And I believe the
>> ecosystem would benefit substantially if more users were running full nodes.
>>
>> The best alternative today to storing the full blockchain is to run a
>> pruned node, which keeps only the UTXO set and throws away already verified
>> blocks. The operator of the pruned node is able to enjoy the full security
>> benefits of a full node, but is essentially leeching the network, as they
>> performed a large download likely without contributing anything back.
>>
>> This puts more pressure on the archival nodes, as the archival nodes need
>> to pick up the slack and help new nodes bootstrap to the network. As the
>> pressure on archival nodes grows, fewer people will be able to actually run
>> archival nodes, and the situation will degrade. The situation would likely
>> become problematic quickly if bitcoin-core were to ship with the defaults
>> set to a pruned node.
>>
>> Even further, the people most likely to care about saving 100GB of disk
>> space are also the people least likely to care about some extra bandwidth
>> usage. For datacenter nodes, and for nodes doing lots of bandwidth, the
>> bandwidth is usually the biggest cost of running the node. For home users
>> however, as long as they stay under their bandwidth cap, the bandwidth is
>> actually free. Ideally, new nodes would be able to bootstrap from nodes
>> that do not have to pay for their bandwidth, instead of needing to rely on
>> a decreasing percentage of heavy-duty archival nodes.
>>
>> I have (perhaps incorrectly) identified disk space consumption as the
>> most significant factor in your average user choosing to run a pruned node
>> or a lite client instead of a full node. The average user is not typically
>> too worried about bandwidth, and is also not typically too worried about
>> initial blockchain download time. But the 100GB hit to your disk space can
>> be a huge psychological factor, especially if your hard drive only has
>> 500GB available in the first place, and 250+ GB is already consumed by
>> other files you have.
>>
>> I believe that improving the disk usage situation would greatly benefit
>> decentralization, especially if it could be done without putting pressure
>> on archival nodes.
>>
>> *Small Nodes Proposal:*
>>
>> I propose an alternative to the pruned node that does not put undue
>> pressure on archival nodes, and would be acceptable and non-risky to ship
>> as a default in bitcoin-core. For lack of a better name, I'll call this new
>> type of node a 'small node'. The intention is that bitcoin-core would
>> eventually ship 'small nodes' by default, such that the expected amount of
>> disk consumption drops from today's 100+ GB to less than 30 GB.
>>
>> My alternative proposal has the following properties:
>>
>> + Full nodes only need to store ~20% of the blockchain
>> + With very high probability, a new node will be able to recover the
>> entire blockchain by connecting to 6 random small node peers.
>> + An attacker that can eliminate a chosen+ 95% of the full nodes running
>> today will be unable to prevent new nodes from downloading the full
>> blockchain, even if the attacker is also able to eliminate all archival
>> nodes. (assuming all nodes today were small nodes instead of archival nodes)
>>
>> Method:
>>
>> A small node will pick an index [5, 256). This index is that node's
>> permanent index. When storing a block, instead of storing the full block,
>> the node will use Reed-Solomon coding to erasure code the block using a
>> 5-of-256 scheme. The result will be 256 pieces that are 20% of the size of
>> the block each. The node picks the piece that corresponds to its index, and
>> stores that instead. (Indexes 0-4 are reserved for archival nodes -
>> explained later)
>>
>> The node is now storing a fragment of every block. Alone, this fragment
>> cannot be used to recover any piece of the blockchain. However, when paired
>> with any 5 unique fragments (fragments of the same index will not be
>> unique), the full block can be recovered.
>>
>> Nodes can optionally store more than 1 fragment each. At 5 fragments, the
>> node becomes a full archival node, and the chosen indexes should be 0-4.
>> This is advantageous for the archival node as the encoded data for the
>> first 5 indexes will actually be identical to the block itself - there is
>> no computational overhead for selecting the first indexes. There is also no
>> need to choose random indexes, because the full block can be recovered no
>> matter which indexes are chosen.
>>
>> When connecting to new peers, the indexes of each peer needs to be known.
>> Once peers totaling 5 unique indexes are discovered, blockchain download
>> can begin. Connecting to just 5 small node peers provides a >95% chance of
>> getting 5 uniques, with exponentially improving odds of success as you
>> connect to more peers. Connecting to a single archive node guarantees that
>> any gaps can be filled.
>>
>> A good encoder should be able to turn a block into a 5-of-256 piece set
>> in under 10 milliseconds using a single core on a standard consumer
>> desktop. This should not slow down initial blockchain download
>> substantially, though the overhead is more than a rounding error.
>>
>> *DoS Prevention:*
>>
>> A malicious node may provide garbage data instead of the actual piece.
>> Given just the garbage data and 4 other correct pieces, it is impossible
>> (best I know anyway) to tell which piece is the garbage piece.
>>
>> One option in this case would be to seek out an archival node that could
>> verify the correctness of the pieces, and identify the malicious node.
>>
>> Another option would be to have the small nodes store a cryptographic
>> checksum of each piece. Obtaining the cryptographic checksum for all 256
>> pieces would incur a nontrivial amount of hashing (post segwit, as much as
>> 100MB of extra hashing per block), and would require an additional ~4kb of
>> storage per block. The hashing overhead here may be prohibitive.
>>
>> Another solution would be to find additional pieces and brute-force
>> combinations of 5 until a working combination was discovered. Though this
>> sounds nasty, it should take less than five seconds of computation to find
>> the working combination given 5 correct pieces and 2 incorrect pieces. This
>> computation only needs to be performed once to identify the malicious peers.
>>
>> I also believe that alternative erasure coding schemes exist which
>> actually are able to identify the bad pieces given sufficient good pieces,
>> however I don't know if they have the same computational performance as the
>> best Reed-Solomon coding implementations.
>>
>> *Deployment:*
>>
>> Small nodes are completely useless unless the critical mass of 5 pieces
>> can be obtained. The first version that supports small node block downloads
>> should default everyone to an archival node (meaning indexes 0-4 are used)
>>
>> Once there are enough small-node-enabled archive nodes, the default can
>> be switched so that nodes only have a single index by default. In the first
>> few days, when there are only a few small nodes, the previously-deployed
>> archival nodes can help fill in the gaps, and the small nodes can be useful
>> for blockchain download right away.
>>
>> ----------------------------------
>>
>> This represents a non-trivial amount of code, but I believe that the
>> result would be a non-trivial increase in the percentage of users running
>> full nodes, and a healthier overall network.
>>
>> _______________________________________________
>> bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>>
>>
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>
>
[-- Attachment #2: Type: text/html, Size: 9997 bytes --]
next prev parent reply other threads:[~2017-04-20 15:50 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-04-17 6:54 [bitcoin-dev] Small Nodes: A Better Alternative to Pruned Nodes David Vorick
2017-04-17 7:11 ` Danny Thorpe
2017-04-17 7:27 ` David Vorick
2017-04-20 15:50 ` Erik Aronesty [this message]
2017-04-20 23:42 ` Aymeric Vitte
2017-04-21 13:35 ` David Kaufman
2017-04-21 15:58 ` Leandro Coutinho
2017-04-17 10:14 ` Aymeric Vitte
2017-04-19 17:30 ` David Vorick
2017-04-20 9:46 ` Tom Zander
2017-04-20 20:32 ` Andrew Poelstra
2017-04-21 8:27 ` Tom Zander
2017-04-20 11:27 ` Aymeric Vitte
2017-04-18 7:43 ` Jonas Schnelli
2017-04-18 10:50 ` Tom Zander
2017-04-18 13:07 ` Tier Nolan
2017-04-18 23:19 ` Aymeric Vitte
2017-04-19 4:28 ` udevNull
2017-04-19 13:47 ` Angel Leon
2017-04-21 20:38 ` Gregory Maxwell
2017-04-23 16:27 ` Aymeric Vitte
2017-05-03 14:03 ` Erik Aronesty
2017-05-03 19:10 ` Natanael
2017-05-03 22:45 ` Aymeric Vitte
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAJowKg++8GD3gE15pdwe0Bj-L0A6MAzG0_uTSLASaRT9yVb1aQ@mail.gmail.com \
--to=erik@q32.com \
--cc=bitcoin-dev@lists.linuxfoundation.org \
--cc=danny.thorpe@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox