* Re: [Bitcoin-development] Proposed additional options for pruned nodes
2015-05-12 19:03 ` Gregory Maxwell
@ 2015-05-12 19:24 ` gabe appleton
2015-05-12 19:38 ` Jeff Garzik
2015-05-12 22:00 ` [Bitcoin-development] " Tier Nolan
2015-05-13 5:19 ` Daniel Kraft
2 siblings, 1 reply; 19+ messages in thread
From: gabe appleton @ 2015-05-12 19:24 UTC (permalink / raw)
To: Gregory Maxwell; +Cc: Bitcoin Dev
Points 0, 1, 3, 4, 5, and 6 can be solved by handling chunks chronologically, i.e.,
by giving the sender-signed hashes of the first and last block in your range.
This is less data-dense than the idea above, but it might work better.
That said, this is likely a less secure way to do it. To improve on that,
a node could request a block of random height within that range and verify
it, but that violates point 2. And the scheme itself definitely violates
point 7.
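A minimal sketch of that range advertisement, under stated assumptions: `block_hash` is a stand-in for a chain-index lookup, and the sender's signature step is omitted.

```python
import hashlib
import random

def block_hash(height):
    # Stand-in for looking up the header hash at a given height; a real
    # node would read this from its chain index (and the sender would
    # sign the advertisement, which is omitted here).
    return hashlib.sha256(str(height).encode()).hexdigest()

def make_range_ad(first_height, last_height):
    """Advertise a stored chunk by the hashes of its first and last blocks."""
    return {
        "first": (first_height, block_hash(first_height)),
        "last": (last_height, block_hash(last_height)),
    }

def spot_check(ad, fetch_block):
    """Probe one random height inside the advertised range and verify it.
    This is the extra round trip that conflicts with point 2."""
    first_h = ad["first"][0]
    last_h = ad["last"][0]
    probe = random.randint(first_h, last_h)
    return fetch_block(probe) == block_hash(probe)
```

The spot check is what makes the scheme less O(1): verifying an advertisement costs an additional fetch per peer.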
On May 12, 2015 3:07 PM, "Gregory Maxwell" <gmaxwell@gmail.com> wrote:
> It's a little frustrating to see this just repeated without even
> paying attention to the desirable characteristics from the prior
> discussions.
>
> Summarizing from memory:
>
> (0) Block coverage should have locality; historical blocks are
> (almost) always needed in contiguous ranges. Having random peers
> with totally random blocks would be horrific for performance; as you'd
> have to hunt down a working peer and make a connection for each block
> with high probability.
>
> (1) Block storage on nodes with a fraction of the history should not
> depend on believing random peers; because listening to peers can
> easily create attacks (e.g. someone could break the network; by
> convincing nodes to become unbalanced) and not useful-- it's not like
> the blockchain is substantially different for anyone; if you're to the
> point of needing to know coverage to fill then something is wrong.
> Gaps would be handled by archive nodes, so there is no reason to
> increase vulnerability by doing anything but behaving uniformly.
>
> (2) The decision to contact a node should need O(1) communications,
> not just because of the delay of chasing around just to find who has
> someone; but because that chasing process usually makes the process
> _highly_ sybil vulnerable.
>
> (3) The expression of what blocks a node has should be compact (e.g.
> not a dense list of blocks) so it can be rumored efficiently.
>
> (4) Figuring out what block (ranges) a peer has given should be
> computationally efficient.
>
> (5) The communication about what blocks a node has should be compact.
>
> (6) The coverage created by the network should be uniform, and should
> remain uniform as the blockchain grows; ideally you shouldn't need
> to update your state to know what blocks a peer will store in the
> future, assuming that it doesn't change the amount of data it's
> planning to use. (What Tier Nolan proposes sounds like it fails this
> point)
>
> (7) Growth of the blockchain shouldn't cause much (or any) need to
> refetch old blocks.
>
> I've previously proposed schemes which come close but fail one of the
> above.
>
> (e.g. a scheme based on reservoir sampling that gives uniform
> selection of contiguous ranges, communicating only 64 bits of data to
> know what blocks a node claims to have, remaining totally uniform as
> the chain grows, without any need to refetch -- but needs O(height)
> work to figure out what blocks a peer has from the data it
> communicated; or another scheme based on consistent hashes that has
> log(height) computation; but sometimes may result in a node needing to
> go refetch an old block range it previously didn't store-- creating
> re-balancing traffic.)
>
> So far something that meets all those criteria (and/or whatever ones
> I'm not remembering) has not been discovered; but I don't really think
> much time has been spent on it. I think it's very likely possible.
>
>
> ------------------------------------------------------------------------------
> One dashboard for servers and applications across Physical-Virtual-Cloud
> Widest out-of-the-box monitoring support with 50+ applications
> Performance metrics, stats and reports that give you Actionable Insights
> Deep dive visibility with transaction tracing using APM Insight.
> http://ad.doubleclick.net/ddm/clk/290420510;117567292;y
> _______________________________________________
> Bitcoin-development mailing list
> Bitcoin-development@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bitcoin-development
>
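The reservoir-sampling scheme sketched in the quoted parenthetical can be illustrated with a toy model (an illustration of the general technique under assumed parameters, not Maxwell's actual construction): a node's 64-bit seed replays a reservoir draw over all contiguous k-block windows, giving a uniform and mostly stable window choice at the cost of O(height) work to evaluate.

```python
import random

def stored_window_start(seed, height, k):
    """Replay a reservoir-style draw keyed by a 64-bit seed: return the
    start height of the k-block contiguous window this node stores at
    the given chain height.  The choice is uniform over all possible
    windows and mostly stable as the chain grows, but recomputing it
    from scratch costs O(height) -- the drawback noted above."""
    rng = random.Random(seed)
    start = 0
    for h in range(k + 1, height + 1):
        # Each new block adds one new candidate window (starting at
        # h - k); adopting it with probability 1/(h - k + 1) keeps the
        # selection uniform over all h - k + 1 candidates.
        if rng.random() < 1.0 / (h - k + 1):
            start = h - k
    return start
```

Only the seed needs to be communicated, and the window start rarely moves as `height` grows, but a peer must replay the whole loop to learn what a node stores.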
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Bitcoin-development] Proposed additional options for pruned nodes
2015-05-12 19:24 ` gabe appleton
@ 2015-05-12 19:38 ` Jeff Garzik
2015-05-12 19:43 ` gabe appleton
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: Jeff Garzik @ 2015-05-12 19:38 UTC (permalink / raw)
To: gabe appleton; +Cc: Bitcoin Dev
One general problem is that security is weakened when an attacker can DoS a
small part of the chain by DoS'ing a small number of nodes - yet the impact
is a network-wide DoS because nobody can complete a sync.
On Tue, May 12, 2015 at 12:24 PM, gabe appleton <gappleto97@gmail.com>
wrote:
> [earlier messages quoted in full; trimmed]
--
Jeff Garzik
Bitcoin core developer and open source evangelist
BitPay, Inc. https://bitpay.com/
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Bitcoin-development] Proposed additional options for pruned nodes
2015-05-12 19:38 ` Jeff Garzik
@ 2015-05-12 19:43 ` gabe appleton
2015-05-12 21:30 ` [Bitcoin-development] [Bulk] " gb
2015-05-12 20:02 ` [Bitcoin-development] " Gregory Maxwell
[not found] ` <CAFVoEQTdmCSRAy3u26q5oHdfvFEytZDBfQb_fs_qttK15fiRmg@mail.gmail.com>
2 siblings, 1 reply; 19+ messages in thread
From: gabe appleton @ 2015-05-12 19:43 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Bitcoin Dev
Yet this holds true under our current assumptions about the network as well:
that it will become a collection of pruned nodes with a few storage nodes.
A hybrid option makes this better, because it spreads the risk rather than
concentrating it in the full nodes.
On May 12, 2015 3:38 PM, "Jeff Garzik" <jgarzik@bitpay.com> wrote:
> One general problem is that security is weakened when an attacker can DoS
> a small part of the chain by DoS'ing a small number of nodes - yet the
> impact is a network-wide DoS because nobody can complete a sync.
>
> [earlier messages quoted in full; trimmed]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Bitcoin-development] [Bulk] Re: Proposed additional options for pruned nodes
2015-05-12 19:43 ` gabe appleton
@ 2015-05-12 21:30 ` gb
0 siblings, 0 replies; 19+ messages in thread
From: gb @ 2015-05-12 21:30 UTC (permalink / raw)
To: gabe appleton; +Cc: Bitcoin Dev
This seems like a good place to add an idea I had about
partially-connected nodes that can throttle their bandwidth demands.
While we will have partial-blockchain nodes with a spectrum of
storage options, the requirement to be connected is somewhat binary. I
think many users already throttle manually by turning nodes on/off, with
a minimum of just keeping the chain up to date. A throttling option would
leverage Bitcoin's asynchronous design to reduce bandwidth demands on
weaker nodes.
So, throttling to allow for a spectrum of bandwidth connectivity:
1) an option for the user, -throttle=XXX, that would allow the user to
specify a desired total bandwidth XXX in Gbytes/day the Bitcoin client
can use.
2) the client reduces the number of continuous connections, or its
transaction and block relaying, to achieve the desired throttling rate.
3) it could do this by being partially connected throughout the duty
cycle, or by cycling the node on/off for a percentage of a 24(?) hr period.
4) have an auto setting where some smart traffic management 'just takes
care of it', plus manual settings that can be user-configured.
5) a reduced minimum requirement: as long as it has received a full copy
of all blocks in any 24(?) hr period, it remains fully validating.
Not sure if anyone has brought such an idea forward before, or if there are
obvious holes, so pre-emptive apologies for time-wasting if so.
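A rough sketch of how such a rolling daily budget might be tracked (the `-throttle` flag and the class below are illustrative, not an actual Bitcoin client option):

```python
class BandwidthThrottle:
    """Toy rolling 24-hour bandwidth budget for the -throttle=XXX idea
    above: record traffic as it happens and go dormant for the rest of
    the window once the budget is spent."""
    DAY_SECONDS = 24 * 60 * 60

    def __init__(self, daily_budget_bytes):
        self.budget = daily_budget_bytes
        self.events = []  # (timestamp, n_bytes) pairs

    def record(self, now, n_bytes):
        self.events.append((now, n_bytes))

    def used(self, now):
        # Only traffic inside the trailing 24 h window counts.
        return sum(b for t, b in self.events if now - t < self.DAY_SECONDS)

    def should_accept_traffic(self, now):
        # Point 5 above would still require keeping up with new blocks,
        # so a real implementation would exempt block download at the
        # tip from this check.
        return self.used(now) < self.budget
```

The auto setting in point 4 would amount to choosing `daily_budget_bytes` from observed link quality rather than from user input.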
On Tue, 2015-05-12 at 15:43 -0400, gabe appleton wrote:
> [earlier messages quoted in full; trimmed]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Bitcoin-development] Proposed additional options for pruned nodes
2015-05-12 19:38 ` Jeff Garzik
2015-05-12 19:43 ` gabe appleton
@ 2015-05-12 20:02 ` Gregory Maxwell
2015-05-12 20:10 ` Jeff Garzik
[not found] ` <CAFVoEQTdmCSRAy3u26q5oHdfvFEytZDBfQb_fs_qttK15fiRmg@mail.gmail.com>
2 siblings, 1 reply; 19+ messages in thread
From: Gregory Maxwell @ 2015-05-12 20:02 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Bitcoin Dev
On Tue, May 12, 2015 at 7:38 PM, Jeff Garzik <jgarzik@bitpay.com> wrote:
> One general problem is that security is weakened when an attacker can DoS a
> small part of the chain by DoS'ing a small number of nodes - yet the impact
> is a network-wide DoS because nobody can complete a sync.
It might be more interesting to think of that attack as a bandwidth
exhaustion DoS attack on the archive nodes... if you can't get a copy
without them, that's where you'll go.
So the question arises: does the option make some nodes that would
have been archive nodes not be? Probably some -- but would it do so by
enough to offset the gain of additional copies of the data when those
attacks are not going on? I suspect not.
It's also useful to give people incremental ways to participate even
when they can't swallow the whole pill, or to provide the
resource that's cheap for them to provide. In particular, if there are
only two kinds of full nodes -- archive and pruned -- then the archive
nodes take both a huge disk and bandwidth cost; whereas if there are
fractional nodes, then archives take low(er) bandwidth unless the
fractionals get DoS attacked.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Bitcoin-development] Proposed additional options for pruned nodes
2015-05-12 20:02 ` [Bitcoin-development] " Gregory Maxwell
@ 2015-05-12 20:10 ` Jeff Garzik
2015-05-12 20:41 ` gabe appleton
2015-05-12 20:47 ` Gregory Maxwell
0 siblings, 2 replies; 19+ messages in thread
From: Jeff Garzik @ 2015-05-12 20:10 UTC (permalink / raw)
To: Gregory Maxwell; +Cc: Bitcoin Dev
True. Part of the issue rests on the block sync horizon/cliff. There is a
value X: the number of recent blocks that covers the sync needs of the 90th
percentile of nodes. It is sufficient for the [semi-]pruned nodes to
keep X blocks, after which nodes must fall back to archive nodes for older
data.
There is simply far, far more demand for recent blocks, and the demand for
old blocks very rapidly falls off.
There was even a more radical suggestion years ago - refuse to sync if too
old (>2 weeks?), and force the user to download ancient data via torrent.
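A sketch of how that value X could be computed from measurements (the input list of per-node sync depths is hypothetical; no such telemetry exists in the client):

```python
def sync_horizon(depths_needed, percentile=0.90):
    """Given, for each observed syncing node, how many recent blocks it
    needed from peers, return X: the depth that would let `percentile`
    of nodes sync entirely from [semi-]pruned peers keeping the last X
    blocks, with only the remainder falling back to archive nodes."""
    ranked = sorted(depths_needed)
    idx = min(len(ranked) - 1, int(percentile * len(ranked)))
    return ranked[idx]
```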
On Tue, May 12, 2015 at 1:02 PM, Gregory Maxwell <gmaxwell@gmail.com> wrote:
> [quoted message trimmed]
--
Jeff Garzik
Bitcoin core developer and open source evangelist
BitPay, Inc. https://bitpay.com/
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Bitcoin-development] Proposed additional options for pruned nodes
2015-05-12 20:10 ` Jeff Garzik
@ 2015-05-12 20:41 ` gabe appleton
2015-05-12 20:47 ` Gregory Maxwell
1 sibling, 0 replies; 19+ messages in thread
From: gabe appleton @ 2015-05-12 20:41 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Bitcoin Dev
I suppose this raises two questions:
1) why not have a partial archive store the most recent X% of the
blockchain by default?
2) why not include some sort of torrent support in the Qt client, to
mitigate this risk? I don't think this is necessarily a good idea, but I'd
like to hear the reasoning.
On May 12, 2015 4:11 PM, "Jeff Garzik" <jgarzik@bitpay.com> wrote:
> [quoted messages trimmed]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Bitcoin-development] Proposed additional options for pruned nodes
2015-05-12 20:10 ` Jeff Garzik
2015-05-12 20:41 ` gabe appleton
@ 2015-05-12 20:47 ` Gregory Maxwell
1 sibling, 0 replies; 19+ messages in thread
From: Gregory Maxwell @ 2015-05-12 20:47 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Bitcoin Dev
On Tue, May 12, 2015 at 8:10 PM, Jeff Garzik <jgarzik@bitpay.com> wrote:
> True. Part of the issue rests on the block sync horizon/cliff. There is a
> value X which is the average number of blocks the 90th percentile of nodes
> need in order to sync. It is sufficient for the [semi-]pruned nodes to keep
> X blocks, after which nodes must fall back to archive nodes for older data.
Prior discussion had things like "the definition of pruned means you
have and will serve at least the last 288 blocks from your tip" (which
is what I put in the pruned service BIP text), and another flag for "I
have at least the last 2016". (2016 should be reevaluated -- it was
just a round number near where sipa's old data showed the fetch
probability flatlined.)
That data was old, but what it showed was that the probability of a
block being fetched vs. depth looked like an exponential drop-off (I
think with a 50% point at around 3 days), plus a constant low
probability -- which is probably what we should have expected.
> There was even a more radical suggestion years ago - refuse to sync if too
> old (>2 weeks?), and force the user to download ancient data via torrent.
I'm not fond of this; it makes the system dependent on centralized
services (e.g. trackers and sources of torrents). A torrent also cannot
handle fractional copies very efficiently, nor can it efficiently grow
over time. Bitcoin should be complete -- plus, many nodes already have
the data.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Bitcoin-development] Proposed additional options for pruned nodes
2015-05-12 19:03 ` Gregory Maxwell
2015-05-12 19:24 ` gabe appleton
@ 2015-05-12 22:00 ` Tier Nolan
2015-05-12 22:09 ` gabe appleton
2015-05-13 5:19 ` Daniel Kraft
2 siblings, 1 reply; 19+ messages in thread
From: Tier Nolan @ 2015-05-12 22:00 UTC (permalink / raw)
Cc: Bitcoin Dev
On Tue, May 12, 2015 at 8:03 PM, Gregory Maxwell <gmaxwell@gmail.com> wrote:
>
> (0) Block coverage should have locality; historical blocks are
> (almost) always needed in contiguous ranges. Having random peers
> with totally random blocks would be horrific for performance; as you'd
> have to hunt down a working peer and make a connection for each block
> with high probability.
>
> (1) Block storage on nodes with a fraction of the history should not
> depend on believing random peers; because listening to peers can
> easily create attacks (e.g. someone could break the network; by
> convincing nodes to become unbalanced) and not useful-- it's not like
> the blockchain is substantially different for anyone; if you're to the
> point of needing to know coverage to fill then something is wrong.
> Gaps would be handled by archive nodes, so there is no reason to
> increase vulnerability by doing anything but behaving uniformly.
>
> (2) The decision to contact a node should need O(1) communications,
> not just because of the delay of chasing around just to find who has
> someone; but because that chasing process usually makes the process
> _highly_ sybil vulnerable.
>
> (3) The expression of what blocks a node has should be compact (e.g.
> not a dense list of blocks) so it can be rumored efficiently.
>
> (4) Figuring out what block (ranges) a peer has given should be
> computationally efficient.
>
> (5) The communication about what blocks a node has should be compact.
>
> (6) The coverage created by the network should be uniform, and should
> remain uniform as the blockchain grows; ideally you shouldn't need to
> update your state to know what blocks a peer will store in the future,
> assuming that it doesn't change the amount of data it's planning to
> use. (What Tier Nolan proposes sounds like it fails this point)
>
> (7) Growth of the blockchain shouldn't cause much (or any) need to
> refetch old blocks.
>
M = 1,000,000
N = number of "starts"
S(0) = hash(seed) mod M
...
S(n) = hash(S(n-1)) mod M
This generates a sequence of start points. If the start point is less than
the block height, then it counts as a hit.
The node stores the 50MB of data starting at the block at height S(n).
As the blockchain increases in size, new starts will be less than the block
height. This means some other runs would be deleted.
A weakness is that it is random with regards to block heights. Tiny blocks
have the same priority as larger blocks.
0) Blocks are local, in 50MB runs
1) Agreed, nodes should download headers-first (or some other compact way
of finding the highest POW chain)
2) M could be fixed, N and the seed are all that is required. The seed
doesn't have to be that large. If 1% of the blockchain is stored, then 16
bits should be sufficient so that every block is covered by seeds.
3) N is likely to be less than 2 bytes and the seed can be 2 bytes
4) A 1% cover of 50GB of blockchain would have 10 starts @ 50MB per run.
That is 10 hashes. They don't even necessarily need to be cryptographic
hashes
5) Isn't this the same as 3?
6) Every block has the same odds of being included. There inherently needs
to be an update when a node deletes some info due to exceeding its cap. N
can be dropped one run at a time.
7) When new starts drop below the tip height, N can be decremented and that
one run is deleted.
There would need to be a special rule to ensure the low height blocks are
covered. Nodes should keep the first 50MB of blocks with some probability
(10%?)
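The start-point scheme above can be sketched in a few lines (a hypothetical illustration; the use of SHA-256 and the byte encodings are assumptions, since the proposal only says "hash"):

```python
import hashlib

M = 1_000_000  # fixed modulus from the proposal


def starts(seed: bytes, n: int, m: int = M) -> list:
    """Generate the chained start points S(0..n-1), S(k) = hash(S(k-1)) mod m."""
    out = []
    s = int.from_bytes(hashlib.sha256(seed).digest(), "big") % m
    for _ in range(n):
        out.append(s)
        s = int.from_bytes(hashlib.sha256(s.to_bytes(4, "big")).digest(), "big") % m
    return out


def hits(seed: bytes, n: int, height: int) -> list:
    """Start points below the current block height count as hits; the node
    stores the 50MB run of blocks beginning at each hit."""
    return [s for s in starts(seed, n) if s < height]
```

Any peer that knows the seed and N can recompute the same hit list locally, which is what makes the advertisement compact.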
* Re: [Bitcoin-development] Proposed additional options for pruned nodes
2015-05-12 22:00 ` [Bitcoin-development] " Tier Nolan
@ 2015-05-12 22:09 ` gabe appleton
0 siblings, 0 replies; 19+ messages in thread
From: gabe appleton @ 2015-05-12 22:09 UTC (permalink / raw)
To: Tier Nolan; +Cc: Bitcoin Dev
This is exactly the sort of solution I was hoping for. It seems this is the
minimal modification to make it work, and, if someone was willing to work
with me, I would love to help implement this.
My only concern is that if the --max-size flag is not included, then
this delivers significantly less benefit to the end user. Still a good
chunk, but possibly not enough.
On May 12, 2015 6:03 PM, "Tier Nolan" <tier.nolan@gmail.com> wrote:
>
>
> On Tue, May 12, 2015 at 8:03 PM, Gregory Maxwell <gmaxwell@gmail.com>
> wrote:
>
>>
>> (0) Block coverage should have locality; historical blocks are
>> (almost) always needed in contiguous ranges. Having random peers
>> with totally random blocks would be horrific for performance; as you'd
>> have to hunt down a working peer and make a connection for each block
>> with high probability.
>>
>> (1) Block storage on nodes with a fraction of the history should not
>> depend on believing random peers; because listening to peers can
>> easily create attacks (e.g. someone could break the network; by
>> convincing nodes to become unbalanced) and not useful-- it's not like
>> the blockchain is substantially different for anyone; if you're to the
>> point of needing to know coverage to fill then something is wrong.
>> Gaps would be handled by archive nodes, so there is no reason to
>> increase vulnerability by doing anything but behaving uniformly.
>>
>> (2) The decision to contact a node should need O(1) communications,
>> not just because of the delay of chasing around just to find who has
>> someone; but because that chasing process usually makes the process
>> _highly_ sybil vulnerable.
>>
>> (3) The expression of what blocks a node has should be compact (e.g.
>> not a dense list of blocks) so it can be rumored efficiently.
>>
>> (4) Figuring out what block (ranges) a peer has given should be
>> computationally efficient.
>>
>> (5) The communication about what blocks a node has should be compact.
>>
>> (6) The coverage created by the network should be uniform, and should
>> remain uniform as the blockchain grows; ideally you shouldn't need to
>> update your state to know what blocks a peer will store in the future,
>> assuming that it doesn't change the amount of data it's planning to
>> use. (What Tier Nolan proposes sounds like it fails this point)
>>
>> (7) Growth of the blockchain shouldn't cause much (or any) need to
>> refetch old blocks.
>>
>
> M = 1,000,000
> N = number of "starts"
>
> S(0) = hash(seed) mod M
> ...
> S(n) = hash(S(n-1)) mod M
>
> This generates a sequence of start points. If the start point is less
> than the block height, then it counts as a hit.
>
> The node stores the 50MB of data starting at the block at height S(n).
>
> As the blockchain increases in size, new starts will be less than the
> block height. This means some other runs would be deleted.
>
> A weakness is that it is random with regards to block heights. Tiny
> blocks have the same priority as larger blocks.
>
> 0) Blocks are local, in 50MB runs
> 1) Agreed, nodes should download headers-first (or some other compact way
> of finding the highest POW chain)
> 2) M could be fixed, N and the seed are all that is required. The seed
> doesn't have to be that large. If 1% of the blockchain is stored, then 16
> bits should be sufficient so that every block is covered by seeds.
> 3) N is likely to be less than 2 bytes and the seed can be 2 bytes
> 4) A 1% cover of 50GB of blockchain would have 10 starts @ 50MB per run.
> That is 10 hashes. They don't even necessarily need to be cryptographic
> hashes
> 5) Isn't this the same as 3?
> 6) Every block has the same odds of being included. There inherently
> needs to be an update when a node deletes some info due to exceeding its
> cap. N can be dropped one run at a time.
> 7) When new starts drop below the tip height, N can be decremented and
> that one run is deleted.
>
> There would need to be a special rule to ensure the low height blocks are
> covered. Nodes should keep the first 50MB of blocks with some probability
> (10%?)
>
>
> _______________________________________________
> Bitcoin-development mailing list
> Bitcoin-development@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bitcoin-development
>
>
* Re: [Bitcoin-development] Proposed additional options for pruned nodes
2015-05-12 19:03 ` Gregory Maxwell
2015-05-12 19:24 ` gabe appleton
2015-05-12 22:00 ` [Bitcoin-development] " Tier Nolan
@ 2015-05-13 5:19 ` Daniel Kraft
2015-05-13 9:34 ` Tier Nolan
2 siblings, 1 reply; 19+ messages in thread
From: Daniel Kraft @ 2015-05-13 5:19 UTC (permalink / raw)
To: bitcoin-development
Hi all!
On 2015-05-12 21:03, Gregory Maxwell wrote:
> Summarizing from memory:
In the context of this discussion, let me also restate an idea I've
proposed in Bitcointalk for this. It is probably not perfect and could
surely be adapted (I'm interested in that), but I think it meets
most/all of the criteria stated below. It is similar to the idea with
"start points", but gives O(log height) instead of O(height) for
determining which blocks a node has.
Let me for simplicity assume that the node wants to store 50% of all
blocks. It is straight-forward to extend the scheme so that this is
configurable:
1) Create some kind of "seed" that can be compact and will be sent to
other peers to define which blocks the node has. Use it to initialise a
PRNG of some sort.
2) Divide the range of all blocks into intervals with exponentially
growing size. I. e., something like this:
1, 1, 2, 2, 4, 4, 8, 8, 16, 16, ...
With this, only O(log height) intervals are necessary to cover height
blocks.
3) Using the PRNG, *one* of the two intervals of each length is
selected. The node stores these blocks and discards the others.
(Possibly keeping the last 200 or 2,016 or whatever blocks additionally.)
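A minimal sketch of steps 1-3 (the PRNG choice and the handling of the final, truncated interval are assumptions not fixed by the proposal):

```python
import random


def intervals(height: int) -> list:
    """Partition blocks [0, height) into intervals of sizes 1,1,2,2,4,4,...
    (the last interval is truncated at the tip)."""
    out, start, size, rep = [], 0, 1, 0
    while start < height:
        end = min(start + size, height)
        out.append((start, end))
        start = end
        rep += 1
        if rep == 2:
            rep, size = 0, size * 2
    return out


def kept_ranges(seed: int, height: int) -> list:
    """From each pair of equal-sized intervals, keep the one picked by a
    PRNG initialised from the node's seed (roughly 50% of all blocks).
    A still-incomplete final interval is kept unconditionally here."""
    rng = random.Random(seed)
    ivs = intervals(height)
    kept = []
    for i in range(0, len(ivs), 2):
        pair = ivs[i:i + 2]
        kept.append(pair[rng.randrange(len(pair))])
    return kept
```

Because the same seed always reproduces the same choices, a peer only needs the seed to compute which contiguous ranges a node holds, in O(log height) work.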
> (0) Block coverage should have locality; historical blocks are
> (almost) always needed in contiguous ranges. Having random peers
> with totally random blocks would be horrific for performance; as you'd
> have to hunt down a working peer and make a connection for each block
> with high probability.
You get contiguous block ranges (with at most O(log height) "breaks").
Also ranges of newer blocks are longer, which may be an advantage if
those blocks are needed more often.
> (1) Block storage on nodes with a fraction of the history should not
> depend on believing random peers; because listening to peers can
> easily create attacks (e.g. someone could break the network; by
> convincing nodes to become unbalanced) and not useful-- it's not like
> the blockchain is substantially different for anyone; if you're to the
> point of needing to know coverage to fill then something is wrong.
> Gaps would be handled by archive nodes, so there is no reason to
> increase vulnerability by doing anything but behaving uniformly.
With my proposal, each node determines randomly and on its own which
blocks to store. No believing anyone.
> (2) The decision to contact a node should need O(1) communications,
> not just because of the delay of chasing around just to find who has
> someone; but because that chasing process usually makes the process
> _highly_ sybil vulnerable.
Not exactly sure what you mean by that, but I think that's fulfilled.
You can (locally) compute in O(log height) from a node's seed whether or
not it has the blocks you need. This needs only communication about the
node's seed.
> (3) The expression of what blocks a node has should be compact (e.g.
> not a dense list of blocks) so it can be rumored efficiently.
See above.
> (4) Figuring out what block (ranges) a peer has given should be
> computationally efficient.
O(log height). Not O(1), but that's probably not a big issue.
> (5) The communication about what blocks a node has should be compact.
See above.
> (6) The coverage created by the network should be uniform, and should
> remain uniform as the blockchain grows; ideally it you shouldn't need
> to update your state to know what blocks a peer will store in the
> future, assuming that it doesn't change the amount of data its
> planning to use. (What Tier Nolan proposes sounds like it fails this
> point)
Coverage will be uniform if the seed is created randomly and the PRNG
has good properties. No need to update the seed if the other node's
fraction is unchanged. (Not sure if you suggest for nodes to define a
"fraction" or rather an "absolute size".)
> (7) Growth of the blockchain shouldn't cause much (or any) need to
> refetch old blocks.
No need to do that with the scheme.
What do you think about this idea? Some random thoughts from myself:
*) I need to formulate it in a more general way so that the fraction can
be arbitrary and not just 50%. This should be easy to do, and I can do
it if there's interest.
*) It is O(log height) and not O(1), but that should not be too
different for the heights that are relevant.
*) Maybe it would be better / easier to not use the PRNG at all; just
decide to *always* use the first or the second interval with a given
size. Not sure about that.
*) With the proposed scheme, the node's actual fraction of stored blocks
will vary between 1/2 and 2/3 (if I got the mathematics right, it is
still early) as the blocks come in. Not sure if that's a problem. I
can do a precise analysis of this property for an extended scheme if you
are interested in it.
Yours,
Daniel
--
http://www.domob.eu/
OpenPGP: 1142 850E 6DFF 65BA 63D6 88A8 B249 2AC4 A733 0737
Namecoin: id/domob -> https://nameid.org/?name=domob
--
Done: Arc-Bar-Cav-Hea-Kni-Ran-Rog-Sam-Tou-Val-Wiz
To go: Mon-Pri
* Re: [Bitcoin-development] Proposed additional options for pruned nodes
2015-05-13 5:19 ` Daniel Kraft
@ 2015-05-13 9:34 ` Tier Nolan
0 siblings, 0 replies; 19+ messages in thread
From: Tier Nolan @ 2015-05-13 9:34 UTC (permalink / raw)
Cc: Bitcoin Dev
On Wed, May 13, 2015 at 6:19 AM, Daniel Kraft <d@domob.eu> wrote:
> 2) Divide the range of all blocks into intervals with exponentially
> growing size. I. e., something like this:
>
> 1, 1, 2, 2, 4, 4, 8, 8, 16, 16, ...
>
Interesting. This can be combined with the system I suggested.
A node broadcasts 3 pieces of information
Seed (16 bits): This is the seed
M_bits_lsb (1 bit): Used to indicate M during a transition
N (7 bits): This is the count of the last range held (or partially held)
M = 1 << M_bits
M should be set to the lowest power of 2 greater than double the block
chain height
That gives M = 1 million at the moment. During changing M, some nodes will
be using the higher M and others will use the lower M.
The M_bits_lsb field allows those to be distinguished.
As the block height approaches 512k, nodes can begin to upgrade. For a
period around block 512k, some nodes could use M = 1 million and others
could use M = 2 million.
Assuming M is around 3 times the block height, the odds of a start being
less than the block height are around 35%. If the runs grow by 25% each
step, then that is approximately a doubling for each hit.
Size(n) = ((4 + (n & 0x3)) << (n >> 2)) * 2.5MB
This gives an exponential increase, but groups of 4 are linearly
interpolated.
*Size(0) = 10 MB*
Size(1) = 12.5MB
Size(2) = 15 MB
Size(3) = 17.5MB
Size(4) = 20MB
*Size(5) = 25MB*
Size(6) = 30MB
Size(7) = 35MB
*Size(8) = 40MB*
Start(n) = Hash(seed + n) mod M
A node should store as much of its last run as possible. Assume starts
0, 5, and 8 were "hits" but the node had a max size of 60MB. It can
store runs 0 and 5 and have 25MB left. That isn't enough to store all of
run 8, but it should store 25MB of the blocks in run 8 anyway.
Size(127) = pow(2, 31) * 17.5MB = 35,840 TB (127 is the largest run
index, since N is a 7-bit field)
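The size and start formulas can be checked directly (the concrete hash function here is an assumption; the proposal just says Hash):

```python
import hashlib


def run_size_mb(n: int) -> float:
    """Size(n) = ((4 + (n & 3)) << (n >> 2)) * 2.5 MB: exponential growth,
    linearly interpolated within groups of four."""
    return ((4 + (n & 0x3)) << (n >> 2)) * 2.5


def start(seed: int, n: int, m: int) -> int:
    """Start(n) = Hash(seed + n) mod M."""
    data = (seed + n).to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % m
```

Evaluating run_size_mb at n = 0, 5, 8 reproduces the 10MB, 25MB, and 40MB entries in the table above.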
Decreasing N only causes previously accepted runs to be invalidated.
When a node approaches a transition point for N, it would select a block
height within 25,000 of the transition point. Once it reaches that block,
it will begin downloading the new runs that it needs. When updating, it
can set N to zero. This spreads out the upgrade (over around a year), with
only a small number of nodes upgrading at any time.
New nodes should use the higher M, if near a transition point (say within
100,000).
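The three advertised fields (16-bit seed, 1-bit M_bits_lsb, 7-bit N) fit in three bytes; a hypothetical packing (the field order and byte layout are assumptions):

```python
def pack_ad(seed: int, m_bits_lsb: int, n: int) -> bytes:
    """Pack the advertisement into 3 bytes: a 16-bit seed, then one byte
    holding the M parity bit (high bit) and the 7-bit run count N."""
    assert 0 <= seed < 1 << 16 and m_bits_lsb in (0, 1) and 0 <= n < 1 << 7
    return seed.to_bytes(2, "big") + bytes([(m_bits_lsb << 7) | n])


def unpack_ad(data: bytes) -> tuple:
    """Recover (seed, M_bits_lsb, N) from the 3-byte advertisement."""
    seed = int.from_bytes(data[:2], "big")
    return seed, data[2] >> 7, data[2] & 0x7F
```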