Re: [Bitcoin-development] Segmented Block Relaying BIP draft.

public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
@ 2012-09-11 19:07 Matthew Mitchell
  2012-09-11 19:42 ` Gregory Maxwell
  0 siblings, 1 reply; 18+ messages in thread
From: Matthew Mitchell @ 2012-09-11 19:07 UTC (permalink / raw)
  To: bitcoin-development

For some reason sourceforge is not sending me updates anymore but I can see the replies online…

There could be a slightly more simple protocol which gives all the transactions hashes and nodes can then download the transactions separately. However there are two problems:

1. Downloading all the transactions individually might be inefficient. My proposal will allow nodes to request multiple transactions at once.
2. Why not add a few additional components to the protocol to allow requests for any level of the merkle tree? It's not very complicated at all and protects against the future.

Sure, analysis needs to be done to see at what point the proposal would give benefit and I will hopefully get around to doing some measurements of peer behaviour to aid with this.

I think it's a good idea to think ahead rather than only do what is beneficial for the network currently. The block sizes at the moment are about 0.1MB but what if the bitcoin demand starts pushing that into megabytes? And yes the ~0.95MB limit needs to be changed in order for bitcoin to grow that far. Why would the limit not be lifted? How will bitcoin demand be satisfied other than having large fees to deter transactions, hoping the fees are large enough to balance the demand with the block size limits to prevent many transactions being unconfirmed and annoying users? That limit has got to go eventually. And then it could be that block sizes do become large enough to worry about the performance in relaying.

Best not to leave this to the last minute, so at the very least I think it's good to talk about this.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-11 19:07 [Bitcoin-development] Segmented Block Relaying BIP draft Matthew Mitchell
@ 2012-09-11 19:42 ` Gregory Maxwell
  2012-09-11 21:48   ` Matthew Mitchell
  0 siblings, 1 reply; 18+ messages in thread
From: Gregory Maxwell @ 2012-09-11 19:42 UTC (permalink / raw)
  To: Matthew Mitchell; +Cc: bitcoin-development

On Tue, Sep 11, 2012 at 3:07 PM, Matthew Mitchell
<matthewmitchell@godofgod.co.uk> wrote:
> For some reason sourceforge is not sending me updates anymore but I can see the replies online…
>
> There could be a slightly more simple protocol which gives all the transactions hashes and nodes can then download the transactions separately. However there are two problems:
>
> 1. Downloading all the transactions individually might be inefficient. My proposal will allow nodes to request multiple transactions at once.

Someone can do that just by pipelining the one at a time requests.
How much bandwidth do you think you could save over that?

> 2. Why not add a few additional components to the protocol to allow requests for any level of the merkle tree? It's not very complicated at all and protects against the future.

I don't see what value this provides.  For protecting against the
future you might as well suggest uploading x86 code which gets
executed to select transactions. "Protects against the future".  Can
you clarify some more about exactly how you think it would help?

It's sometimes desirable to be more general rather than more special
case when it's costless... but having couple extra p2p protocol
messages to implement, test for interop, guard against vulnerability,
etc. isn't costless... and should be justified with concrete benefits.

it's not clear to me how your proposal is really all that useful for
very large blocks: I looks like it would lot of bytes sending
redundant tree data.

>The block sizes at the moment are about 0.1MB but what if the bitcoin demand starts pushing that into megabytes?

And what if? _Bitcoin_ blocksizes can't be any larger.  Some future
incompatible system? well perhaps. But we're working on the protocol
for bitcoin now.

> And yes the ~0.95MB limit needs to be changed in order for bitcoin to grow that far. Why would the limit not be lifted? How will bitcoin demand be satisfied other than having large fees to deter transactions, hoping the fees are large enough to balance the demand with the block size limits to prevent many transactions being unconfirmed and annoying users? That limit has got to go eventually. And then it could be that block sizes do become large enough to worry about the performance in relaying.

The finite size— and ultimately the contention for space it causes— is
the only thing will creates non-trivial fees. Without the fees there
is no honest economic motivation to mine with adequate computing power
to provide security (lots of dishonest motivations— e.g. applying
control over the currency exist), you'd just have a race to the
bottom, given unbounded block sizes it is always rational for
decentralized to include any transaction with a fee even if it is very
small— otherwise the next rational solver is just going to include it.

Bitcoin gets its value through scarcity. There are two kinds of
scarcity that are economically important, scarcity of the coins— there
will never be more than 21 million— and scarcity of the block space
which, as the protocol is defined and enforced by every node can not
be more than 1MB. The latter scarcity is what makes the security model
economically sane.

Fortunately, its perfectly possible to make transactions denominated
in bitcoin outside of the blockchain, and in a secure and distributed
manner that respects the principles that make bitcoin attractive, but
with information hiding that improves privacy, transaction speed, and
scalability. See, e.g. the good work being done by Open transactions
to create distributed cryptographic banks.  So blockchain scarcity
itself doesn't prevent Bitcoin from being a one world currency
(something which isn't at all sane no matter how big you make the
blocks if you don't allow for other modes of transaction processing—
who the heck wants to possibly wait an hour to get a 1 confirm
sodapop??).

From the beginning it was obvious that confirmations would eventually
be slower (or expensive to make merely slow; Bitcoin is incapable of
reliable fast confirmations)— thats the nature the stochastic
consensus and the fee based security support.  You could instead
imagine a future where bitcoin's security came by collusion by major
financial cartels and governments, and where fees aren't important....
 But I reject that future, it's a perfectly viable one, but why bother
with Bitcoins in the first place? To make some early adopters a little
bit of money starting off the next big centrally controlled fiat?
Boring.

I can't say for sure if the 1MB limit will stay exactly as is forever,
as I expect the economics work with any limit out of a fairly broad
range that is low enough to both make the space seriously scarce and
low enough that 'inexpensive' (e.g. privately owned) hardware can
continue to audit it to preserve the decentralized security,  and the
economic importance of the size limit is more subtle than the
inflation resistance... but I know that changing it is precisely as
technically difficult as changing the 21 million limit: all Bitcoin
node software must be replaced with incompatible software, and I
believe it would be just as economically risky— if not more so— if
done wrong, as at least inflation would have a easily understood
direct dillution effect while inadequate security would potentially
make all Bitcoin worthless.  As such I don't think it's even worth
discussing until there is an urgent demand to clarify the tradeoffs...

Should the block size ever be increased the message format used for
relaying the larger blocks will be the smallest of the issues being
discussed.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-11 19:42 ` Gregory Maxwell
@ 2012-09-11 21:48   ` Matthew Mitchell
  2012-09-11 23:22     ` Gregory Maxwell
  0 siblings, 1 reply; 18+ messages in thread
From: Matthew Mitchell @ 2012-09-11 21:48 UTC (permalink / raw)
  To: Gregory Maxwell, bitcoin-development

On 11 Sep 2012, at 20:42, Gregory Maxwell <gmaxwell@gmail.com> wrote:

> Someone can do that just by pipelining the one at a time requests.
> How much bandwidth do you think you could save over that?

You wouldn't need to pipeline the requests, just place more than one inventory vector in get data, right? Well my messages would save the space of those inventory vectors. Instead of needing 36 byte inventory vectors for each transaction and a var int, you would need two var ints only. And then the transaction responses only need one header, so you save 24 bytes for each transaction after the first. You could say that is a small benefit.

> I don't see what value this provides.  For protecting against the
> future you might as well suggest uploading x86 code which gets
> executed to select transactions. "Protects against the future".  Can
> you clarify some more about exactly how you think it would help?

Well it depends on wether you seriously think bitcoin blocks should be limited at a million bytes or not.

> it's not clear to me how your proposal is really all that useful for
> very large blocks: I looks like it would lot of bytes sending
> redundant tree data.

Look at bittorrent. With bittorrent you don't download files from a single peer all at once.

> Bitcoin gets its value through scarcity. There are two kinds of
> scarcity that are economically important, scarcity of the coins— there
> will never be more than 21 million— and scarcity of the block space
> which, as the protocol is defined and enforced by every node can not
> be more than 1MB. The latter scarcity is what makes the security model
> economically sane.

Why wouldn't requesting minimum fees in the software work as is done currently?

> Fortunately, its perfectly possible to make transactions denominated
> in bitcoin outside of the blockchain, and in a secure and distributed
> manner that respects the principles that make bitcoin attractive, but
> with information hiding that improves privacy, transaction speed, and
> scalability. See, e.g. the good work being done by Open transactions
> to create distributed cryptographic banks.  So blockchain scarcity
> itself doesn't prevent Bitcoin from being a one world currency
> (something which isn't at all sane no matter how big you make the
> blocks if you don't allow for other modes of transaction processing—
> who the heck wants to possibly wait an hour to get a 1 confirm
> sodapop??).

So what you essentially suggest is having bitcoin banks that maintain trust through Open Transaction contracts which contains proof of agreement, providing some legal protection? One wonders why have bitcoin at all then? Why not have an elaborate e-money system between several banks using Open Transactions? Bitcoin doesn't just contain proof of if something was done right or not, it contains actual certainty that it will be done right. And how does Open Transactions prevent fractional reserve fraud?

I suppose when people consider bitcoin banks, they will consider bitcoin being useless.

>  but I know that changing it is precisely as
> technically difficult as changing the 21 million limit

Set the change to occur at some block in the future leaving time for people to upgrade. Send out alert messages to notify users to upgrade. Issue is, some people might not like the change for whatever reasons.

As far as I see it, if bitcoin won't scale, then it's worth looking at something different to bitcoin that will scale.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-11 21:48   ` Matthew Mitchell
@ 2012-09-11 23:22     ` Gregory Maxwell
  2012-09-13  8:42       ` Mike Hearn
  0 siblings, 1 reply; 18+ messages in thread
From: Gregory Maxwell @ 2012-09-11 23:22 UTC (permalink / raw)
  To: Matthew Mitchell; +Cc: bitcoin-development

On Tue, Sep 11, 2012 at 5:48 PM, Matthew Mitchell
<matthewmitchell@godofgod.co.uk> wrote:
> You wouldn't need to pipeline the requests, just place more than one inventory vector in get data, right? Well my messages would save the space of those inventory vectors. Instead of needing 36 byte inventory vectors for each transaction and a var int, you would need two var ints only. And then the transaction responses only need one header, so you save 24 bytes for each transaction after the first. You could say that is a small benefit.

But you only need to request the transactions you don't have. Most of
time you should already have almost all of the transactions.

> Look at bittorrent. With bittorrent you don't download files from a single peer all at once.

You can fetch transactions from multiple peers with just a simple
mechanism that gives you the headers plus the txn list. And if you
want ArgumentAdSomethingElse, thats what bittorrent does too: the
torrent file contains the list of block hashes, and you get it from
one place.

> Why wouldn't requesting minimum fees in the software work as is done currently?

Because there is no motivation not to set them to zero, if you don't
someone else will.  Right now the income from fees is hardly relevant,
and the ability to drive more non-existant because there isn't enough
load to create scarcity.

> So what you essentially suggest is having bitcoin banks that maintain trust through Open Transaction contracts which contains proof of agreement, providing some legal protection? One wonders why have bitcoin at all then? Why not have an elaborate e-money system between several banks using Open Transactions?

Because it can't resist inflation. You have to trust that the banks
won't conspire to their mutual benefit to inflate the base currency.
OT can make it so a 'bank' (which is really a distributed collection
of nodes, not a single point of trust) can trivially prove how much
"gold certificates" it has issued, but you also need to prove how much
'gold' exists and which keys hold it, and for that you need a _global_
consensus; which bitcoin provides...

If you don't like

>Bitcoin doesn't just contain proof of if something was done right or not, it contains actual certainty that it will be done right. And how does Open >Transactions prevent fractional reserve fraud?

Well, Bitcoin gives you no certainty that any particular transaction
will be confirmed at all, ever; so perhaps best not to overstate it
too much. But yes, Bitcoin is great. ... but all that greatness
depends on there being a way to fund enough computation so that
attacks are too costly to be justified and that the cost of
maintaining a system to fully validate the system's rules (e.g. that
the miners aren't mining duplicate txns to create inflation for
themselves) is low enough that it will naturally enormously
distributed so that a conspiracy is effectively impossible.  Otherwise
everything consolidates down to a few meganodes and the attractive
properties are all gone.

OT's issuers can prove how much bitcoin they hold on the blockchain
(by nothing more sophisticated than signmessage) and they can prove
how many tokens they've issued against it.

And I didn't mean to suggest OT as a unique solution. Another path is
ripple, the idea of which is a sort of a p2p hawala where you have
pairwise trust and debt. It can allow you to circulate around tokens
between a community of users and only settle infrequently (as
determined by your level of trust, the debt involved, and the cost of
the bitcoin transaction) against bitcoin.

> I suppose when people consider bitcoin banks, they will consider bitcoin being useless.

They already exist, in crappy centralized form— e.g. look at mtgox
codes and user to user instant transfers; and bitcoin isn't useless.
Plus some extra system of some kind is the _only_ way to securely
irreversible transactions which are reliably fast, so it's not like
there is any real prospect of using bitcoin directly for all kinds of
uses at scale. (yes, blocks are 'only' 10 minutes apart on average,
but if you care about fast, you care about e.g. the 99% not the
average)

> As far as I see it, if bitcoin won't scale, then it's worth looking at something different to bitcoin that will scale.

Bitcoin scales fine.  It is not a singular replacement for everything
you can imagine it being a replacement for, however, or at least not a
good replacement.  The fact that you could conceivably make it
directly scale up to handle e.g. the volume of all the credit network
doesn't make that a good idea. It would still be a very poor
replacement for a credit network (slow transactions; which can't be
fixed by tweaking some parameters, the bitcoin blockchain consensus
algorithm has infinite convergence time when the block time falls
below the hash-power-weighed latency), and that kind of scaling would
absolutely ruin the decentralization, making it so only large states
and megabanks could run full nodes, and even at that level it couldn't
match the worldwide volume of cash transactions or 'internal' money
transactions (like money moving around on all the poker tables in the
world).   It's like someone made the mistake of saying the floor wax
is edible (linseed oil) and now you complain that its a crappy desert
topping. :P

Maybe people will ultimately agree to raise the block sizes, but I
expect and hope that they'll only do so when it is entirely
uncontroversial that doing so won't significantly degrade the
decentralization (certainly not the case today: a large portion of the
network appears to have trouble keeping up with large blocks right
now, though upcoming software improvements will help enormously), or
the mining economics.   And yes, of course, you schedule the change
for the future, but as you note that it doesn't solve the problem of
people opposing it.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-11 23:22     ` Gregory Maxwell
@ 2012-09-13  8:42       ` Mike Hearn
  2012-09-13 14:05         ` Matthew Mitchell
  0 siblings, 1 reply; 18+ messages in thread
From: Mike Hearn @ 2012-09-13  8:42 UTC (permalink / raw)
  To: Gregory Maxwell; +Cc: bitcoin-development

For what it's worth I disagree with Gregory on nearly all these
points, so don't take it as some kind of consensus from the Bitcoin
community ;)

Matts change is reasonable but I think we all agree it has minimal
impact at the moment relative to other things, so something even more
complex than that seems like a non-starter. Bloom filtering is a lot
more important.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-13  8:42       ` Mike Hearn
@ 2012-09-13 14:05         ` Matthew Mitchell
  2012-09-13 15:16           ` Gregory Maxwell
  0 siblings, 1 reply; 18+ messages in thread
From: Matthew Mitchell @ 2012-09-13 14:05 UTC (permalink / raw)
  To: Mike Hearn, Gregory Maxwell, bitcoin-development

On 13 Sep 2012, at 09:42, Mike Hearn <mike@plan99.net> wrote:

> For what it's worth I disagree with Gregory on nearly all these
> points, so don't take it as some kind of consensus from the Bitcoin
> community ;)
> 
> Matts change is reasonable but I think we all agree it has minimal
> impact at the moment relative to other things, so something even more
> complex than that seems like a non-starter. Bloom filtering is a lot
> more important.

Sure other things may be done before this, I was seeing this as a change somewhere down the line but not urgent.

@Gregory

> But you only need to request the transactions you don't have. Most of
> time you should already have almost all of the transactions.

Yes, my proposal allows you to do this. You skip out transactions your already have. My proposal is simply better than others because it takes full advantage of the merkle tree structure with minor additions that are simple to implement. How hard is it to get the hashes at a particular level of a merkle tree? Not hard at all. How hard is it to place a selection of transactions from a block into a message Not hard at all. Implementation of the protocol requirements would be a piece of cake. The harder bit would be to create an algorithm to determine the best level of segmentation but this is not required to comply with the protocol.

> Because there is no motivation not to set them to zero, if you don't
> someone else will

The motivation to incentivise miners and maintain stronger security? The difficulty only has to be high enough to prevent a cartel of malicious miners taking control of the network, something that is potentially a problem today with the large mining pools. Remember that the more transactions there are the more fees there can be for miners to collect. The more people that are using bitcoin, the greater bitcoins will be worth. A bigger network should be good for miners when relying on fees.

> And yes, of course, you schedule the change
> for the future, but as you note that it doesn't solve the problem of
> people opposing it.

If it's so controversial that it creates a split making two separated currencies then I'd see it turning out like the format wars (VHS vs Betamax and Blu-ray vs HD-DVD). Eventually people will move towards one or the other since it's better for people to have universalised agreement on a system.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-13 14:05         ` Matthew Mitchell
@ 2012-09-13 15:16           ` Gregory Maxwell
       [not found]             ` <2B95CF41-4304-4D2A-9ABF-198D97B7449B@godofgod.co.uk>
  0 siblings, 1 reply; 18+ messages in thread
From: Gregory Maxwell @ 2012-09-13 15:16 UTC (permalink / raw)
  To: Matthew Mitchell; +Cc: bitcoin-development

On Thu, Sep 13, 2012 at 10:05 AM, Matthew Mitchell
<matthewmitchell@godofgod.co.uk> wrote:

> @Gregory
>
>> But you only need to request the transactions you don't have. Most of
>> time you should already have almost all of the transactions.
>
> Yes, my proposal allows you to do this. You skip out transactions your already have. My proposal is simply better than others because it takes full advantage of the merkle tree structure with minor additions that are simple to implement. How hard is it to get the hashes at a particular level of a merkle tree? Not hard at all. How hard is it to place a selection of transactions from a block into a message Not hard at all. Implementation of the protocol requirements would be a piece of cake. The harder bit would be to create an algorithm to determine the best level of segmentation but this is not required to comply with the protocol.

Sorry, I'm still not seeing what the value is.  How is the tree level
useful to anyone?  If you did want to get only parts of the
transaction list, why not just ranges from the lowest level?



^ permalink raw reply	[flat|nested] 18+ messages in thread

[parent not found: <2B95CF41-4304-4D2A-9ABF-198D97B7449B@godofgod.co.uk>]

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
       [not found]             ` <2B95CF41-4304-4D2A-9ABF-198D97B7449B@godofgod.co.uk>
@ 2012-09-13 15:46               ` Matthew Mitchell
       [not found]               ` <CAAS2fgQi8QFwU2M=wLiDodt3SmO48vUV5Sp3YCb1OmGJ5m=E7Q@mail.gmail.com>
  1 sibling, 0 replies; 18+ messages in thread
From: Matthew Mitchell @ 2012-09-13 15:46 UTC (permalink / raw)
  To: bitcoin-development

On 13 Sep 2012, at 16:16, Gregory Maxwell <gmaxwell@gmail.com> wrote:

> Sorry, I'm still not seeing what the value is.  How is the tree level
> useful to anyone?  If you did want to get only parts of the
> transaction list, why not just ranges from the lowest level?

Obtaining a particular tree level allows you to verify segments without needing to download all the transaction hashes first. You only need one hash per segment. For instance if you want to divide the block into 8 segments you specify level 3 and download 8 hashes. You could download all transaction hashes if you wanted and it would still work, it just requires more data transfer for the hashes. This was the reason why merkle trees were used in bitcoin, to avoid requiring all hashes to verify data.

^ permalink raw reply	[flat|nested] 18+ messages in thread

[parent not found: <CAAS2fgQi8QFwU2M=wLiDodt3SmO48vUV5Sp3YCb1OmGJ5m=E7Q@mail.gmail.com>]

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
       [not found]               ` <CAAS2fgQi8QFwU2M=wLiDodt3SmO48vUV5Sp3YCb1OmGJ5m=E7Q@mail.gmail.com>
@ 2012-09-13 17:49                 ` Matthew Mitchell
  2012-09-13 18:59                   ` Pieter Wuille
  0 siblings, 1 reply; 18+ messages in thread
From: Matthew Mitchell @ 2012-09-13 17:49 UTC (permalink / raw)
  To: Gregory Maxwell, bitcoin-development

[-- Attachment #1: Type: text/plain, Size: 3213 bytes --]

On 13 Sep 2012, at 16:51, Gregory Maxwell <gmaxwell@gmail.com> wrote:

> I thoroughly understand the value of tree hashes. That wasn't what I
> was asking about.
> 
> If you're validating a block you need all the transactions, once you
> have them or their hashes you can build the tree without transferring
> more, e.g. what a fully validating node needs. If you're checking a
> single transaction to need the path from the transaction to the root
> (what a SPV nodes need, for example).
> 
> Can you spell out the 'end user'ish application for fetching a tree level?

A merkle tree root is found by hashing the two children together and those children are found the same way until you get to the greatest level down the tree. This means you can validate children as being correct as long as they hash together to form the root. This means you do not need all the transaction hashes to validate segments of the block, you only need the root hashes for all the segments you want. Let's say there are 8 transactions and you want to verify 4 segments (2 transactions each), The merkle tree looks like this (Might not work depending on the font):

Level 0:               *
                      / \
                     /   \
                    /     \
                   /       \
                  /         \
                 /           \
                /             \
Level 1:       *               *
              / \             / \
             /   \           /   \
            /     \         /     \
Level 2:   *       *       *       *
          / \     / \     / \     / \
Level 3: *   *   *   *   *   *   *   *

When you look at any non-leaf node you can see a separate merkle tree where the root can be found exactly the same as any other merkle tree. In this example you want four segments, so you ask for level 2. Now imagine a tree without level 3, you can validate the root with level 2. In fact you can validate that the root exists for any level. So you first check that the level 2 hashes do indeed calculate to the root. Once this is done you can now use these hashes to validate the segments. When you receive a segment, you check that segment against the segment's root. So you've validated the segment transactions for the segment root and the segment root was validated for the entire tree's root. You validate the segments for each segment root and this way you know all the transactions are valid for the four segments and thus are valid for the entire tree. This way you have downloaded 4 hashes instead of 8. 

Downloading the transactions hashes are therefore not necessary only the level for the segment roots. You might for instance want to divide the block into two segments in which case you ask for level 1 and download 2 hashes.

I hope that made sense.

And yes the merkle tree is particularly useful for validating a single transaction exists in a block as that saves a large proportion of the data required. The redundant data removed in the proposal here is smaller as a proportion of the total data (Because most of the data is the actual transactions themselves), so you might argue it's not worth it but it's simple to implement.

[-- Attachment #2: Type: text/html, Size: 5996 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-13 17:49                 ` Matthew Mitchell
@ 2012-09-13 18:59                   ` Pieter Wuille
  2012-09-13 20:24                     ` Matthew Mitchell
  0 siblings, 1 reply; 18+ messages in thread
From: Pieter Wuille @ 2012-09-13 18:59 UTC (permalink / raw)
  To: Matthew Mitchell; +Cc: bitcoin-development

On Thu, Sep 13, 2012 at 06:49:49PM +0100, Matthew Mitchell wrote:
> A merkle tree root is found by hashing the two children together and those children are found the same way until you get to the greatest level down the tree. This means you can validate children as being correct as long as they hash together to form the root. This means you do not need all the transaction hashes to validate segments of the block, you only need the root hashes for all the segments you want. Let's say there are 8 transactions and you want to verify 4 segments (2 transactions each), The merkle tree looks like this (Might not work depending on the font):

> I hope that made sense.

I'm quite sure Gregory thoroughly understands how Merkle trees work and why they are useful.

His question was about the use case. Let me try to answer his question, by making some assumptions about your intentions. Correct me if I'm wrong - I haven't read all details.

You want to parallellize block downloads, while at the same time preventing re-download of transactions that are already known.
To do so, a requesting node would first request (for example) the 8 level-3 hashes, then start 8 parallel threads to download the
transactions from presumably 8 different peers. Each thread then fetches the transaction id's that correspond to its own 1/8th of
the block, and requests the transactions whose txid is not yet known.
Comparing this with Gregory's own suggestion (just fetch the entire txid list at first, and then (again as parallellized as needed)
fetch the unknown transactions from several peers), this does indeed have an advantage: you need to download (relatively) far less
data before the threaded part can start. If this is what you propose, it is certainly meaningful, but the gains aren't very large,
in my opinion.

However, my impression while reading your proposal was at times that you intend to process the different layers of the
Merkle tree iteratively after starting the initial parallel segments. I don't think that is useful, as you'll need the actual
txids anyway before deciding whether they need to be downloaded at all, it adds several round-trips, and once you have downloaded
the intermediate merkle hashes for about 8 levels, you've already transferred more data than the transactions themselves. I think
Gregory also assumed something like this, making him question why it's useful. Anyway, this whole paragraph is one assumption, so
if it's not the case, ignore me.

Can you clarify what you mean? Preferably by giving a concrete sequence of exchanged messages, with a given purpose?

PS: the reason Satoshi used a Merkle tree for the transactions in a block was to allow a short piece of data (the hashes along a
path in the tree) to prove a transaction belongs to it - I doubt he intended it for parallellizing downloads (though it certainly
opens some opportunities here).

-- 
Pieter

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-13 18:59                   ` Pieter Wuille
@ 2012-09-13 20:24                     ` Matthew Mitchell
  0 siblings, 0 replies; 18+ messages in thread
From: Matthew Mitchell @ 2012-09-13 20:24 UTC (permalink / raw)
  To: Pieter Wuille, bitcoin-development

On 13 Sep 2012, at 19:59, Pieter Wuille <pieter.wuille@gmail.com> wrote:

> You want to parallellize block downloads, while at the same time preventing re-download of transactions that are already known.
> To do so, a requesting node would first request (for example) the 8 level-3 hashes, then start 8 parallel threads to download the
> transactions from presumably 8 different peers. Each thread then fetches the transaction id's that correspond to its own 1/8th of
> the block, and requests the transactions whose txid is not yet known.
> Comparing this with Gregory's own suggestion (just fetch the entire txid list at first, and then (again as parallellized as needed)
> fetch the unknown transactions from several peers), this does indeed have an advantage: you need to download (relatively) far less
> data before the threaded part can start. If this is what you propose, it is certainly meaningful, but the gains aren't very large,
> in my opinion.

This is not fully correct. I propose downloading the roots of the segments when you are not looking to remove redundant transaction downloads. This would be the case when the node is not up-to-date and is unlikely to have transactions in the required blocks. When a node is up-to-date and can benefit from removing redundant transaction downloads it can switch to downloading all the transactions hashes by specifying a high level number. Also I wouldn't recommend using one thread per connection, I'd recommend using one thread for all connections.

> However, my impression while reading your proposal was at times that you intend to process the different layers of the
> Merkle tree iteratively after starting the initial parallel segments. I don't think that is useful, as you'll need the actual
> txids anyway before deciding whether they need to be downloaded at all, it adds several round-trips, and once you have downloaded
> the intermediate merkle hashes for about 8 levels, you've already transferred more data than the transactions themselves. I think
> Gregory also assumed something like this, making him question why it's useful. Anyway, this whole paragraph is one assumption, so
> if it's not the case, ignore me.

This isn't what I was suggesting. I was suggesting you only need to download one level. Once you have done that you verify everything for the hashes on that level.

> 
> Can you clarify what you mean? Preferably by giving a concrete sequence of exchanged messages, with a given purpose?

Well you will need to ask for the headers first. Once you do that you can start downloading the full blocks for the headers.  The node should use "get blocks" to find nodes with segments of the blocks they need. Now for each block:

1. Send getsiginv to a number of peers to know the segments of the blocks they have. 
2. Send gettreelevel requesting a level of the merkle tree from a peer that can provide it. When up-to-date use a high level to get the transaction hashes to find redundant data.
3. Validate the treelevel response
4. Send getsegment for each segment wanted (at the same time where possible) to the peers that have these segments. Skip transactions already known.
5. Validate the transactions in each segment received. Stop if the block is invalid and disconnect peers that give transactions which do not fit the merkle tree.
6. Revert to getdata if peers using the protocol cannot satisfy the block download.

When a valid block segment is received, include the block in inv and headers messages for other peers using the protocol. Thus relaying can begin before the entire block is downloaded.

I'm thinking about improvements to this proposal. I'll get to that tomorrow perhaps…

Thank you everyone for the replies.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Bitcoin-development] Segmented Block Relaying BIP draft.
@ 2012-09-10 15:07 Matthew Mitchell
  2012-09-10 15:14 ` Gregory Maxwell
  2012-09-10 18:59 ` Luke-Jr
  0 siblings, 2 replies; 18+ messages in thread
From: Matthew Mitchell @ 2012-09-10 15:07 UTC (permalink / raw)
  To: bitcoin-development

[-- Attachment #1: Type: text/plain, Size: 297 bytes --]

Here is a BIP draft for improving the block relaying and validation so that it can be done in parallel and so that redundancy can be removed. This becomes more beneficial the larger the block sizes are.

https://en.bitcoin.it/wiki/User:MatthewLM/ImprovedBlockRelayingProposal

Matthew Mitchell

[-- Attachment #2: Type: text/html, Size: 550 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-10 15:07 Matthew Mitchell
@ 2012-09-10 15:14 ` Gregory Maxwell
  2012-09-10 16:29   ` Matt Corallo
  2012-09-10 18:59 ` Luke-Jr
  1 sibling, 1 reply; 18+ messages in thread
From: Gregory Maxwell @ 2012-09-10 15:14 UTC (permalink / raw)
  To: Matthew Mitchell; +Cc: bitcoin-development

On Mon, Sep 10, 2012 at 11:07 AM, Matthew Mitchell
<matthewmitchell@godofgod.co.uk> wrote:
> Here is a BIP draft for improving the block relaying and validation so that
> it can be done in parallel and so that redundancy can be removed. This
> becomes more beneficial the larger the block sizes are.
>
> https://en.bitcoin.it/wiki/User:MatthewLM/ImprovedBlockRelayingProposal

Why does this focus on actually sending the hash tree?  The block
header + transaction list + transactions a node doesn't already know
(often just the coinbase) is enough.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-10 15:14 ` Gregory Maxwell
@ 2012-09-10 16:29   ` Matt Corallo
  0 siblings, 0 replies; 18+ messages in thread
From: Matt Corallo @ 2012-09-10 16:29 UTC (permalink / raw)
  To: bitcoin-development

I actually implemented parts of the header+ v<tx> stuff in a branch with
my bloom filter stuff, you can see it here:
https://github.com/TheBlueMatt/bitcoin/commits/bloom%2Brelayblock
Its pretty stupid and would be pretty easy to DoS/get it stuck/etc, but
in theory it works.  I don't see much reason why we'd need anything
significantly more complicated, but maybe there is a use-case I'm
missing?

Matt

On Mon, 2012-09-10 at 11:14 -0400, Gregory Maxwell wrote:
> On Mon, Sep 10, 2012 at 11:07 AM, Matthew Mitchell
> <matthewmitchell@godofgod.co.uk> wrote:
> > Here is a BIP draft for improving the block relaying and validation so that
> > it can be done in parallel and so that redundancy can be removed. This
> > becomes more beneficial the larger the block sizes are.
> >
> > https://en.bitcoin.it/wiki/User:MatthewLM/ImprovedBlockRelayingProposal
> 
> Why does this focus on actually sending the hash tree?  The block
> header + transaction list + transactions a node doesn't already know
> (often just the coinbase) is enough.




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-10 15:07 Matthew Mitchell
  2012-09-10 15:14 ` Gregory Maxwell
@ 2012-09-10 18:59 ` Luke-Jr
  2012-09-10 19:34   ` Matthew Mitchell
  1 sibling, 1 reply; 18+ messages in thread
From: Luke-Jr @ 2012-09-10 18:59 UTC (permalink / raw)
  To: bitcoin-development; +Cc: Matthew Mitchell

On Monday, September 10, 2012 3:07:52 PM Matthew Mitchell wrote:
> Here is a BIP draft for improving the block relaying and validation so that
> it can be done in parallel and so that redundancy can be removed. This
> becomes more beneficial the larger the block sizes are.
> 
> https://en.bitcoin.it/wiki/User:MatthewLM/ImprovedBlockRelayingProposal

Most of the problem with block propagation lies in implementation, not 
protocol... Distributing missing transaction on an as-needed basis is a 
possible improvement at the protocol level, but there hasn't (AFAIK) been any 
research into whether the little benefit outweighs the cost yet. In any case, 
I don't see why 6 new messages are needed instead of simply adding a single 
new type to getinv?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-10 18:59 ` Luke-Jr
@ 2012-09-10 19:34   ` Matthew Mitchell
  2012-09-10 19:53     ` Matt Corallo
  0 siblings, 1 reply; 18+ messages in thread
From: Matthew Mitchell @ 2012-09-10 19:34 UTC (permalink / raw)
  To: Luke-Jr, bitcoin-development

[-- Attachment #1: Type: text/plain, Size: 2400 bytes --]

Do you mean getdata? Here is the reason for the 6 new messages:

getseginv,seginv - These are for learning about what segments of a block a node has. Else you could remove these messages and simply have nodes advertise blocks via inventory messages. In this case nodes would have to wait until they had fully received a block before relaying anything. No longer is there a benefit with nodes being able to relay segments of blocks before they have received the entire block.

gettreelevel,treelevel - These are to received a level of the merle tree. Instead you might use get data but gettreelevel is more compact than get data and is clearly differentiates itself as part of the new protocol. Perhaps these messages could include the block headers alongside the hashes and you could request many at once like with the getheaders message? If you skip these messages, then you could verify the transactions at the end but there would be problems when peers give bad segments where data would need to be downloaded again.

getsegment,segment - These are clearly important to request and receive segments for the blocks. These allows for nodes to download arbitrary segments of blocks. The optimum number of segments could be calculated by node software using measurements of download speeds and latency times, the number of connections and how likely redundancy is to occur. If a node is up-to-date and likely has many of the transactions in blocks, it can start asking for the deepest merle level (tx hashes) and ask nodes for segments, avoiding transactions it already has.

I'll get around to doing measurements myself sometime to estimate the benefit of this proposal. It will certainly be beneficial when block sizes reach some size but not much is really known except what can be assumed/guessed.

I should also mention the bitcointalk topic here: https://bitcointalk.org/index.php?topic=103295.0

On 10 Sep 2012, at 19:59, "Luke-Jr" <luke@dashjr.org> wrote:
> 
> Most of the problem with block propagation lies in implementation, not 
> protocol... Distributing missing transaction on an as-needed basis is a 
> possible improvement at the protocol level, but there hasn't (AFAIK) been any 
> research into whether the little benefit outweighs the cost yet. In any case, 
> I don't see why 6 new messages are needed instead of simply adding a single 
> new type to getinv?

[-- Attachment #2: Type: text/html, Size: 5522 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-10 19:34   ` Matthew Mitchell
@ 2012-09-10 19:53     ` Matt Corallo
  2012-09-10 20:00       ` Gregory Maxwell
  0 siblings, 1 reply; 18+ messages in thread
From: Matt Corallo @ 2012-09-10 19:53 UTC (permalink / raw)
  To: bitcoin-development

It seems to me the whole idea of segmenting blocks would add very little
(to nothing) with any sane block size.  Sure, if a block were to be
10GB, it may make sense.  However, even in that case, it would be easier
to relay a list of tx hashes (which may be a bit expensive) and txes
separately instead of using a notion of block segments.  That said, I
don't see blocks ever being that large and if they do become that large,
as only a few full nodes will remain, upgrading their protocol would be
(relatively) easy.  I would instead encourage focus on decreasing block
relay times for the current network and as blocks approach 10MB (so that
they can approach 10MB).

Matt

On Mon, 2012-09-10 at 20:34 +0100, Matthew Mitchell wrote:
> Do you mean getdata? Here is the reason for the 6 new messages:
> 
> 
> getseginv,seginv - These are for learning about what segments of a
> block a node has. Else you could remove these messages and simply have
> nodes advertise blocks via inventory messages. In this case nodes
> would have to wait until they had fully received a block before
> relaying anything. No longer is there a benefit with nodes being able
> to relay segments of blocks before they have received the entire
> block.
> 
> 
> gettreelevel,treelevel - These are to received a level of
> the merle tree. Instead you might use get data but gettreelevel is
> more compact than get data and is clearly differentiates itself as
> part of the new protocol. Perhaps these messages could include the
> block headers alongside the hashes and you could request many at once
> like with the getheaders message? If you skip these messages, then you
> could verify the transactions at the end but there would be problems
> when peers give bad segments where data would need to be downloaded
> again.
> 
> 
> getsegment,segment - These are clearly important to request and
> receive segments for the blocks. These allows for nodes
> to download arbitrary segments of blocks. The optimum number of
> segments could be calculated by node software using measurements of
> download speeds and latency times, the number of connections and how
> likely redundancy is to occur. If a node is up-to-date and likely has
> many of the transactions in blocks, it can start asking for the
> deepest merle level (tx hashes) and ask nodes for segments, avoiding
> transactions it already has.
> 
> 
> I'll get around to doing measurements myself sometime to estimate the
> benefit of this proposal. It will certainly be beneficial when block
> sizes reach some size but not much is really known except what can be
> assumed/guessed.
> 
> 
> I should also mention the bitcointalk topic
> here: https://bitcointalk.org/index.php?topic=103295.0
> 
> On 10 Sep 2012, at 19:59, "Luke-Jr" <luke@dashjr.org> wrote:
> > 
> > Most of the problem with block propagation lies in implementation,
> > not 
> > protocol... Distributing missing transaction on an as-needed basis
> > is a 
> > possible improvement at the protocol level, but there hasn't (AFAIK)
> > been any 
> > research into whether the little benefit outweighs the cost yet. In
> > any case, 
> > I don't see why 6 new messages are needed instead of simply adding a
> > single 
> > new type to getinv?





^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bitcoin-development] Segmented Block Relaying BIP draft.
  2012-09-10 19:53     ` Matt Corallo
@ 2012-09-10 20:00       ` Gregory Maxwell
  0 siblings, 0 replies; 18+ messages in thread
From: Gregory Maxwell @ 2012-09-10 20:00 UTC (permalink / raw)
  To: Matt Corallo; +Cc: bitcoin-development

On Mon, Sep 10, 2012 at 3:53 PM, Matt Corallo <bitcoin-list@bluematt.me> wrote:
> It seems to me the whole idea of segmenting blocks would add very little
> (to nothing) with any sane block size.  Sure, if a block were to be
> 10GB, it may make sense.  However, even in that case, it would be easier

As you know there is a hard protocol limit of 1MB.

If you're going to talk about doing that you are screwing with the
core economic promises of the system. (in particular, removing the cap
eliminates the only armwave we have for long term security).  But in
any case, removing it requires a complete and totally incompatible
hardfork, and at that point you can do whatever you want with the
protocol. Changing how blocks are fetched is almost incidental to the
number of other things that would be changed.  I don't think it makes
sense to design for that especially when something far simpler (as you
pointed out) is prudent for the design of bitcoin.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2012-09-13 20:24 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-09-11 19:07 [Bitcoin-development] Segmented Block Relaying BIP draft Matthew Mitchell
2012-09-11 19:42 ` Gregory Maxwell
2012-09-11 21:48   ` Matthew Mitchell
2012-09-11 23:22     ` Gregory Maxwell
2012-09-13  8:42       ` Mike Hearn
2012-09-13 14:05         ` Matthew Mitchell
2012-09-13 15:16           ` Gregory Maxwell
     [not found]             ` <2B95CF41-4304-4D2A-9ABF-198D97B7449B@godofgod.co.uk>
2012-09-13 15:46               ` Matthew Mitchell
     [not found]               ` <CAAS2fgQi8QFwU2M=wLiDodt3SmO48vUV5Sp3YCb1OmGJ5m=E7Q@mail.gmail.com>
2012-09-13 17:49                 ` Matthew Mitchell
2012-09-13 18:59                   ` Pieter Wuille
2012-09-13 20:24                     ` Matthew Mitchell
  -- strict thread matches above, loose matches on Subject: below --
2012-09-10 15:07 Matthew Mitchell
2012-09-10 15:14 ` Gregory Maxwell
2012-09-10 16:29   ` Matt Corallo
2012-09-10 18:59 ` Luke-Jr
2012-09-10 19:34   ` Matthew Mitchell
2012-09-10 19:53     ` Matt Corallo
2012-09-10 20:00       ` Gregory Maxwell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox