* [Bitcoin-development] Chain pruning
@ 2014-04-10 11:37 Mike Hearn
2014-04-10 11:57 ` Wladimir
0 siblings, 1 reply; 21+ messages in thread
From: Mike Hearn @ 2014-04-10 11:37 UTC (permalink / raw)
To: Pieter Wuille; +Cc: Bitcoin Dev
Chain pruning is probably a separate thread, changing subject.
> Reason is that the actual blocks available are likely to change frequently
> (if you keep the last week of blocks
I doubt anyone would specify blocks to keep in terms of time. More likely
it'd be in terms of megabytes, as that's the actual resource constraint on
nodes. Given a block size average it's easy to go from megabytes to
num_blocks, so I had imagined it'd be a new addr field that specifies how
many blocks from the chain head are stored. Then you'd connect to some
nodes and if they indicate their chain head - num_blocks_stored is higher
than your current chain height, you'd do a getaddr and go looking for nodes
that are storing far enough back.
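The rule Mike sketches can be written out concretely. This is an illustrative sketch, not actual protocol code; a `num_blocks_stored` addr field is his proposal, and the helper names are made up:

```python
# Hypothetical sketch of the peer-selection rule described above: a peer
# advertises (chain_head, num_blocks_stored), and it can serve us only if
# the oldest block it keeps is at or below the next block we need.
# Field and function names are illustrative, not real protocol elements.

def oldest_block_kept(chain_head: int, num_blocks_stored: int) -> int:
    """Height of the oldest block a pruned peer still stores."""
    return chain_head - num_blocks_stored + 1

def can_serve(peer_head: int, peer_stored: int, our_height: int) -> bool:
    """True if the peer still has block our_height + 1; if False for all
    current peers, we'd getaddr and look for nodes storing further back."""
    return oldest_block_kept(peer_head, peer_stored) <= our_height + 1
```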
* Re: [Bitcoin-development] Chain pruning
From: Wladimir @ 2014-04-10 11:57 UTC (permalink / raw)
To: Mike Hearn; +Cc: Bitcoin Dev

On Thu, Apr 10, 2014 at 1:37 PM, Mike Hearn <mike@plan99.net> wrote:
> Chain pruning is probably a separate thread, changing subject.
>
>> Reason is that the actual blocks available are likely to change
>> frequently (if you keep the last week of blocks
>
> I doubt anyone would specify blocks to keep in terms of time. More likely
> it'd be in terms of megabytes, as that's the actual resource constraint on
> nodes.

Well, with Bitcoin, (average) time, number of blocks and (maximum) size are
all related to each other, so it doesn't matter which one is specified;
it's always possible to give estimates of all three. As for implementation,
it indeed makes most sense to work with block ranges.

> Given a block size average it's easy to go from megabytes to num_blocks,
> so I had imagined it'd be a new addr field that specifies how many blocks
> from the chain head are stored. Then you'd connect to some nodes and if
> they indicate their chain head - num_blocks_stored is higher than your
> current chain height, you'd do a getaddr and go looking for nodes that are
> storing far enough back.

This assumes that nodes will always be storing the latest blocks. For
dynamic nodes that take part in the consensus this makes sense.

Just wondering: would there be a use for a [static] node that, say, always
serves only the first 100000 blocks? Or, even, a static range like block
100000 - 200000?

Wladimir
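Wladimir's point that time, block count and size are interconvertible is easy to illustrate. The 10-minute spacing is Bitcoin's target; the 500 kB average block size below is purely an assumed figure for illustration:

```python
# Back-of-envelope conversions between megabytes, block count, and time.
# AVG_BLOCK_SIZE_BYTES is an assumed illustrative value, not a measurement;
# with any such average, fixing one quantity yields estimates of the others.

AVG_BLOCK_INTERVAL_MIN = 10        # Bitcoin's target block spacing
AVG_BLOCK_SIZE_BYTES = 500_000     # assumption for this sketch

def blocks_for_megabytes(mb: float) -> int:
    """How many recent blocks fit in a given storage budget."""
    return int(mb * 1_000_000 // AVG_BLOCK_SIZE_BYTES)

def days_for_blocks(n_blocks: int) -> float:
    """Average wall-clock span covered by a number of blocks."""
    return n_blocks * AVG_BLOCK_INTERVAL_MIN / (60 * 24)
```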
* Re: [Bitcoin-development] Chain pruning
From: Gregory Maxwell @ 2014-04-10 12:10 UTC (permalink / raw)
To: Wladimir; +Cc: Bitcoin Dev

On Thu, Apr 10, 2014 at 4:57 AM, Wladimir <laanwj@gmail.com> wrote:
> Just wondering: Would there be a use for a [static] node that, say, always
> serves only the first 100000 blocks? Or, even, a static range like block
> 100000 - 200000?

The last time we discussed this, sipa collected data on how often blocks
were fetched as a function of their depth, and there was a huge increase
for recent blocks that didn't really level out until 2000 blocks or so;
presumably it's not uncommon for nodes to be offline for a week or two at a
time.

But sure, I could see a fixed range as also being a useful contribution,
though I'm struggling to figure out what set of constraints would leave a
node without following the consensus. Obviously it has bandwidth, if you're
expecting it to contribute much in serving those historic blocks... and
verifying is reasonably CPU-cheap with fast ECDSA code. Maybe it has a lot
of read-only storage?

I think it should be possible to express and use such a thing in the
protocol, even if I'm currently unsure why you wouldn't do 100000 - 200000
_plus_ the most recent 144 blocks that you were already keeping around for
reorgs.

In terms of peer selection, if the blocks you need aren't covered by the
nodes you're currently connected to, I think you'd prefer to seek out nodes
which have the least rareness in the ranges they offer. E.g. if you're
looking for a block 50 from the tip, you should probably not fetch it from
someone with blocks 100000-150000 if it's one of only 100 nodes that has
that range.
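Greg's rarity-aware preference could look something like this toy sketch. The data structures are hypothetical; a real implementation would have to estimate range rarity from observed addr data rather than receive it as a dictionary:

```python
# Illustrative sketch of rarity-aware block fetching: among connected peers
# that have the block we want, prefer the one whose advertised range is
# most common on the network, sparing scarce archival peers for the blocks
# only they can serve. All structures here are hypothetical.

def pick_peer(peers, height, range_count):
    """peers: list of (peer_id, (start, end)) advertised block ranges.
    range_count: estimated number of network nodes offering each range.
    Returns the peer_id to fetch `height` from."""
    candidates = [(pid, rng) for pid, rng in peers
                  if rng[0] <= height <= rng[1]]
    # Most-common range first, so rare-range peers are left alone.
    return max(candidates, key=lambda p: range_count[p[1]])[0]
```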
* Re: [Bitcoin-development] Chain pruning
From: Wladimir @ 2014-04-10 14:19 UTC (permalink / raw)
To: Gregory Maxwell; +Cc: Bitcoin Dev

On Thu, Apr 10, 2014 at 2:10 PM, Gregory Maxwell <gmaxwell@gmail.com> wrote:
> But sure I could see a fixed range as also being a useful contribution
> though I'm struggling to figure out what set of constraints would
> leave a node without following the consensus? Obviously it has
> bandwidth if you're expecting to contribute much in serving those
> historic blocks... and verifying is reasonably cpu cheap with fast
> ecdsa code. Maybe it has a lot of read only storage?

The use case is that you could burn the node implementation + block data +
a live operating system onto a read-only medium. This could be set in stone
for a long time.

There would be no consensus code to keep up to date with protocol
developments, because such a node doesn't take active part in consensus.

I don't think it would be terribly useful right now, but it could be useful
when nodes that host all history become rare. It'd allow distributing
'pieces of history' in a self-contained form.

> I think it should be possible to express and use such a thing in the
> protocol even if I'm currently unsure as to why you wouldn't do 100000
> - 200000 _plus_ the most recent 144 that you were already keeping
> around for reorgs.

Yes, it would be nice to at least be able to express it, if it doesn't make
the protocol too finicky.

> In terms of peer selection, if the blocks you need aren't covered by
> the nodes you're currently connected to I think you'd prefer to seek
> out nodes which have the least rareness in the ranges they offer.
> E.g. if you're looking for a block 50 from the tip, you should
> probably not fetch it from someone with blocks 100000-150000
> if it's one of only 100 nodes that has that range.

That makes sense. In general, if you want a block 50 from the tip, it would
be best to request it from a node that only serves the last N (N >~ 50)
blocks, and not from a history node that could use the same bandwidth to
serve earlier, rarer blocks to others.

Wladimir
* Re: [Bitcoin-development] Chain pruning
From: Brian Hoffman @ 2014-04-10 16:23 UTC (permalink / raw)
To: Wladimir; +Cc: Bitcoin Dev

This is probably just noise, but what if nodes could compress and store
earlier transaction sets (archive sets) and serve them up conditionally?
So if there were, let's say, 100 archive sets (of 10,000 blocks each), you
might have 5 open at any time when you're an active archive node, while the
others sit on your disk compressed and unavailable to the network. This
would allow nodes to have all full transactions but conserve disk space and
network activity, since they wouldn't ever respond about every possible
transaction.

This could be based on a rotational request period, based on request count,
or done periodically. Once they're considered active, nodes would be
expected to uncompress a set and make it available to the network. Clients
would have to piece together archive sets from different nodes, but if
there weren't enough archive nodes to cover the chain, the network could
ratchet up the number of required open archive sets per active node.

I fully expect to have my idea trashed, but I'm dipping toes in the waters
of contribution.
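Brian's archive-set scheme can be modeled in a few lines. The 10,000-block set size comes from his description; everything else (data shapes, function names) is a hypothetical illustration of the coverage check a client or the network would need:

```python
# Toy model of the "archive set" idea above: the chain is cut into fixed
# 10,000-block sets, each active node keeps only a few sets open, and a
# client pieces full history together across nodes. Names are hypothetical.

SET_SIZE = 10_000

def set_index(height: int) -> int:
    """Which archive set a block height belongs to."""
    return height // SET_SIZE

def missing_sets(chain_height: int, open_sets_by_node: dict) -> set:
    """Archive sets that no currently-active node is serving; if nonempty,
    the network would need to ratchet up required open sets per node."""
    needed = set(range(set_index(chain_height) + 1))
    offered = set()
    for sets in open_sets_by_node.values():
        offered |= sets
    return needed - offered
```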
* Re: [Bitcoin-development] Chain pruning
From: Mike Hearn @ 2014-04-10 16:28 UTC (permalink / raw)
To: Brian Hoffman; +Cc: Bitcoin Dev

Suggestions always welcome!

The main problem with this is that the block chain is mostly random bytes
(hashes, keys), so it doesn't compress that well. It compresses a bit, but
not enough to change the fundamental physics.

However, that does not mean the entire chain has to be stored on expensive
rotating platters. I've suggested that in some Star Trek future where the
chain really is gigantic, it could be stored on tape and spooled off at
high speed - literally a direct DMA from tape drive to NIC. But we're not
there yet :)
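Mike's compression point is easy to check empirically: high-entropy bytes like hashes and signatures are essentially incompressible, while structured data shrinks dramatically. A small demonstration that assumes nothing about the actual chain format:

```python
# Compare zlib on high-entropy bytes (a stand-in for hashes/keys/signatures)
# versus highly structured bytes. Random data does not compress at all;
# repetitive data collapses to a tiny fraction of its size.
import os
import zlib

random_blob = os.urandom(100_000)           # stand-in for hash/key data
text_blob = b"unspent output! " * 6_250     # 100,000 bytes of structure

random_ratio = len(zlib.compress(random_blob, 9)) / len(random_blob)
text_ratio = len(zlib.compress(text_blob, 9)) / len(text_blob)
```

On typical runs `random_ratio` comes out at or slightly above 1.0 (zlib adds framing overhead), while `text_ratio` is well under 10%.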
* Re: [Bitcoin-development] Chain pruning
From: Brian Hoffman @ 2014-04-10 16:47 UTC (permalink / raw)
To: Mike Hearn; +Cc: Bitcoin Dev

Looks like only about ~30% disk space savings, so I see your point. Is
there a critical reason why blocks couldn't be formed into "superblocks"
that are chained together, with nodes serving specific superblocks that
could be pieced together from different nodes to get the full blockchain?
This would allow participants with limited resources to serve full portions
of the blockchain rather than limited pieces of the entire blockchain.
* Re: [Bitcoin-development] Chain pruning
From: Ricardo Filipe @ 2014-04-10 16:54 UTC (permalink / raw)
To: Brian Hoffman; +Cc: Bitcoin Dev

That's what blockchain pruning is all about :)
* Re: [Bitcoin-development] Chain pruning
From: Brian Hoffman @ 2014-04-10 16:56 UTC (permalink / raw)
To: Ricardo Filipe; +Cc: Bitcoin Dev

Okay... will let myself out now ;P
* Re: [Bitcoin-development] Chain pruning
From: Pieter Wuille @ 2014-04-10 16:59 UTC (permalink / raw)
To: Brian Hoffman; +Cc: Bitcoin Dev

On Thu, Apr 10, 2014 at 6:47 PM, Brian Hoffman <brianchoffman@gmail.com> wrote:
> Looks like only about ~30% disk space savings so I see your point. Is
> there a critical reason why blocks couldn't be formed into "superblocks"
> that are chained together and nodes could serve a specific superblock,
> which could be pieced together from different nodes to get the full
> blockchain?

As this is a suggestion that I think I've seen come up once a month for the
past 3 years, let's try to answer it thoroughly.

The actual "state" of the blockchain is the UTXO set (stored in chainstate/
by the reference client). It's the set of all unspent transaction outputs
at the currently active point in the block chain. It is all you need for
validating future blocks.

The problem is, you can't just give someone the UTXO set and expect them to
trust it, as there is no way to prove that it was the result of processing
the actual blocks.

As Bitcoin's full node uses a "zero trust" model, where (apart from one
detail: the order of otherwise-valid transactions) it never assumes any
data received from the outside is valid, it HAS to see the previous blocks
in order to establish the validity of the current UTXO set. This is what
initial block syncing does. Nothing but the actual blocks can provide this
data, and it is why the actual blocks need to be available. It does not
require everyone to have all blocks, though - they just need to have seen
them during processing.

A related, but not identical, evolution is merkle UTXO commitments. This
means that we shape the UTXO set as a merkle tree, compute its root after
every block, and require that the block commits to this root hash (by
putting it in the coinbase, for example). This means a full node can copy
the chain state from someone else, and check that its hash matches what the
block chain commits to. It's important to note that this is a strict
reduction in security: we're now trusting that the longest chain (with most
proof of work) commits to a valid UTXO set (at some point in the past).

In essence, combining both ideas means you get "superblocks" (the UTXO set
is essentially the summary of the result of all past blocks), in a way that
is less-than-currently-but-perhaps-still-acceptably validated.

-- 
Pieter
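The merkle UTXO commitment Pieter describes can be sketched minimally. The double-SHA256 and odd-leaf duplication follow Bitcoin's transaction merkle tree; the UTXO serialization below is a toy assumption, since real proposals pin down the exact encoding and tree shape:

```python
# Minimal sketch of a merkle UTXO commitment: serialize each unspent
# output, hash the sorted set into a merkle tree, and the root is what a
# block would commit to (e.g. in its coinbase). The serialization here is
# an illustrative assumption, not any concrete proposal's format.
import hashlib

def h(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(leaves):
    if not leaves:
        return h(b"")
    level = [h(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last, as Bitcoin's tree does
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

def utxo_commitment(utxos):
    """utxos: iterable of (txid_hex, vout, value_satoshi); sorting ensures
    every node derives the same root for the same set."""
    leaves = [f"{t}:{v}:{val}".encode() for t, v, val in sorted(utxos)]
    return merkle_root(leaves)
```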
* Re: [Bitcoin-development] Chain pruning
From: Brian Hoffman @ 2014-04-10 17:06 UTC (permalink / raw)
To: Pieter Wuille; +Cc: Bitcoin Dev

OK, I think I've got a good understanding of where we're at now. I can
promise that the next person to waste your time in 30 days will not be me.

I'm pleasantly surprised to see a community that doesn't kickban newcomers
and takes the time to explain (re-explain) concepts. Hoping to add
*beneficial* thoughts in the future!
* Re: [Bitcoin-development] Chain pruning
From: Paul Rabahy @ 2014-04-10 18:19 UTC (permalink / raw)
To: Pieter Wuille; +Cc: Bitcoin Dev

You say UTXO commitments are "a strict reduction in security". If UTXO
commitments were rolled in as a soft fork, I do not see any new attacks
that could happen to a person trusting the committed UTXO set + any
remaining blocks to catch up to the head.

I would imagine the soft fork to proceed similar to the following:
1. Miners begin including UTXO commitments.
2. Miners begin rejecting blocks with invalid UTXO commitments.
3. Miners begin rejecting blocks with no UTXO commitments.

To start up, a fresh client would follow these steps:
1. Sync headers.
2. Pick a committed UTXO set that is deep enough to not get orphaned.
3. Sync blocks from the commitment to the head.

I would argue that a client following this methodology is strictly more
secure than SPV, and I don't see any attacks that make it less secure than
a full client. It is obviously still susceptible to a 51% attack, but so is
the traditional block chain. I also do not see any sybil attacks that are
strengthened by this change, because it does not modify the networking
code.

I guess if, after the soft fork happened, miners began to not include the
UTXO commitment anymore, it would lower the overall network hash rate, but
this would be self-harming to the miners, so they have an incentive not to
do it.

Please let me know if I have missed something.
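Steps 1-2 of Paul's fresh-client startup amount to choosing the deepest usable commitment. A hypothetical sketch, with `min_depth` standing in for "deep enough to not get orphaned" (the threshold value is an assumption):

```python
# Illustrative commitment selection for a fresh client: after syncing
# headers, pick the most recent committed UTXO set that is buried deeply
# enough to be safe from reorgs, then sync blocks from there to the tip.
# All names and the depth threshold are hypothetical.

def choose_sync_start(headers, commitment_heights, min_depth=1000):
    """headers: synced header chain (index = height).
    commitment_heights: heights of blocks carrying UTXO commitments.
    Returns the height to start block sync from (genesis if none qualify)."""
    tip = len(headers) - 1
    deep = [h for h in commitment_heights if tip - h >= min_depth]
    return max(deep) if deep else 0
```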
* Re: [Bitcoin-development] Chain pruning
From: Pieter Wuille @ 2014-04-10 18:32 UTC (permalink / raw)
To: Paul Rabahy; +Cc: Bitcoin Dev

On Thu, Apr 10, 2014 at 8:19 PM, Paul Rabahy <prabahy@gmail.com> wrote:
> Please let me know if I have missed something.

A 51% attack can make you believe you were paid, while you weren't.

Full node security right now validates everything - there is no way you can
ever be made to believe something invalid. The only attacks against it are
about which version of valid history eventually gets chosen. If you trust
hashrate for determining which UTXO set is valid, a 51% attack becomes
worse in that you can be made to believe a version of history which is in
fact invalid.

-- 
Pieter
* Re: [Bitcoin-development] Chain pruning
From: Tier Nolan @ 2014-04-10 20:12 UTC (permalink / raw)
To: Pieter Wuille; +Cc: Bitcoin Dev, Paul Rabahy

On Thu, Apr 10, 2014 at 7:32 PM, Pieter Wuille <pieter.wuille@gmail.com> wrote:
> If you trust hashrate for determining which UTXO set is valid, a 51%
> attack becomes worse in that you can be made to believe a version of
> history which is in fact invalid.

If there are invalidation proofs, then this isn't strictly true. If you are
connected to 10 nodes and only 1 is honest, it can send you the proof that
your main chain is invalid.

For bad scripts, it shows you the input transaction for the invalid input,
along with the merkle path to prove it is in a previous block. For double
spends, it could show the transaction which spent the output. Double spends
are pretty much the same as trying to spend non-existent outputs anyway.

If the UTXO set commitment were actually a merkle tree, then all updates
could be included. Blocks could carry extra data with the proofs that the
UTXO set is being updated correctly. To update the UTXO set, you need the
merkle paths for all spent inputs.

It puts a large load on miners to keep things working, since they have to
run a full node. If they commit the data to the chain, then SPV nodes can
do local checking. One of them will find invalid blocks eventually (even if
the other miners don't).
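The merkle-path check such fraud proofs rely on takes only a few lines: given a leaf hash, the sibling hashes up the tree, and the block's committed root, anyone can verify inclusion without the full block. The double-SHA256 matches Bitcoin's tree hashing; the proof framing itself is illustrative:

```python
# Verify that a leaf (e.g. a transaction hash) is included under a merkle
# root, given the path of sibling hashes from leaf to root. This is the
# primitive a fraud proof uses to show an offending transaction really
# appeared in an earlier block.
import hashlib

def dsha(b: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def verify_path(leaf_hash, path, root):
    """path: list of (sibling_hash, sibling_is_left) pairs, bottom-up."""
    node = leaf_hash
    for sibling, sibling_is_left in path:
        node = dsha(sibling + node) if sibling_is_left else dsha(node + sibling)
    return node == root
```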
* Re: [Bitcoin-development] Chain pruning
  2014-04-10 20:12 ` Tier Nolan
@ 2014-04-10 20:29 ` Pieter Wuille
  0 siblings, 0 replies; 21+ messages in thread
From: Pieter Wuille @ 2014-04-10 20:29 UTC (permalink / raw)
To: Tier Nolan; +Cc: Bitcoin Dev, Paul Rabahy

On Thu, Apr 10, 2014 at 10:12 PM, Tier Nolan <tier.nolan@gmail.com> wrote:
> On Thu, Apr 10, 2014 at 7:32 PM, Pieter Wuille <pieter.wuille@gmail.com>
> wrote:
>>
>> If you trust hashrate for determining which UTXO set is valid, a 51%
>> attack becomes worse in that you can be made to believe a version of
>> history which is in fact invalid.
>
> If there are invalidation proofs, then this isn't strictly true.

I'm aware of fraud proofs, and they're a very cool idea. They allow you
to leverage some "herd immunity" in the system (assuming you'll be told
about invalid data you received without actually validating it).

However, they are certainly not the same thing as the zero-trust security
a fully validating node offers. Consider, for example, a sybil attack
that hides the actual best chain and its fraud proofs from you, while
feeding you a chain that commits to an invalid UTXO set.

There are many ideas that make attacks harder, and they're probably good
ideas to deploy, but there is little that achieves the security of a full
node. (Well, perhaps a zero-knowledge proof of having run the validation
code against the claimed chain tip to produce the known UTXO set...)

-- 
Pieter

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bitcoin-development] Chain pruning
  2014-04-10 18:19 ` Paul Rabahy
  2014-04-10 18:32 ` Pieter Wuille
@ 2014-04-10 19:36 ` Mark Friedenbach
  1 sibling, 0 replies; 21+ messages in thread
From: Mark Friedenbach @ 2014-04-10 19:36 UTC (permalink / raw)
To: bitcoin-development

You took the quote out of context:

"a full node can copy the chain state from someone else, and check that
its hash matches what the block chain commits to. It's important to note
that this is a strict reduction in security: we're now trusting that the
longest chain (with most proof of work) commits to a valid UTXO set (at
some point in the past)."

The described synchronization mechanism would be to determine the
most-work block header (SPV level of security!), and then sync the UTXO
set committed to within that block. This is strictly less security than
building the UTXO set yourself, because it is susceptible to a 51% attack
which violates protocol rules.

On 04/10/2014 11:19 AM, Paul Rabahy wrote:
> You say UTXO commitments are "a strict reduction in security". If UTXO
> commitments were rolled in as a soft fork, I do not see any new attacks
> that could happen to a person trusting the committed UTXO set + any
> remaining blocks to catch up to the head.
>
> I would imagine the soft fork to proceed similar to the following:
> 1. Miners begin including UTXO commitments.
> 2. Miners begin rejecting blocks with invalid UTXO commitments.
> 3. Miners begin rejecting blocks with no UTXO commitments.
>
> To start up a fresh client, it would follow these steps:
> 1. Sync headers.
> 2. Pick a committed UTXO set that is deep enough to not get orphaned.
> 3. Sync blocks from the commitment to the head.
>
> I would argue that a client following this methodology is strictly more
> secure than SPV, and I don't see any attacks that make it less secure
> than a full client. It is obviously still susceptible to a 51% attack,
> but so is the traditional block chain. I also do not see any sybil
> attacks that are strengthened by this change, because it is not
> modifying the networking code.
>
> I guess if the soft fork happened, then miners began to not include the
> UTXO commitment anymore, it would lower the overall network hash rate,
> but this would be self-harming to the miners, so they have an incentive
> not to do it.
>
> Please let me know if I have missed something.

^ permalink raw reply [flat|nested] 21+ messages in thread
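Paul's fresh-client startup steps could be sketched roughly like this (a hypothetical outline: `Header`, `pick_committed_header`, and the flat-hash commitment are stand-ins for whatever structure an actual soft fork would specify, and the real network syncing is elided):

```python
import hashlib
from dataclasses import dataclass

def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

@dataclass
class Header:
    height: int
    utxo_commitment: bytes  # hash of the serialized UTXO set at this height

def pick_committed_header(headers: list, min_depth: int) -> Header:
    """Step 2: choose a commitment buried deep enough that a reorg
    past it is considered unlikely."""
    tip = headers[-1].height
    candidates = [h for h in headers if tip - h.height >= min_depth]
    return candidates[-1]  # the most recent sufficiently-buried commitment

def verify_utxo_snapshot(snapshot: bytes, header: Header) -> bool:
    """Check a downloaded UTXO set against the commitment in the
    chosen header before replaying the remaining blocks to the tip."""
    return sha256d(snapshot) == header.utxo_commitment
```

The client would then fetch and fully validate only the blocks between the chosen commitment and the chain head.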
* Re: [Bitcoin-development] Chain pruning
  2014-04-10 16:59 ` Pieter Wuille
  2014-04-10 17:06 ` Brian Hoffman
  2014-04-10 18:19 ` Paul Rabahy
@ 2014-04-10 21:34 ` Jesus Cea
  2014-04-10 22:15 ` Mark Friedenbach
  2 siblings, 1 reply; 21+ messages in thread
From: Jesus Cea @ 2014-04-10 21:34 UTC (permalink / raw)
To: bitcoin-development

On 10/04/14 18:59, Pieter Wuille wrote:
> It's important to
> note that this is a strict reduction in security: we're now trusting
> that the longest chain (with most proof of work) commits to a valid
> UTXO set (at some point in the past).

AFAIK, current bitcoin code already sets blockchain checkpoints from time
to time. It is guaranteed that a longer chain starting before the
checkpoint is not going to be accepted suddenly. See
<https://bitcointalk.org/index.php?topic=194078.0>.

It could be perfectly valid to store only unspent outputs from before the
last checkpoint, if during the blockchain download the node did all the
checks.

It would be interesting, of course, to be able to verify the unspent
output accounting having only that checkpoint data (the merkle tree can
do that, I guess). So you could detect data corruption or manipulation on
your local hard disk.

-- 
Jesús Cea Avión / jcea@jcea.es - http://www.jcea.es/ / Twitter: @jcea
jabber / xmpp:jcea@jabber.org
"Things are not so easy" / "My name is Dump, Core Dump"
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz

^ permalink raw reply [flat|nested] 21+ messages in thread
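The local-corruption check Jesús asks about could, given a merkleized commitment, amount to recomputing the root over the stored entries and comparing it to the committed value (a toy sketch with made-up helper names; real UTXO-commitment proposals use an authenticated tree that also supports compact update proofs):

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(leaves: list) -> bytes:
    """Flat merkle root over serialized UTXO entries (an odd trailing
    leaf is duplicated, as in Bitcoin's transaction tree)."""
    level = [sha256d(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

def utxo_set_intact(stored_entries: list, committed_root: bytes) -> bool:
    """Detect disk corruption or manipulation by recomputing the root
    that the checkpointed block committed to."""
    return merkle_root(stored_entries) == committed_root
```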
* Re: [Bitcoin-development] Chain pruning
  2014-04-10 21:34 ` Jesus Cea
@ 2014-04-10 22:15 ` Mark Friedenbach
  2014-04-10 22:24 ` Jesus Cea
  0 siblings, 1 reply; 21+ messages in thread
From: Mark Friedenbach @ 2014-04-10 22:15 UTC (permalink / raw)
To: bitcoin-development

Checkpoints will go away, eventually.

On 04/10/2014 02:34 PM, Jesus Cea wrote:
> On 10/04/14 18:59, Pieter Wuille wrote:
>> It's important to
>> note that this is a strict reduction in security: we're now trusting
>> that the longest chain (with most proof of work) commits to a valid
>> UTXO set (at some point in the past).
>
> AFAIK, current bitcoin code already sets blockchain checkpoints from
> time to time. It is guaranteed that a longer chain starting before the
> checkpoint is not going to be accepted suddenly. See
> <https://bitcointalk.org/index.php?topic=194078.0>.
>
> It could be perfectly valid to store only unspent outputs from before
> the last checkpoint, if during the blockchain download the node did all
> the checks.
>
> It would be interesting, of course, to be able to verify the unspent
> output accounting having only that checkpoint data (the merkle tree can
> do that, I guess). So you could detect data corruption or manipulation
> on your local hard disk.

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bitcoin-development] Chain pruning
  2014-04-10 22:15 ` Mark Friedenbach
@ 2014-04-10 22:24 ` Jesus Cea
  2014-04-10 22:33 ` Gregory Maxwell
  0 siblings, 1 reply; 21+ messages in thread
From: Jesus Cea @ 2014-04-10 22:24 UTC (permalink / raw)
To: bitcoin-development

On 11/04/14 00:15, Mark Friedenbach wrote:
> Checkpoints will go away, eventually.

Why? The points in the forum thread seem pretty sensible.

^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bitcoin-development] Chain pruning
  2014-04-10 22:24 ` Jesus Cea
@ 2014-04-10 22:33 ` Gregory Maxwell
  0 siblings, 0 replies; 21+ messages in thread
From: Gregory Maxwell @ 2014-04-10 22:33 UTC (permalink / raw)
To: Jesus Cea; +Cc: Bitcoin Development

On Thu, Apr 10, 2014 at 3:24 PM, Jesus Cea <jcea@jcea.es> wrote:
> On 11/04/14 00:15, Mark Friedenbach wrote:
>> Checkpoints will go away, eventually.
> Why? The points in the forum thread seem pretty sensible.

Because with headers-first synchronization the major problems that they
solve (e.g. block-flooding DoS attacks, weak-chain isolation, and
validation shortcutting) can be addressed in other, more efficient ways
that don't result in putting trust in third parties. Checkpoints also
cause really severe confusion about the security model.

Instead you can embed in the software the knowledge that the longest
chain is "at least this long" to prevent isolation attacks, which is a
lot simpler and less trusting. You can also do randomized validation of
the deeply buried old history for performance, instead of constantly
depending on 'trusted parties' to update the software or it gets slower
over time, and locally save your own validation fingerprints so that if
you need to reinitialize data you can remember what you've checked so
far by hash.

^ permalink raw reply [flat|nested] 21+ messages in thread
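Greg's embedded "at least this long" knowledge is, in spirit, a hard-coded minimum cumulative proof of work for the accepted chain. A rough sketch with hypothetical names (the per-block work formula, ~2^256 / (target + 1), is the one Bitcoin uses for chain-work comparison):

```python
# Reject any candidate chain whose total proof of work falls below a
# value compiled into the software, so an attacker who isolates the
# node cannot feed it a short, low-work chain.

def work_from_target(target: int) -> int:
    """Expected work contributed by one block with the given target."""
    return (1 << 256) // (target + 1)

def chain_is_plausible(block_targets: list, minimum_chain_work: int) -> bool:
    """True if the chain's cumulative work meets the compiled-in floor."""
    total = sum(work_from_target(t) for t in block_targets)
    return total >= minimum_chain_work
```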
* Re: [Bitcoin-development] Chain pruning
  2014-04-10 16:28 ` Mike Hearn
  2014-04-10 16:47 ` Brian Hoffman
@ 2014-04-10 16:52 ` Ricardo Filipe
  1 sibling, 0 replies; 21+ messages in thread
From: Ricardo Filipe @ 2014-04-10 16:52 UTC (permalink / raw)
To: Mike Hearn; +Cc: Bitcoin Dev

Anyway, any kind of compression that comes to the blockchain is
orthogonal to pruning. I agree that you will probably want some kind of
replication on more recent blocks than on older ones. However, nodes
with older blocks don't need to be "static"; let the block distribution
algorithm sort it out.

2014-04-10 17:28 GMT+01:00 Mike Hearn <mike@plan99.net>:
> Suggestions always welcome!
>
> The main problem with this is that the block chain is mostly random
> bytes (hashes, keys) so it doesn't compress that well. It compresses a
> bit, but not enough to change the fundamental physics.
>
> However, that does not mean the entire chain has to be stored on
> expensive rotating platters. I've suggested that in some Star Trek
> future where the chain really is gigantic, it could be stored on tape
> and spooled off at high speed. Literally a direct DMA from tape drive
> to NIC. But we're not there yet :)

^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2014-04-10 22:33 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-10 11:37 [Bitcoin-development] Chain pruning Mike Hearn
2014-04-10 11:57 ` Wladimir
2014-04-10 12:10 ` Gregory Maxwell
2014-04-10 14:19 ` Wladimir
2014-04-10 16:23 ` Brian Hoffman
2014-04-10 16:28 ` Mike Hearn
2014-04-10 16:47 ` Brian Hoffman
2014-04-10 16:54 ` Ricardo Filipe
2014-04-10 16:56 ` Brian Hoffman
2014-04-10 16:59 ` Pieter Wuille
2014-04-10 17:06 ` Brian Hoffman
2014-04-10 18:19 ` Paul Rabahy
2014-04-10 18:32 ` Pieter Wuille
2014-04-10 20:12 ` Tier Nolan
2014-04-10 20:29 ` Pieter Wuille
2014-04-10 19:36 ` Mark Friedenbach
2014-04-10 21:34 ` Jesus Cea
2014-04-10 22:15 ` Mark Friedenbach
2014-04-10 22:24 ` Jesus Cea
2014-04-10 22:33 ` Gregory Maxwell
2014-04-10 16:52 ` Ricardo Filipe