* [Bitcoin-development] New P2P commands for diagnostics, SPV clients @ 2012-06-13 20:46 Jeff Garzik 2012-06-14 11:52 ` Mike Hearn 0 siblings, 1 reply; 21+ messages in thread From: Jeff Garzik @ 2012-06-13 20:46 UTC (permalink / raw) To: Bitcoin Development

An IRC discussion today covered additional needs of lightweight clients. Here is a draft of proposed new P2P commands, and associated behavior changes. This is not meant to be a formal, detailed specification but rather a rough picture of the preferred direction.

-----
filterinit(false positive rate, number of elements): initialize a per-connection bloom filter with the given parameters. If the parameters would create a too-large table, the operation fails. Returns a 'filterparams' message with bloom filter construction details.

filterload(data): input serialized bloom filter table metadata and data.

filterclear(): remove any filtering associated with the current connection.

filteradd(hash data): add a single hash to the bloom filter. WARNING: although easier to use, this has privacy implications. filterload shrouds the hash list; filteradd does not. It is also less efficient to send a stream of filteradd's to the remote node.

mempool(): list TX's in the remote node's memory pool.
-----

'filterload' and 'filteradd' enable special behavior changes for 'mempool' and existing P2P commands, whereby only transactions matching the bloom filter will be announced to the connection, and only matching transactions will be sent inside serialized blocks.

A lightweight ("SPV") client would issue 'filterload', sync up with blocks, then use 'mempool' to sync up to current TX's. The 'filterload' command also ensures that the client is only sent 'inv' messages etc. for the TX's it is probably interested in. The 'mempool' command is thought to be useful as a diagnostic, even if a bloom filter is not applied to its output.
A bloom filter match would need to notice activity on existing coins (via CTxIn->prevout) and activity on a bitcoin address (via CTxOut). -- Jeff Garzik exMULTI, Inc. jgarzik@exmulti.com ^ permalink raw reply [flat|nested] 21+ messages in thread
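The matching rule Jeff sketches — noticing spends of watched outpoints (CTxIn->prevout) and payments to watched scripts (CTxOut) — can be made concrete. A toy illustration in Python; the hash construction and the dict-based transaction stand-in are invented for this sketch and are not bitcoind's implementation:

```python
import hashlib

class BloomFilter:
    """Toy bloom filter; the hash scheme is illustrative, not bitcoind's."""
    def __init__(self, size_bits=1024, n_hashes=4):
        self.size = size_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, data: bytes):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.n_hashes):
            h = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(h[:4], "little") % self.size

    def insert(self, data: bytes):
        for pos in self._positions(data):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def contains(self, data: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(data))

def tx_matches(filt, tx):
    """Match per the proposal: spent outpoints (CTxIn->prevout) and
    output scripts (CTxOut).  'tx' is a dict stand-in for CTransaction."""
    for txin in tx["vin"]:
        if filt.contains(txin["prevout"]):
            return True
    for txout in tx["vout"]:
        if filt.contains(txout["script"]):
            return True
    return False
```

A client watching an address would insert its output script; a client watching a coin would insert the serialized outpoint.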
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-13 20:46 [Bitcoin-development] New P2P commands for diagnostics, SPV clients Jeff Garzik @ 2012-06-14 11:52 ` Mike Hearn 2012-06-15 11:52 ` Mike Hearn ` (3 more replies) 0 siblings, 4 replies; 21+ messages in thread From: Mike Hearn @ 2012-06-14 11:52 UTC (permalink / raw) To: Jeff Garzik; +Cc: Bitcoin Development > filterinit(false positive rate, number of elements): initialize > filterload(data): input a serialized bloom filter table metadata and data. Why not combine these two? > 'filterload' and 'filteradd' enable special behavior changes for > 'mempool' and existing P2P commands, whereby only transactions > matching the bloom filter will be announced to the connection, and > only matching transactions will be sent inside serialized blocks. Need to specify the format of how these arrive. It means that when a new block is found instead of inv<->getdata<->block we'd see something like inv<->getdata<->merkleblock where a "merkleblock" structure is a header + list of transactions + list of merkle branches linking them to the root. I think CMerkleTx already knows how to serialize this, but it redundantly includes the block hash which would not be necessary for a merkleblock message. ^ permalink raw reply [flat|nested] 21+ messages in thread
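The verification step a 'merkleblock' enables — linking a transaction to the block header through a merkle branch — can be sketched as follows. The function names are made up for the sketch, but the double-SHA256 pairing rule is Bitcoin's:

```python
import hashlib

def dhash(b: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def check_merkle_branch(tx_hash: bytes, branch: list, index: int) -> bytes:
    """Walk a merkle branch up to the root, as an SPV client would.
    Concatenation order at each level follows the low bit of the
    transaction's index within that level."""
    h = tx_hash
    for sibling in branch:
        if index & 1:
            h = dhash(sibling + h)   # we are the right child
        else:
            h = dhash(h + sibling)   # we are the left child
        index >>= 1
    return h
```

The client compares the returned value against the merkle root in the block header; a match proves inclusion without downloading the full block.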
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-14 11:52 ` Mike Hearn @ 2012-06-15 11:52 ` Mike Hearn 2012-06-15 13:19 ` Matt Corallo ` (2 subsequent siblings) 3 siblings, 0 replies; 21+ messages in thread From: Mike Hearn @ 2012-06-15 11:52 UTC (permalink / raw) To: Jeff Garzik; +Cc: Bitcoin Development > Need to specify the format of how these arrive. It means that when a > new block is found instead of inv<->getdata<->block we'd see something > like inv<->getdata<->merkleblock where a "merkleblock" structure is a > header + list of transactions + list of merkle branches linking them > to the root. Thinking about it some more and re-reading the Scalability wiki page, I remembered that a nice bandwidth optimization to the protocol is to distribute blocks as header+list of tx hashes. If a node has already seen that tx before (eg, it's in the mempool) there is no need to send it again. With the new command to download the contents of the mempool on startup, this means that blocks could potentially propagate across the network faster as download time is taken out of the equation, and indeed, with the signature cache the hard work of verifying is already done. So this could also help reduce orphan blocks and spurious chain splits. Are you planning on implementing any of this Jeff? I think we have the opportunity to kill a few birds with one or two stones. ^ permalink raw reply [flat|nested] 21+ messages in thread
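The optimization Mike recalls — relaying a block as a header plus transaction hashes and fetching only what the mempool lacks — amounts to a simple partition of the hash list. A sketch with a hypothetical helper name:

```python
def plan_block_download(block_tx_hashes, mempool):
    """Split a thin block's tx hashes into transactions we already hold
    (e.g. in the mempool) and ones we must still fetch via getdata."""
    have, need = [], []
    for h in block_tx_hashes:
        (have if h in mempool else need).append(h)
    return have, need
```

If the mempool is well synced, `need` is usually empty and the block "downloads" almost instantly, which is exactly the propagation benefit described above.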
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-14 11:52 ` Mike Hearn 2012-06-15 11:52 ` Mike Hearn @ 2012-06-15 13:19 ` Matt Corallo 2012-06-15 13:23 ` Mike Hearn 2012-06-15 13:26 ` Jeff Garzik 2012-06-15 15:43 ` Simon Barber 3 siblings, 1 reply; 21+ messages in thread From: Matt Corallo @ 2012-06-15 13:19 UTC (permalink / raw) To: Mike Hearn; +Cc: Bitcoin Development On Thu, 2012-06-14 at 13:52 +0200, Mike Hearn wrote: > > filterinit(false positive rate, number of elements): initialize > > filterload(data): input a serialized bloom filter table metadata and data. > > Why not combine these two? I believe it's because it allows the node which will have to use the bloom filter to scan transactions to choose how much effort it wants to put into each transaction on behalf of the SPV client. Though it's generally a small amount of CPU time/memory, if we end up with a drastic split between SPV nodes and only a few large network nodes, those nodes may wish to limit the CPU/memory usage each node is allowed to use, which may be important if you are serving 1000 SPV peers. It offers a sort of negotiation between SPV client and full node instead of letting the client specify it outright. > > > 'filterload' and 'filteradd' enable special behavior changes for > > 'mempool' and existing P2P commands, whereby only transactions > > matching the bloom filter will be announced to the connection, and > > only matching transactions will be sent inside serialized blocks. > > Need to specify the format of how these arrive. It means that when a > new block is found instead of inv<->getdata<->block we'd see something > like inv<->getdata<->merkleblock where a "merkleblock" structure is a > header + list of transactions + list of merkle branches linking them > to the root. I think CMerkleTx already knows how to serialize this, > but it redundantly includes the block hash which would not be > necessary for a merkleblock message. 
A series of CMerkleTx's might also end up redundantly encoding branches of the merkle tree, so, yes, as part of the BIP/implementation, I would say we probably want a CFilteredBlock or similar. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-15 13:19 ` Matt Corallo @ 2012-06-15 13:23 ` Mike Hearn 2012-06-15 14:39 ` Matt Corallo 0 siblings, 1 reply; 21+ messages in thread From: Mike Hearn @ 2012-06-15 13:23 UTC (permalink / raw) To: Matt Corallo; +Cc: Bitcoin Development > > Why not combine these two? > > I believe its because it allows the node which will have to use the > bloom filter to scan transactions to chose how much effort it wants to > put into each transaction on behalf of the SPV client. If that's the case then the negotiation protocol needs to be specified too. It seems heavy though. If a node is getting overloaded it could just disconnect intensive peers or refuse new connections. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-15 13:23 ` Mike Hearn @ 2012-06-15 14:39 ` Matt Corallo 2012-06-16 8:27 ` Mike Hearn 0 siblings, 1 reply; 21+ messages in thread From: Matt Corallo @ 2012-06-15 14:39 UTC (permalink / raw) To: Mike Hearn; +Cc: Bitcoin Development On Fri, 2012-06-15 at 15:23 +0200, Mike Hearn wrote: > > > Why not combine these two? > > > > I believe its because it allows the node which will have to use the > > bloom filter to scan transactions to chose how much effort it wants to > > put into each transaction on behalf of the SPV client. > > If that's the case then the negotiation protocol needs to be specified > too. It seems heavy though. If a node is getting overloaded it could > just disconnect intensive peers or refuse new connections. IMHO it already is. A node requests a filter using filterinit by specifying the false positive rate it wants and a guessed number of items. The node which will have to hold that filter then responds with the closest filter to what the SPV node requested that it is willing to provide. If the SPV node responds with a filterload command, it has accepted the offer; otherwise it will simply disconnect and find a better full node. I'd much rather have an overloaded node respond with 50% fp rate filters as an option if there aren't many full nodes available than simply disconnect SPV clients. At least that's my thinking, but you may be right that it is too heavy for too little gain. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-15 14:39 ` Matt Corallo @ 2012-06-16 8:27 ` Mike Hearn 2012-06-19 19:09 ` Matt Corallo 0 siblings, 1 reply; 21+ messages in thread From: Mike Hearn @ 2012-06-16 8:27 UTC (permalink / raw) To: Matt Corallo; +Cc: Bitcoin Development > I'd much rather have an overloaded node respond with 50% fp rate filters > as an option if there aren't many full nodes available than simply > disconnect SPV clients. I don't think the bloom filter settings have any impact on server-side load ... a node still has to check every transaction against the filter regardless of how that filter is configured, which means the same amount of disk io and processing. How can you reduce load on a peer by negotiating different filter settings? ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-16 8:27 ` Mike Hearn @ 2012-06-19 19:09 ` Matt Corallo 2012-07-21 11:45 ` Mike Hearn 0 siblings, 1 reply; 21+ messages in thread From: Matt Corallo @ 2012-06-19 19:09 UTC (permalink / raw) To: Mike Hearn; +Cc: Bitcoin Development On Sat, 2012-06-16 at 10:27 +0200, Mike Hearn wrote: > > I'd much rather have an overloaded node respond with 50% fp rate filters > > as an option if there aren't many full nodes available than simply > > disconnect SPV clients. > > I don't think the bloom filter settings have any impact on server-side > load ... a node still has to check every transaction against the > filter regardless of how that filter is configured, which means the > same amount of disk io and processing. > > How can you reduce load on a peer by negotiating different filter settings? Agreed, I was largely giving a reason why one may want to negotiate the filter settings in response to your question as to why it was done. As long as there are sane limits (you can't make a 1GB filter by specifying 0% fp and some crazy number of entries), filter negotiation largely isn't worth it (it also prevents any floats from appearing in the p2p protocol, though in either case it shouldn't be able to cause issues). Matt ^ permalink raw reply [flat|nested] 21+ messages in thread
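The "sane limits" Matt mentions are easy to make concrete: the standard bloom formula m = -n·ln(p)/ln²2 gives the table size for a requested false-positive rate, and the server simply clamps it. A sketch in Python — the 36,000-byte cap happens to be the value the eventual BIP 37 specification chose, but everything else here (function name, return shape) is invented for illustration:

```python
import math

MAX_FILTER_BYTES = 36000  # server-side cap on filter size

def negotiate_filter(fp_rate, n_elements):
    """Compute the optimal filter size for the requested false-positive
    rate, clamp it to the server's cap, and report the parameters (and
    the degraded fp rate, if clamped) the server is willing to serve."""
    m_bits = math.ceil(-n_elements * math.log(fp_rate) / math.log(2) ** 2)
    m_bits = min(m_bits, MAX_FILTER_BYTES * 8)
    # Optimal number of hash functions for this size/element count.
    k = max(1, round(m_bits / n_elements * math.log(2)))
    actual_fp = (1 - math.exp(-k * n_elements / m_bits)) ** k
    return {"size_bits": m_bits, "n_hashes": k, "fp_rate": actual_fp}
```

With a hard cap like this, a client asking for "0% fp and some crazy number of entries" just gets back a filter with an honestly degraded false-positive rate rather than a 1GB table.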
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-19 19:09 ` Matt Corallo @ 2012-07-21 11:45 ` Mike Hearn 2012-07-23 7:54 ` Andreas Petersson 0 siblings, 1 reply; 21+ messages in thread From: Mike Hearn @ 2012-07-21 11:45 UTC (permalink / raw) To: Matt Corallo; +Cc: Bitcoin Development One thing that occurred to me recently is that it'd be useful if filters could contain exact matches as well as Bloom filters. Specifically I'm thinking of things like my bond network proposal where some outputs may be marked as special using script fragments like "BOND" <data or hash of data> 2DROP. This would allow systems that are only interested in data and transactions relevant to bonds to exact-filter the chain on that marker, and then when a transaction is discovered, add the hash of that transaction to a parallel Bloom filter, ensuring you can see any transactions that connect to it.

The spec as provided by Jeff doesn't specify how filters are matched against transactions. I propose the following algorithm:

For each TX:
- Check if the hash of the tx itself matches the filter
- For each input:
  - For each script data element, check if it is found in the filter
  - Check if the COutPoint.hash value is in the filter (lets you select txns that connect to arbitrary txns of interest)
- For each output:
  - For each script data element, check if it is found in the filter

^ permalink raw reply [flat|nested] 21+ messages in thread
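Made concrete, Mike's proposed matching algorithm looks like the following Python sketch. The set-backed "filter" is a stand-in with the same interface as a bloom filter but no false positives, and the dict transaction layout is invented for the example:

```python
class SetFilter:
    """Stand-in for a bloom filter: same contains() interface, no FPs."""
    def __init__(self, items=()):
        self.items = set(items)

    def contains(self, data):
        return data in self.items

def match_transaction(filt, tx):
    # 1. The hash of the tx itself.
    if filt.contains(tx["hash"]):
        return True
    for txin in tx["vin"]:
        # 2a. Each script data element of each input.
        if any(filt.contains(e) for e in txin["script_elements"]):
            return True
        # 2b. The COutPoint hash (selects txns spending a watched txn).
        if filt.contains(txin["prevout_hash"]):
            return True
    # 3. Each script data element of each output.
    return any(filt.contains(e)
               for txout in tx["vout"]
               for e in txout["script_elements"])
```

Matching on script data elements is what makes both address watching and marker schemes like the "BOND" fragment fall out of one rule.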
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-07-21 11:45 ` Mike Hearn @ 2012-07-23 7:54 ` Andreas Petersson 2012-07-23 16:40 ` Matt Corallo 2012-07-24 8:16 ` Mike Hearn 0 siblings, 2 replies; 21+ messages in thread From: Andreas Petersson @ 2012-07-23 7:54 UTC (permalink / raw) To: bitcoin-development Some concerns regarding Bloom Filters. I talked with Stefan Thomas on the Hackathon Berlin about this. I tried to follow the discussion closely but I have not taken a look at the code yet (is there already an implementation?) so please correct me if I got something wrong. As the Bloom filters are planned now, this requires a complicated setup: basically the client will ask the server to replay the whole blockchain, but filtered. This is not optimal for the following reasons: This will require the server to do a full scan of his data and only filter out non-matching entries. Really lightweight clients (like Bitcoincard), clients with shared private keys (electrum-style), or brainwallets - will ask the following question quite often to "supernodes": Given my public keys/addresses, what is the list of unspent outputs. I think it would make sense to include such a command, instead or in addition to the filterload/filterinit. And perhaps more severe: as far as I understand classic bloom filters, the server has no method of indexing his data for the expected requests. There is simply no data structure (or maybe it has to be invented) which allows the use of an index for queries by bloom filters of *varying length* and a generic hashing method. I'm not sure what a really efficient data structure for these kinds of query is, but I think it should be possible to find a good compromise between query flexibility, server load, and client privacy. One possible scheme looks like this: the client takes his list of addresses he is interested in. 
He hashes all of them to a fixed-length bit array (bloom filter) of length 64KiB (for example), and combines them with | to add more 1's with each address. The server maintains a binary tree data structure of unspent outputs arranged by the Bloom filter bits. To build the tree, the server would need to calculate the 64KiB bits for each address and arrange them in a binary tree. That way he can easily traverse the tree for a given bloom query. If a client wishes to query more broadly he can calculate the bloom filter to 64KiB and after that fill up the last 50% of the bits with 1. Or 95%. The trailing 1 bits don't even need to be transmitted to the server when a client is querying. Of course, if the client is more privacy-concerned he could also fill up random bits with 1, which would not change much actually.

The value of 64KiB is just out of thin air. According to my experimentation using BloomFilter from Guava, currently even 8KiB would be sufficient to have a 3% false positive rate for the 40000 active addresses we have right now.

Someone more familiar with hashing should please give his opinion on whether cutting a bloom filter in half has any bad consequences. Andreas ^ permalink raw reply [flat|nested] 21+ messages in thread
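Andreas's fixed-length scheme can be sketched as follows. This uses 256 bits instead of 64KiB to keep the toy small, and the per-address hashing is invented for the illustration; the essential moves — OR-combining per-address patterns, and broadening a query by forcing trailing bits to 1 — are his:

```python
import hashlib

FILTER_BITS = 256  # Andreas suggests 64 KiB; kept tiny for the sketch

def address_bits(addr: bytes, m_bits: int = FILTER_BITS, k: int = 4) -> int:
    """Bit pattern one address contributes to the fixed-length filter."""
    bits = 0
    for i in range(k):
        h = hashlib.sha256(bytes([i]) + addr).digest()
        bits |= 1 << (int.from_bytes(h[:4], "big") % m_bits)
    return bits

def build_query(addresses, m_bits=FILTER_BITS, broaden_ones=0):
    """OR together per-address patterns; optionally force the top
    'broaden_ones' bits to 1, trading bandwidth/privacy for broader
    matches (more false positives)."""
    q = 0
    for a in addresses:
        q |= address_bits(a, m_bits)
    for b in range(m_bits - broaden_ones, m_bits):
        q |= 1 << b
    return q

def matches(query: int, addr: bytes, m_bits: int = FILTER_BITS) -> bool:
    """Server-side check: an address matches if all of its bits are set
    in the query, which is what lets the server walk a tree keyed on
    these fixed-length patterns."""
    pat = address_bits(addr, m_bits)
    return query & pat == pat
```

Because every query has the same length and bit layout, the server can index unspent outputs by these patterns, which is exactly the property varying-length bloom filters lack.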
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-07-23 7:54 ` Andreas Petersson @ 2012-07-23 16:40 ` Matt Corallo 2012-07-24 8:16 ` Mike Hearn 1 sibling, 0 replies; 21+ messages in thread From: Matt Corallo @ 2012-07-23 16:40 UTC (permalink / raw) To: bitcoin-development On Mon, 2012-07-23 at 09:54 +0200, Andreas Petersson wrote: > Some concerns regarding Bloom Filters. I talked with Stefan Thomas on > the Hackathon Berlin about this. > I tried to follow the discussion closely but i have not taken a look at > the code yet (is there already an implementation?) so please correct me > if i got something wrong. AFAIK there was no implementation. I pushed one for bitcoinj+bitcoind today that compiles, but I haven't tested it much further (though it's really quite a simple implementation): https://github.com/TheBlueMatt/bitcoin/commits/bloom http://code.google.com/r/bluemattme-bitcoinj/source/list?name=bloomfilter > The way the Bloom filters are planned now this requires a complicated > setup. basically the client will ask the server to replay the whole > blockchain, but filtered. > This is not optimal for the following reasons: > This will require the server to do a full scan of his data and only > filter out non-matching entries. My implementation has yet to implement block filtering; for now it's only tx inv filtering. However, it's really not that complicated and doing a scan of any individual transaction is very fast. So during the download phase, it really isn't much of any extra load on block chain providers (aside from having to load inputs in the current implementation, but that could be optimized some). > Really lightweight clients (like Bitcoincard), clients with shared > private keys (electrum-style), or brainwallets - will ask the following > question quite often to "supernodes": Given my public keys/addresses, > what is the list of unspent outputs. 
i think it would make sense to > include such a command, instead or in addition to the filterload/filterinit. > And perhaps more severe: as far as i understand classic bloom filters, > the server has no method of indexing his data for the expected requests. > There is simply no data structure (or maybe it has to be invented) which > allows the use of an index for queries by bloom filters of *varying > length* and a generic hashing method. > im not sure what a really efficient data structure for these kinds of > query is. but i think it should be possible to find a good compromise > between query flexibility, server load, client privacy. > one possible scheme, looks like this: > the client takes his list of addesses he is interested in. he hashes all > of them to a fixed-length bit array (bloom filter) of length 64KiB (for > example), and combines them with | to add more 1's with each address. > the server maintains a binary tree data structure of unspent outputs > arranged by the Bloom filter bits. > to build the tree, the server would need to calculate the 64KiB bits for > each address and arrange them in a binary tree. that way he can easily > traverse the tree for a given bloom query. > if a client whishes to query more broadly he can calculate the bloom > filter to 64KiB and after that fill up the last 50% of the Bits with 1. > or 95%. the trailing 1 bits even don't need to be transmitted to the > server when a client is querying. of course, if the client is more > privacy-concerned he could also fill up random bits with 1, which would > not change much actually. > > the value of 64KiB is just out of thin air. > according to my experimentation using BloomFilter from Guava - > currently, also 8KiB would be sufficient to hava a 3% false positive > rate for the 40000 active addresses we have right now. > > someone more familiar with hashing should please give his opinion if > cutting a bloom filter in half has any bad consequences. 
> > Andreas Though I like the idea of having a "give me all unspent outputs for my pubkeys" command, I see quite a future for clients somewhere between "I store nothing but pubkeys and don't know about the chain" and "I store a full chain", and the bloom filters as described are pretty useful for many clients in that in-between. Also, "really lightweight" clients (given that they don't know about the chain) should probably just stick with an electrum-style server-client system with specialized servers (IMHO) instead of relying on P2P nodes to provide them with a list of unspent outputs/etc. In response to Mike's what-the-filter-should-match: The way it is now, it just checks each input+output to see if the hash160 of the dest addr, hash160 of the pubkey or hash160 of the p2sh sh matches the filter as-is. From the list provided, I think adding a check to allow adding a specific outpoint to a filter would be nice. However, in terms of data elements in txes, I'm not so sure. It's by no means a bad idea, and it wouldn't be too much overhead, but making filters match a very broad set of criteria seems like a bit much to me, but maybe others disagree? Matt ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-07-23 7:54 ` Andreas Petersson 2012-07-23 16:40 ` Matt Corallo @ 2012-07-24 8:16 ` Mike Hearn 1 sibling, 0 replies; 21+ messages in thread From: Mike Hearn @ 2012-07-24 8:16 UTC (permalink / raw) To: Andreas Petersson; +Cc: bitcoin-development > Really lightweight clients (like Bitcoincard), clients with shared > private keys (electrum-style), or brainwallets - will ask the following > question quite often to "supernodes": Given my public keys/addresses, > what is the list of unspent outputs. i think it would make sense to > include such a command, instead or in addition to the filterload/filterinit. Ultra-lightweight clients like Electrum or smart cards have a fundamentally different security model to SPV clients, which means they cannot connect directly to the P2P network no matter what commands or db indexes are added. This seems to be a common point of confusion. Andreas brought up something similar in a chat yesterday. To connect to the P2P network, you MUST understand how to walk the block chain and handle re-orgs. This is not optional. The reason is that you are connected to random arbitrary nodes who can and maybe will lie to you. The block chain is a self-proving data structure, a node cannot lie about it or make you believe garbage unless they can outrun the rest of the miners combined. If all you're doing is asking a remote node to tell you about what coins are available, that node can simply say "guess what, you're a millionaire!" and you have no way to discover it's wrong. This can be dangerous in the case where you think you've received a payment but actually did not, e.g. because your internet connection got tampered with in some way. SPV clients have the same issue for zero-confirmed transactions, but once you see confirmations arriving at the expected speed you can be pretty sure the network accepted the transaction. 
For clients that don't understand the block chain, confirmations don't have any meaning. That's why Electrum requires a trusted server and connects to it via SSL. > And perhaps more severe: as far as i understand classic bloom filters, > the server has no method of indexing his data for the expected requests. It doesn't matter. CPU-wise, Bloom filtering of blocks is very cheap and can be trivially parallelised in the unlikely event it's necessary. The expensive part of serving a Bloom-filtered chain to an SPV client is simply moving the disk head into the right position and waiting for the platter to rotate. Blocks are stored sequentially, and modern hard disks, once positioned, transfer data at gigabit speeds, so requesting 1 or 2000 blocks is not significantly different. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-14 11:52 ` Mike Hearn 2012-06-15 11:52 ` Mike Hearn 2012-06-15 13:19 ` Matt Corallo @ 2012-06-15 13:26 ` Jeff Garzik 2012-06-15 13:43 ` Mike Hearn 2012-06-15 15:43 ` Simon Barber 3 siblings, 1 reply; 21+ messages in thread From: Jeff Garzik @ 2012-06-15 13:26 UTC (permalink / raw) To: Mike Hearn; +Cc: Bitcoin Development On Thu, Jun 14, 2012 at 7:52 AM, Mike Hearn <mike@plan99.net> wrote: >> filterinit(false positive rate, number of elements): initialize >> filterload(data): input a serialized bloom filter table metadata and data. > > Why not combine these two? This is a fair point that sipa raised. Consensus concluded that 'filterload' includes all necessary metadata required to initialize a bloom filter. That implies 'filterinit' would only be needed for 'filteradd'. If we don't think 'filteradd' has a compelling use case, filterinit + filteradd can be dropped. >> 'filterload' and 'filteradd' enable special behavior changes for >> 'mempool' and existing P2P commands, whereby only transactions >> matching the bloom filter will be announced to the connection, and >> only matching transactions will be sent inside serialized blocks. > > Need to specify the format of how these arrive. It means that when a > new block is found instead of inv<->getdata<->block we'd see something > like inv<->getdata<->merkleblock where a "merkleblock" structure is a > header + list of transactions + list of merkle branches linking them > to the root. I think CMerkleTx already knows how to serialize this, > but it redundantly includes the block hash which would not be > necessary for a merkleblock message. Yes, the format is something that must be hashed out (no pun intended). Need input from potential users about what information they might need. -- Jeff Garzik exMULTI, Inc. jgarzik@exmulti.com ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-15 13:26 ` Jeff Garzik @ 2012-06-15 13:43 ` Mike Hearn 2012-06-15 14:56 ` Matt Corallo ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread From: Mike Hearn @ 2012-06-15 13:43 UTC (permalink / raw) To: Jeff Garzik; +Cc: Bitcoin Development > Yes, the format is something that must be hashed out (no pun > intended). Need input from potential users about what information > they might need. Matt's point that a branch-per-transaction may duplicate data is well made; that said, I suspect a format that tries to fix this would be much more complicated. How about seeing this project as a three-part change?

First step - add the mempool command and make nodes sync up their mempools on startup.

Second step - if protocol version >= X, the "block" message consists of a header + num transactions + vector<hash> instead of the full transactions themselves. On receiving such a block, we go look to see which transactions we're missing from the mempool and request them with getdata. Each time we receive a tx message we check to see if it was one we were missing from a block. Once all transactions in the block message are in memory, we go ahead and assemble the block, then verify as per normal. This should speed up block propagation. Miners have an incentive to upgrade because it should reduce wasted work.

Third step - new message, getmerkletx takes a vector<hash> and returns a merkletx message: "merkle branch missing the root + transaction data itself" for each requested transaction. The filtering commands are added, so the block message now only lists transaction hashes that match the filter, which can then be requested with getmerkletx. ^ permalink raw reply [flat|nested] 21+ messages in thread
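For the "assemble the block, then verify" part of step two, the receiver recomputes the merkle root from the now-complete transaction list and checks it against the header. A sketch of Bitcoin's root computation in Python (the function name is illustrative; the duplicate-last-hash rule for odd levels is Bitcoin's):

```python
import hashlib

def dhash(b: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(b).digest()).digest()

def merkle_root(tx_hashes):
    """Hash pairs level by level, duplicating the last entry of an
    odd-length level, until a single root hash remains."""
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [dhash(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

If the recomputed root matches the header's merkle root, the hash list in the thin block was complete and unaltered, and full verification can proceed as normal.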
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-15 13:43 ` Mike Hearn @ 2012-06-15 14:56 ` Matt Corallo 2012-06-15 15:32 ` Jeff Garzik 2012-06-15 18:42 ` Amir Taaki 2 siblings, 0 replies; 21+ messages in thread From: Matt Corallo @ 2012-06-15 14:56 UTC (permalink / raw) To: Mike Hearn; +Cc: Bitcoin Development On Fri, 2012-06-15 at 15:43 +0200, Mike Hearn wrote: > > Yes, the format is something that must be hashed out (no pun > > intended). Need input from potential users about what information > > they might need. > > Matts point that a branch-per-transaction may duplicate data is well > made, that said, I suspect a format that tries to fix this would be > much more complicated. > > How about see this project as a three part change? > > First step - add the mempool command and make nodes sync up their > mempools on startup. ACK > > Second step - if protocol version >= X, the "block" message consists > of a header + num transactions + vector<hash> instead of the full > transactions themselves. If vector<hash> is sorted in the order of the merkle tree, you don't need to forward the merkle tree to non-filtered nodes, further saving some small amount of bandwidth. For filtered nodes, you would still need to forward merkle branches anyway. > > On receiving such a block, we go look to see which transactions we're > missing from the mempool and request them with getdata. Each time we > receive a tx message we check to see if it was one we were missing > from a block. Once all transactions in the block message are in > memory, we go ahead and assemble the block, then verify as per normal. > This should speed up block propagation. Miners have an incentive to > upgrade because it should reduce wasted work. ACK > > Third step - new message, getmerkletx takes a vector<hash> and returns > a merkletx message: "merkle branch missing the root + transaction data > itself" for each requested transaction. 
The filtering commands are > added, so the block message now only lists transaction hashes that > match the filter which can then be requested with getmerkletx. I really don't think it would be /that/ difficult to make it getmerkletxs vector<hashes> and then respond with a partial merkle tree for those transactions. Matt ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-15 13:43 ` Mike Hearn 2012-06-15 14:56 ` Matt Corallo @ 2012-06-15 15:32 ` Jeff Garzik 2012-06-15 16:20 ` Matt Corallo 2012-06-15 18:42 ` Amir Taaki 2 siblings, 1 reply; 21+ messages in thread From: Jeff Garzik @ 2012-06-15 15:32 UTC (permalink / raw) To: Mike Hearn; +Cc: Bitcoin Development On Fri, Jun 15, 2012 at 9:43 AM, Mike Hearn <mike@plan99.net> wrote: > How about see this project as a three part change? > > First step - add the mempool command and make nodes sync up their > mempools on startup. Here's the "mempool" implementation: https://github.com/bitcoin/bitcoin/pull/1470 SPV nodes would definitely want to sync up their mempool upon startup. As for full nodes... I like the organic growth and random nature of the mempools. On the fence, WRT full node mempool sync at startup. -- Jeff Garzik exMULTI, Inc. jgarzik@exmulti.com ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-15 15:32 ` Jeff Garzik @ 2012-06-15 16:20 ` Matt Corallo 0 siblings, 0 replies; 21+ messages in thread From: Matt Corallo @ 2012-06-15 16:20 UTC (permalink / raw) To: Jeff Garzik; +Cc: Bitcoin Development On Fri, 2012-06-15 at 11:32 -0400, Jeff Garzik wrote: > As for full nodes... I like the organic growth and random nature of > the mempools. On the fence, WRT full node mempool sync at startup. > I don't particularly care either way, but I have a feeling miners will really want that so that they can get fee-paying transactions right away. Matt ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients 2012-06-15 13:43 ` Mike Hearn 2012-06-15 14:56 ` Matt Corallo 2012-06-15 15:32 ` Jeff Garzik @ 2012-06-15 18:42 ` Amir Taaki 2012-06-16 8:25 ` Mike Hearn 2 siblings, 1 reply; 21+ messages in thread From: Amir Taaki @ 2012-06-15 18:42 UTC (permalink / raw) To: bitcoin-development Why though? The bottleneck is not network traffic but disk space usage/blockchain validation time. ----- Original Message ----- From: Mike Hearn <mike@plan99.net> To: Jeff Garzik <jgarzik@exmulti.com> Cc: Bitcoin Development <bitcoin-development@lists.sourceforge.net> Sent: Friday, June 15, 2012 3:43 PM Subject: Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients > Yes, the format is something that must be hashed out (no pun > intended). Need input from potential users about what information > they might need. Matts point that a branch-per-transaction may duplicate data is well made, that said, I suspect a format that tries to fix this would be much more complicated. How about see this project as a three part change? First step - add the mempool command and make nodes sync up their mempools on startup. Second step - if protocol version >= X, the "block" message consists of a header + num transactions + vector<hash> instead of the full transactions themselves. On receiving such a block, we go look to see which transactions we're missing from the mempool and request them with getdata. Each time we receive a tx message we check to see if it was one we were missing from a block. Once all transactions in the block message are in memory, we go ahead and assemble the block, then verify as per normal. This should speed up block propagation. Miners have an incentive to upgrade because it should reduce wasted work. Third step - new message, getmerkletx takes a vector<hash> and returns a merkletx message: "merkle branch missing the root + transaction data itself" for each requested transaction. 
The filtering commands are added, so the block message now only lists
transaction hashes that match the filter, which can then be requested
with getmerkletx.

_______________________________________________
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development
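The "second step" above can be sketched in a few lines. This is a minimal Python illustration of the idea, not any real implementation: the names `ThinBlock`, `missing_txs`, and `assemble` are hypothetical, and the "mempool" is just a dict from hash to transaction.

```python
# Illustrative sketch of relaying a block as header + tx hashes only.
# The receiver fills gaps from its mempool, requesting anything missing
# (in the real protocol this would be done with getdata) before assembly.

class ThinBlock:
    def __init__(self, header, tx_hashes):
        self.header = header
        self.tx_hashes = tx_hashes  # hashes only, not full transactions

def missing_txs(thin_block, mempool):
    """Hashes we still need before the block can be assembled."""
    return [h for h in thin_block.tx_hashes if h not in mempool]

def assemble(thin_block, mempool):
    """Reconstruct the full transaction list once nothing is missing."""
    if missing_txs(thin_block, mempool):
        raise ValueError("still waiting on transactions")
    return [mempool[h] for h in thin_block.tx_hashes]
```

The bandwidth win is that a node which already saw the transactions via normal relay only downloads the header and a hash list when the block arrives.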
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients
From: Mike Hearn @ 2012-06-16 8:25 UTC
To: Amir Taaki; +Cc: bitcoin-development

The bottleneck for the Android Bitcoin Wallet app is rapidly becoming
bandwidth and parse time.

On Fri, Jun 15, 2012 at 8:42 PM, Amir Taaki <zgenjix@yahoo.com> wrote:
> Why though? The bottleneck is not network traffic but disk space
> usage/blockchain validation time.
> [...]
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients
From: Simon Barber @ 2012-06-15 15:43 UTC
To: Mike Hearn; +Cc: Bitcoin Development

Separate filterinit / filterload, so that you can do a new filterload
later on if your list changes, without the privacy implications of
filteradd.

Simon

On Thu 14 Jun 2012 04:52:29 AM PDT, Mike Hearn wrote:
>> filterinit(false positive rate, number of elements): initialize
>> filterload(data): input a serialized bloom filter table metadata and data.
>
> Why not combine these two?
>
>> 'filterload' and 'filteradd' enable special behavior changes for
>> 'mempool' and existing P2P commands, whereby only transactions
>> matching the bloom filter will be announced to the connection, and
>> only matching transactions will be sent inside serialized blocks.
>
> Need to specify the format of how these arrive. It means that when a
> new block is found, instead of inv<->getdata<->block we'd see something
> like inv<->getdata<->merkleblock, where a "merkleblock" structure is a
> header + list of transactions + list of merkle branches linking them
> to the root. I think CMerkleTx already knows how to serialize this,
> but it redundantly includes the block hash, which would not be
> necessary for a merkleblock message.
* Re: [Bitcoin-development] New P2P commands for diagnostics, SPV clients
From: Jeff Garzik @ 2012-06-15 16:40 UTC
To: Simon Barber; +Cc: Bitcoin Development

On Fri, Jun 15, 2012 at 11:43 AM, Simon Barber <simon@superduper.net> wrote:
> separate filterinit / filterload - so you can do a new filterload later on
> if your list changes, without the privacy implications of filteradd.

filterload loads a whole new bloom filter from scratch, in one atomic
operation: params set, table sized, data input into table. A separate
filterinit does not make sense for filterload.

-- 
Jeff Garzik
exMULTI, Inc.
jgarzik@exmulti.com
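The filterinit parameters in the original proposal (false positive rate, number of elements) determine the filter's size and hash count via the standard Bloom formulas. A toy Python sketch of that sizing and of filteradd-style insertion; the class name and the SHA-256-based hashing are illustrative assumptions, not the hashing any final P2P spec would necessarily use:

```python
import hashlib
import math

class BloomFilter:
    """Toy per-connection filter sized from a target false-positive
    rate p and expected element count n, as filterinit proposes:
    m = -n*ln(p)/ln(2)^2 bits, k = (m/n)*ln(2) hash functions."""

    def __init__(self, fp_rate, n_elements):
        self.m = max(1, int(-n_elements * math.log(fp_rate)
                            / math.log(2) ** 2))
        self.k = max(1, int(round(self.m / n_elements * math.log(2))))
        self.bits = bytearray((self.m + 7) // 8)

    def _indexes(self, data):
        # k bit positions derived by salting SHA-256 (illustrative only)
        for i in range(self.k):
            d = hashlib.sha256(bytes([i]) + data).digest()
            yield int.from_bytes(d[:4], "big") % self.m

    def add(self, data):
        """filteradd: set the k bits for one element."""
        for idx in self._indexes(data):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def __contains__(self, data):
        # matches definitely-added elements; may rarely match others
        return all(self.bits[idx // 8] & (1 << (idx % 8))
                   for idx in self._indexes(data))
```

Jeff's point above maps onto this directly: a filterload-style operation would construct the sized table and install the bit data in one step, whereas filteradd mutates an existing filter one element at a time, revealing each element to the peer.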
end of thread, other threads:[~2012-07-24 8:16 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)

2012-06-13 20:46 [Bitcoin-development] New P2P commands for diagnostics, SPV clients Jeff Garzik
2012-06-14 11:52 ` Mike Hearn
2012-06-15 11:52 ` Mike Hearn
2012-06-15 13:19 ` Matt Corallo
2012-06-15 13:23 ` Mike Hearn
2012-06-15 14:39 ` Matt Corallo
2012-06-16 8:27 ` Mike Hearn
2012-06-19 19:09 ` Matt Corallo
2012-07-21 11:45 ` Mike Hearn
2012-07-23 7:54 ` Andreas Petersson
2012-07-23 16:40 ` Matt Corallo
2012-07-24 8:16 ` Mike Hearn
2012-06-15 13:26 ` Jeff Garzik
2012-06-15 13:43 ` Mike Hearn
2012-06-15 14:56 ` Matt Corallo
2012-06-15 15:32 ` Jeff Garzik
2012-06-15 16:20 ` Matt Corallo
2012-06-15 18:42 ` Amir Taaki
2012-06-16 8:25 ` Mike Hearn
2012-06-15 15:43 ` Simon Barber
2012-06-15 16:40 ` Jeff Garzik