From: Adam Back <adam.back@gmail.com>
To: Mike Hearn
Cc: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Date: Sat, 21 Feb 2015 05:12:54 +0000
Subject: Re: [Bitcoin-development] bloom filtering, privacy

It seems Greg and I may be saying different things. Maybe I am
misunderstanding something at the wire level or about size/scalability,
but what I was trying to say is, I think, simpler.

By UTXO commitment I mean that a merkle tree of unspent addresses is
committed in each block. A bloom filter containing the addresses in the
block is also committed.

The user downloads the bloom filter for each block and searches it
locally. They see which of their addresses may be in the block (with
some false positives). Then they make fresh random connections to
different nodes and request download of the respective individual
transactions from the full node. The node can respond either:

a) here is the transaction and here is its merkle path in the block's
merkle tree (this is the way SPV works today); or

b) there is no such transaction, this is a false positive, and here is
a pair of merkle paths in the UTXO commitment (a trie) that proves the
full node is not withholding and that it is true no such transaction is
in the block.

Additionally, with UTXO commitments, in case a) the user can keep up to
date with the chain tip and request from the full node a merkle path in
the UTXO commitment showing that the coin is still unspent. (Otherwise
you get long-range attacks where someone can grind away until they
eventually find a PoW for an old, low-hashrate block containing the
UTXO, and fool an SPV node whose address they know into accepting a
spend of something long since spent.)
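To make case a) concrete, here is a minimal illustrative sketch (mine,
not part of the proposal) of the client-side checks: test a
scriptpubkey against the committed per-block bloom filter locally, and
verify a returned transaction's merkle path against the block's merkle
root. The filter's hashing scheme and the function names are
assumptions for illustration, not a wire format.

    import hashlib

    def dsha256(data: bytes) -> bytes:
        """Bitcoin-style double SHA-256."""
        return hashlib.sha256(hashlib.sha256(data).digest()).digest()

    def bloom_contains(filter_bits: bytes, n_hash_funcs: int, item: bytes) -> bool:
        """Local membership test against a committed per-block filter."""
        n_bits = len(filter_bits) * 8
        for i in range(n_hash_funcs):
            h = int.from_bytes(dsha256(i.to_bytes(4, "little") + item), "little")
            bit = h % n_bits
            if not (filter_bits[bit // 8] >> (bit % 8)) & 1:
                return False   # definitely not in this block
        return True            # maybe in this block (false positives possible)

    def verify_merkle_path(txid: bytes, path: list, merkle_root: bytes) -> bool:
        """Fold (sibling_hash, sibling_is_right) pairs from leaf to root."""
        node = txid
        for sibling, sibling_is_right in path:
            node = dsha256(node + sibling if sibling_is_right else sibling + node)
        return node == merkle_root

Case b) is the mirror image: instead of one inclusion path, the node
returns two adjacent paths in the UTXO commitment trie that bracket
where the queried key would sit, and the client checks both against the
committed trie root.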
About privacy: the user can make separate random connections to
different full nodes to fetch the transactions for different addresses.
Nothing is leaked by downloading the bloom filter; scanning happens
locally. The full node can't correlate the addresses as belonging to
the same person by correlating the download requests for them, because
the requests are made via different nodes. It is not a surprise, nor
privacy-revealing, that someone somewhere would want to check receipt
of funds. The limit is the interactive nature of Tor, which isn't very
secure against pervasive monitoring; but as basic privacy against a
passive full-node attacker it is pretty good.

Contrast that with increasing the false positive rate on bloom queries:
there the full node can test correlations (modulo the false positive
ratio) and combine that with network flow analysis to narrow down who
the user might be. Query size and privacy are also in conflict, and the
query size has to be continually retuned just to maintain a reliable
false positive rate relative to the current UTXO set. Is that even
happening now (other than parameter sets)?
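As a back-of-the-envelope illustration of that retuning point (my own
numbers, not from the thread): the approximate per-element false
positive rate of a bloom filter with m bits, k hash functions and n
inserted elements is (1 - e^(-k*n/m))^k, and the number of spurious
hits additionally scales with how many scriptpubkeys get tested against
the filter, so a rate chosen once has to be revisited as both grow.

    from math import exp

    def bloom_false_positive_rate(n_elements: int, m_bits: int, k_hashes: int) -> float:
        """Approximate false positive rate: (1 - e^(-k*n/m))^k."""
        return (1.0 - exp(-k_hashes * n_elements / m_bits)) ** k_hashes

    m, k = 4096, 3                 # a 512-byte filter with 3 hash functions (example values)
    for n in (100, 500, 2000):     # growing number of inserted elements
        print(n, round(bloom_false_positive_rate(n, m, k), 4))
    # prints roughly 0.0004, 0.029 and 0.45 respectively

A filter tuned to provide a given amount of cover at one point in time
stops providing it once the element count changes, which is the
retuning problem above.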
About the bitmap:

> Greg Maxwell wrote:
>> there is a scriptpubkey bitmap per block
>> which is committed. Users can request the map, and be SPV confident
>> that they received a faithful one. If there are hits, they can request
>> the block and be confident there was no censoring.

How does the SPV client know what the bits in this map mean, in order
to scan it? I presume these would be one bit per address, and one would
need to know the full UTXO set in order to know what is in there. I am
not sure an SPV node would want the overhead of keeping up to date with
the full UTXO set. (s/address/scriptpubkey throughout, for accuracy.)

Adam

On 20 February 2015 at 19:03, Mike Hearn wrote:
>> It's a straight forward idea: there is a scriptpubkey bitmap per block
>> which is committed. Users can request the map, and be SPV confident
>> that they received a faithful one. If there are hits, they can request
>> the block and be confident there was no censoring.
>
> OK, I see now, thanks Gregory. You're right, the use of UTXO set in that
> context was confusing me.
>
> If I go back to when we first did Bloom filtering and imagine the same
> proposal being made, I guess I would have been wondering about the
> following issues. Perhaps they have solutions:
>
> 1. Miners have to upgrade to generate the per-block filters. Any block
> that doesn't have such a filter has to be downloaded in full, still. So
> it would have taken quite a while for the bandwidth savings to
> materialise.
>
> 2. If checking the filter isn't a consensus rule, any miner who builds a
> wrong filter breaks the entire SPV userbase. With per-node filtering, a
> malicious or wrong node breaks an individual sync, but if the wallet user
> notices they don't seem to be properly synchronised they can just press
> "Rescan chain" and most likely get fixed. In practice broken nodes have
> never been reported, but it's worth considering.
>
> 3. Downloading full blocks is still a lot of data. If you have a wallet
> that receives tips a couple of times per day, and you open your wallet
> once per week, then with 1mb blocks you would be downloading ~14mb of
> data each time. Pretty pokey even on a desktop. Much sadness if you're
> on mobile.
>
> 4. Header size is constant, but per-block filters wouldn't be. So even
> the null sync would download more data as the network got busier. Of
> course Bloom filtering has the same scaling problem, but only between
> hard disk -> RAM -> CPU rather than across the network.
>
> 5. Is it really more private? Imagine we see a hit in block 100, so we
> download the full block and find our transaction. But now we must also
> learn when that transaction is spent, so we can keep the wallet-local
> UTXO set up to date. So we scan forward and see another hit in block 105,
> so now we download block 105. The remote node can now select all
> transactions in block 105 that spend transactions in block 100 and narrow
> down which txs are ours. If we are watching a wallet that does
> multi-sends then it feels like this problem gets much worse.
>
> I'd really like to find a solution that has O(1) scaling on sync. If we
> see nodes just as sources of UTXO data then shovelling the client
> (tx, existing merkle path) pairs keyed off script prefixes would (with
> one additional index) be O(1) and provide the same security guarantees
> as Bloom filtering today. It creates lots of other problems to solve,
> but at least it would scale into the forseeable future.
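For what it's worth, a rough sketch (my own illustration, not a spec)
of the script-prefix-keyed index Mike describes in that last paragraph:
the serving node buckets (tx, merkle path) pairs by a short prefix of
each output's scriptPubKey, and a client requests a whole bucket,
gaining some ambiguity because many unrelated scripts share the prefix.
The prefix length and the names are arbitrary choices for illustration.

    from collections import defaultdict

    PREFIX_LEN = 2   # bytes of scriptPubKey used as the bucket key (arbitrary example)

    class ScriptPrefixIndex:
        def __init__(self) -> None:
            # prefix -> list of (raw_tx, merkle_path) pairs
            self._buckets = defaultdict(list)

        def add(self, script_pubkey: bytes, raw_tx: bytes, merkle_path: list) -> None:
            self._buckets[script_pubkey[:PREFIX_LEN]].append((raw_tx, merkle_path))

        def lookup(self, script_pubkey: bytes) -> list:
            """Everything indexed under the same prefix, matches and decoys alike."""
            return list(self._buckets[script_pubkey[:PREFIX_LEN]])

The client then filters the returned bucket locally for exact script
matches and verifies each merkle path against the relevant block
header, so the security story stays the same as with per-node bloom
filtering today.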