From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from sog-mx-1.v43.ch3.sourceforge.com ([172.29.43.191]
	helo=mx.sourceforge.net)
	by sfs-ml-2.v29.ch3.sourceforge.com with esmtp (Exim 4.76)
	(envelope-from <mh.in.england@gmail.com>) id 1YOqwR-0002Tp-IU
	for bitcoin-development@lists.sourceforge.net;
	Fri, 20 Feb 2015 17:00:39 +0000
Received-SPF: pass (sog-mx-1.v43.ch3.sourceforge.com: domain of gmail.com
	designates 74.125.82.178 as permitted sender)
	client-ip=74.125.82.178; envelope-from=mh.in.england@gmail.com;
	helo=mail-we0-f178.google.com; 
Received: from mail-we0-f178.google.com ([74.125.82.178])
	by sog-mx-1.v43.ch3.sourceforge.com with esmtps (TLSv1:RC4-SHA:128)
	(Exim 4.76) id 1YOqwP-0006Kk-Uf
	for bitcoin-development@lists.sourceforge.net;
	Fri, 20 Feb 2015 17:00:39 +0000
Received: by wesw55 with SMTP id w55so6771088wes.4
	for <bitcoin-development@lists.sourceforge.net>;
	Fri, 20 Feb 2015 09:00:31 -0800 (PST)
MIME-Version: 1.0
X-Received: by 10.194.93.134 with SMTP id cu6mr20197460wjb.79.1424451246739;
	Fri, 20 Feb 2015 08:54:06 -0800 (PST)
Sender: mh.in.england@gmail.com
Received: by 10.194.188.11 with HTTP; Fri, 20 Feb 2015 08:54:06 -0800 (PST)
In-Reply-To: <CALqxMTE2doZjbsUxd-e09+euiG6bt_J=_BwKY_Ni3MNK6BiW1Q@mail.gmail.com>
References: <CALqxMTE2doZjbsUxd-e09+euiG6bt_J=_BwKY_Ni3MNK6BiW1Q@mail.gmail.com>
Date: Fri, 20 Feb 2015 17:54:06 +0100
X-Google-Sender-Auth: Vgd_ua7cO-txyia9as5FVPLV7Gk
Message-ID: <CANEZrP32M-hSU-a1DA5aTQXsx-6425sTeKW-m-cSUuXCYf+zuQ@mail.gmail.com>
From: Mike Hearn <mike@plan99.net>
To: Adam Back <adam@cypherspace.org>
Content-Type: multipart/alternative; boundary=047d7bb7092c8fd6b5050f87e621
X-Spam-Score: -0.5 (/)
X-Spam-Report: Spam Filtering performed by mx.sourceforge.net.
	See http://spamassassin.org/tag/ for more details.
	-1.5 SPF_CHECK_PASS SPF reports sender host as permitted sender for
	sender-domain
	0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider
	(mh.in.england[at]gmail.com)
	-0.0 SPF_PASS               SPF: sender matches SPF record
	1.0 HTML_MESSAGE           BODY: HTML included in message
	0.1 DKIM_SIGNED            Message has a DKIM or DK signature,
	not necessarily valid
	-0.1 DKIM_VALID Message has at least one valid DKIM or DK signature
X-Headers-End: 1YOqwP-0006Kk-Uf
Cc: Bitcoin Dev <bitcoin-development@lists.sourceforge.net>
Subject: Re: [Bitcoin-development] bloom filtering, privacy
X-BeenThere: bitcoin-development@lists.sourceforge.net
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: <bitcoin-development.lists.sourceforge.net>
List-Unsubscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=unsubscribe>
List-Archive: <http://sourceforge.net/mailarchive/forum.php?forum_name=bitcoin-development>
List-Post: <mailto:bitcoin-development@lists.sourceforge.net>
List-Help: <mailto:bitcoin-development-request@lists.sourceforge.net?subject=help>
List-Subscribe: <https://lists.sourceforge.net/lists/listinfo/bitcoin-development>,
	<mailto:bitcoin-development-request@lists.sourceforge.net?subject=subscribe>
X-List-Received-Date: Fri, 20 Feb 2015 17:00:39 -0000

--047d7bb7092c8fd6b5050f87e621
Content-Type: text/plain; charset=UTF-8

Hey Adam,


> Mike had posted a detailed response on the topic on why it's complex
> and becomes bandwidth inefficient to improve it usefully.
>

To clarify, we *could* improve privacy and still preserve usefully high
performance; it's just a lot of complicated programming work. For example,
you need to find out from the OS how much bandwidth you have to play with,
and then do all the very complex tracking needed to surf the wave and keep
yourself in roughly the right place.

> The basic summary of which I think is that it's not even intended to
> provide any practical privacy protection, it's just about compacting
> the query for a set of addresses.
>

The original intent of Bloom filtering was to allow both. We want our cake
and we want to eat it.

The protocol can still do that, with sufficiently smart clients. The
problem is that being sufficiently smart in this regard has never come to
the top of the TODO list - users are always complaining about other things,
so those things are what gets priority.

It's not IMO a protocol issue per se. It's a code complexity and manpower
issue.


> It seems surprising no one thought of it
> that way before (as it seems obvious when you hear it) but that seems
> to address the privacy issues as the user can fetch the block bloom
> filters and then scan them in complete privacy.


And then what? So you know the block matches. But with reasonable FP rates
every block will match at least a few transactions (this is already the
case - the FP rate is low but high enough that we get back FPs on nearly
every block). So you end up downloading every block? That won't work.
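
To put rough numbers on that, here is a minimal sketch. It assumes each
element tested against the filter is an independent false positive with the
same probability; the FP rates and the figure of 5000 elements per block are
made up for illustration, not measurements.

```python
def block_match_probability(fp_rate: float, elements_per_block: int) -> float:
    """Chance that at least one element in a block falsely matches the filter.

    Assumes every element is an independent trial with probability fp_rate.
    """
    return 1.0 - (1.0 - fp_rate) ** elements_per_block


def expected_false_positives(fp_rate: float, elements_per_block: int) -> float:
    """Average number of false-positive elements per block."""
    return fp_rate * elements_per_block


if __name__ == "__main__":
    elements = 5000  # hypothetical count of outputs/inputs tested per block
    for fp_rate in (0.0001, 0.001, 0.01):  # hypothetical FP rates
        p = block_match_probability(fp_rate, elements)
        n = expected_false_positives(fp_rate, elements)
        print(f"fp_rate={fp_rate}: P(block matches)={p:.3f}, expected FPs={n:.1f}")
```

Under these assumptions even a fairly low per-element FP rate makes a
per-block filter match most blocks, which is the "download every block"
problem.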

Eventually, wallets need to stop doing linear scans of the entire block
chain to find tx data. That worked fine when blocks were 10 KB, and it's
still working OK even though we've scaled through two orders of magnitude,
but we can imagine that if we reach 10 MB blocks then this whole approach
will just be too slow.
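
As a back-of-the-envelope sketch of how the scan cost grows: the figure of
144 blocks per day assumes one block every ten minutes, and the 30-day gap
and 1 MB/s scan throughput below are hypothetical. It counts the block data
a linear scan has to work through, wherever that scan runs.

```python
BLOCKS_PER_DAY = 144  # assumes roughly one block every 10 minutes


def daily_scan_bytes(block_size_bytes: int) -> int:
    """Block data a linear scan must cover for one day of chain."""
    return block_size_bytes * BLOCKS_PER_DAY


def catch_up_seconds(days_behind: int, block_size_bytes: int,
                     scan_bytes_per_sec: int) -> float:
    """Time to scan days_behind days of blocks at a given throughput."""
    return days_behind * daily_scan_bytes(block_size_bytes) / scan_bytes_per_sec


if __name__ == "__main__":
    # Block sizes from the text: 10 KB, two orders of magnitude up, and 10 MB.
    for size in (10_000, 1_000_000, 10_000_000):
        secs = catch_up_seconds(30, size, 1_000_000)  # hypothetical 30 days, 1 MB/s
        print(f"{size} byte blocks: {daily_scan_bytes(size)} bytes/day, "
              f"{secs:.0f} s to catch up on 30 days")
```

The cost is linear in block size, so a wallet that has been offline for a
while pays proportionally more at every step up.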

The main reason wallets are scanning the chain today (beyond the lack of
protocol support for querying the UTXO set by script) is that they want to
show users time-ordered lists of transactions. Financial apps should show
you payment histories, everyone knows this, and without knowing roughly
when a tx happened and which inputs/outputs belong to the user, providing a
useful rendering is hard. Even with this data the UI is pretty useless, but
at least it's not actually missing.

By combining Subspace and BIP70 we can finally replace the payments list UI
with actual proper metadata that isn't extracted from the block chain, and
at that point non-scanning architectures become a lot more deployable.

--047d7bb7092c8fd6b5050f87e621--