From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <johanth@gmail.com>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 91B0192F
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Tue, 22 May 2018 09:23:33 +0000 (UTC)
X-Greylist: whitelisted by SQLgrey-1.7.6
Received: from mail-wm0-f65.google.com (mail-wm0-f65.google.com [74.125.82.65])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id E00082C6
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Tue, 22 May 2018 09:23:31 +0000 (UTC)
Received: by mail-wm0-f65.google.com with SMTP id m129-v6so31395791wmb.3
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Tue, 22 May 2018 02:23:31 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025;
	h=mime-version:in-reply-to:references:from:date:message-id:subject:to
	:cc; bh=CQpWOdWC5TJOZ+fnY5HPAUByCinZwRcXEZ6Zm4uzRgI=;
	b=YgKrKEFoJ+DPYTLAIr88oONYMTmtMPbawHtYdCOJ9Ab/2qVpFu3hcPkriN4KwieOD7
	3ySBQ8PI817WNRHVYd9BXrlhML5QrB0oGlIhU1XWWbHP3ivIogR1wgb49mxxoB5UiaHa
	WHDWeYw4dS94bUvdkj7P+FX8MEhLv+Lz88m8a8l6u23tB2/Dnlams6+wq3Fj4S2UURqF
	76lmYQ0NBNnStgPbskEEPDLm4JY0cl0sSiXyORauuzbKURhwStwLk/0Pb6SKkm3bhee/
	nyWwLLqao1PQddO+UlphZXvuwsIYdPxEJzsa64uvQSTQ3mYrjDgH3VKhiQODj+o1WcOz
	vvUg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:mime-version:in-reply-to:references:from:date
	:message-id:subject:to:cc;
	bh=CQpWOdWC5TJOZ+fnY5HPAUByCinZwRcXEZ6Zm4uzRgI=;
	b=YZkmsJMz4kloGrJF1LMKTJifMhBAn8uBjWlNtofw9ni8MhZVp24S36agqWuIj63SyQ
	u7m9dX6zdcdoNq7xiQMMbyPlmgaTXKQ36ct97+B0ivsh5Min/spu4xx0oBTWwgCCCnCT
	7pn+f0ahPo5O3yfrgivly3ED3Z5KpVLI6MARllthGnUvMrR5sensGAtgjkBULclTA7DY
	Q1GnUA2WajgiGs2ylvVaZHIZ6OMGo8OWjD6Xar1hVV3ybOf3uP8F7ZoNm3ugU7iGZ+3P
	aYamG99faMwehhU/TVaVS2NUsEXwV/8eJ6zbF/4UqcigIdRphiZ5QrC/0mXESRJ8zdjC
	bQIw==
X-Gm-Message-State: ALKqPwc7aXMokPaGpDtSNmEs4FMyaT5ZrmLorBvyss63Amp/FC1zy7ed
	oxQaBK0gAeSe+Sy2TDP3OPxPI2bCyqoaWTwvhwo=
X-Google-Smtp-Source: AB8JxZpGp2iy8+AntF9zRHieIkV7/J3pw0vt0A+HVoZNi95GwhzEYtnJg0raZVkJxMIncWX3nn3XBIvnWVL8UZL7BNU=
X-Received: by 2002:a2e:5559:: with SMTP id
	j86-v6mr15288933ljb.147.1526981010346; 
	Tue, 22 May 2018 02:23:30 -0700 (PDT)
MIME-Version: 1.0
Received: by 2002:a19:d04d:0:0:0:0:0 with HTTP; Tue, 22 May 2018 02:23:29
	-0700 (PDT)
In-Reply-To: <CAO3Pvs_MA4TtgCCu1NgCBjK2bZRN+rKnGQJN6m4yTrViBXRiPA@mail.gmail.com>
References: <d43c6082-1b2c-c95b-5144-99ad0021ea6c@mattcorallo.com>
	<CAAS2fgRF-MhOvpFY6c_qAPzNMo3GQ28RExdSbOV6Q6Oy2iWn1A@mail.gmail.com>
	<22d375c7-a032-8691-98dc-0e6ee87a4b08@mattcorallo.com>
	<CAAS2fgR3QRHeHEjjOS1ckEkL-h7=Na56G12hYW9Bmy9WEMduvg@mail.gmail.com>
	<CADZtCShLmH_k-UssNWahUNHgHvWQQ1y638LwaOfnJEipwjbiYg@mail.gmail.com>
	<CAAS2fgQLCN_cuZ-3QPjCLfYOtHfEk=SenTn5=y9LfGzJxLPR3Q@mail.gmail.com>
	<CADZtCSjYr6VMBVQ=rx44SgRWcFSXhVXUZJB=rHMh4X78Z2eY1A@mail.gmail.com>
	<CAO3Pvs9K3n=OzVQ06XGQvzNC+Aqp9S60kWM9VRPA8hWTJ3u9BQ@mail.gmail.com>
	<c23a5346-9f99-44f0-abbf-d7e7979bf1d8@gmail.com>
	<CAO3Pvs_MA4TtgCCu1NgCBjK2bZRN+rKnGQJN6m4yTrViBXRiPA@mail.gmail.com>
From: =?UTF-8?Q?Johan_Tor=C3=A5s_Halseth?= <johanth@gmail.com>
Date: Tue, 22 May 2018 11:23:29 +0200
Message-ID: <CAD3i26BibcaMdbQv-j+Egz_1y0GuhzepBp5ATNpj=Qv8hi1TVA@mail.gmail.com>
To: Olaoluwa Osuntokun <laolu32@gmail.com>
Content-Type: multipart/alternative; boundary="000000000000b3566d056cc7f8ba"
X-Spam-Status: No, score=-2.0 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, HTML_MESSAGE,
	RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
X-Mailman-Approved-At: Tue, 22 May 2018 13:09:05 +0000
Cc: Bitcoin Protocol Discussion <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Tue, 22 May 2018 09:23:33 -0000

--000000000000b3566d056cc7f8ba
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Maybe I didn't make it clear, but the distinction is that the current track
allocates
one service bit for each "filter type", where it has to be agreed upon up
front what
elements such a filter type contains.

My suggestion was to advertise a bitfield for each filter type the node
serves,
where the bitfield indicates what elements are part of the filters. This
essentially
removes the notion of decided filter types and instead leaves the decision
to
full-nodes.

This would require a "getcftypes" message, of course.

- Johan


On Tue, May 22, 2018 at 3:16 AM, Olaoluwa Osuntokun <laolu32@gmail.com>
wrote:

> > What if instead of trying to decide up front which subset of elements
> will
> > be most useful to include in the filters, and the size tradeoff, we let
> the
> > full-node decide which subsets of elements it serves filters for?
>
> This is already the case. The current "track" is to add new service bits
> (while we're in the uncommitted phase) to introduce new fitler types. Lig=
ht
> clients can then filter out nodes before even connecting to them.
>
> -- Laolu
>
> On Mon, May 21, 2018 at 1:35 AM Johan Tor=C3=A5s Halseth <johanth@gmail.c=
om>
> wrote:
>
>> Hi all,
>>
>> Most light wallets will want to download the minimum amount of data
>> required to operate, which means they would ideally download the smalles=
t
>> possible filters containing the subset of elements they need.
>>
>> What if instead of trying to decide up front which subset of elements
>> will be most useful to include in the filters, and the size tradeoff, we
>> let the full-node decide which subsets of elements it serves filters for=
?
>>
>> For instance, a full node would advertise that it could serve filters fo=
r
>> the subsets 110 (txid+script+outpoint), 100 (txid only), 011 (script+out=
point)
>> etc. A light client could then choose to download the minimal filter typ=
e
>> covering its needs.
>>
>> The obvious benefit of this would be minimal bandwidth usage for the
>> light client, but there are also some less obvious ones. We wouldn=E2=80=
=99t have
>> to decide up front what each filter type should contain, only the possib=
le
>> elements a filter can contain (more can be added later without breaking
>> existing clients). This, I think, would let the most served filter types
>> grow organically, with full-node implementations coming with sane defaul=
ts
>> for served filter types (maybe even all possible types as long as the
>> number of elements is small), letting their operator add/remove types at
>> will.
>>
>> The main disadvantage of this as I see it, is that there=E2=80=99s an ex=
ponential
>> blowup in the number of possible filter types in the number of element
>> types. However, this would let us start out small with only the elements=
 we
>> need, and in the worst case the node operators just choose to serve the
>> subsets corresponding to what now is called =E2=80=9Cregular=E2=80=9D + =
=E2=80=9Cextended=E2=80=9D filters
>> anyway, requiring no more resources.
>>
>> This would also give us some data on what is the most widely used filter
>> types, which could be useful in making the decision on what should be pa=
rt
>> of filters to eventually commit to in blocks.
>>
>> - Johan
>> On Sat, May 19, 2018 at 5:12, Olaoluwa Osuntokun via bitcoin-dev <
>> bitcoin-dev@lists.linuxfoundation.org> wrote:
>>
>> On Thu, May 17, 2018 at 2:44 PM Jim Posen via bitcoin-dev <bitcoin-
>>
>>> Monitoring inputs by scriptPubkey vs input-txid also has a massive
>>>> advantage for parallel filtering: You can usually known your pubkeys
>>>> well in advance, but if you have to change what you're watching block
>>>> N+1 for based on the txids that paid you in N you can't filter them
>>>> in parallel.
>>>>
>>>
>>> Yes, I'll grant that this is a benefit of your suggestion.
>>>
>>
>> Yeah parallel filtering would be pretty nice. We've implemented a serial
>> filtering for btcwallet [1] for the use-case of rescanning after a seed
>> phrase import. Parallel filtering would help here, but also we don't yet
>> take advantage of batch querying for the filters themselves. This would
>> speed up the scanning by quite a bit.
>>
>> I really like the filtering model though, it really simplifies the code,
>> and we can leverage identical logic for btcd (which has RPCs to fetch th=
e
>> filters) as well.
>>
>> [1]: https://github.com/Roasbeef/btcwallet/blob/master/chain/
>> neutrino.go#L180
>>
>> _______________________________________________ bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org https://lists.linuxfoundation.
>> org/mailman/listinfo/bitcoin-dev
>>
>>

--000000000000b3566d056cc7f8ba
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

<div dir=3D"ltr">Maybe I didn&#39;t make it clear, but the distinction is t=
hat the current track allocates<div>one service bit for each &quot;filter t=
ype&quot;, where it has to be agreed upon up front what</div><div>elements =
such a filter type contains.</div><div><br></div><div>My suggestion was to =
advertise a bitfield for each filter type the node serves,</div><div>where =
the bitfield indicates what elements are part of the filters. This essentia=
lly</div><div>removes the notion of decided filter types and instead leaves=
 the decision to=C2=A0</div><div>full-nodes.</div><div><br></div><div>This =
would require a &quot;getcftypes&quot; message, of course.=C2=A0<br></div><=
div><br></div><div>- Johan=C2=A0</div><div><br></div><div><div class=3D"gma=
il_extra"><br><div class=3D"gmail_quote">On Tue, May 22, 2018 at 3:16 AM, O=
laoluwa Osuntokun <span dir=3D"ltr">&lt;<a href=3D"mailto:laolu32@gmail.com=
" target=3D"_blank">laolu32@gmail.com</a>&gt;</span> wrote:<br><blockquote =
class=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid=
;padding-left:1ex"><div dir=3D"ltr"><span class=3D""><div>&gt; What if inst=
ead of trying to decide up front which subset of elements will</div><div>&g=
t; be most useful to include in the filters, and the size tradeoff, we let =
the</div><div>&gt; full-node decide which subsets of elements it serves fil=
ters for?</div><div><br></div></span><div>This is already the case. The cur=
rent &quot;track&quot; is to add new service bits</div><div>(while we&#39;r=
e in the uncommitted phase) to introduce new fitler types. Light</div><div>=
clients can then filter out nodes before even connecting to them.</div><div=
><br></div><div>-- Laolu</div><div><br></div><div class=3D"gmail_quote"><di=
v dir=3D"ltr">On Mon, May 21, 2018 at 1:35 AM Johan Tor=C3=A5s Halseth &lt;=
<a href=3D"mailto:johanth@gmail.com" target=3D"_blank">johanth@gmail.com</a=
>&gt; wrote:<br></div><blockquote class=3D"gmail_quote" style=3D"margin:0 0=
 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><img class=3D"m_766180=
4442707420502m_-8850601494583743300cloudmagic-smart-beacon" src=3D"https://=
tr.cloudmagic.com/h/v6/emailtag/tag/2.0/1526891728/b33cfe15f93c4412b9d3504f=
ec672697/2/b53988f48cda345112c1586d420d31cf/076962fa5a1f7737d8541e3d8681273=
b/903bb174d9946b90a95b4da487a91d69/newton.gif" style=3D"border:0;width:10px=
;height:10px" width=3D"10" height=3D"10" align=3D"right"><div><div class=3D=
"h5"><div dir=3D"auto"><span>Hi all,</span><div><span><br></span></div><div=
><span>Most light wallets will want to download the minimum amount of data =
required to operate, which means they would ideally download the smallest p=
ossible filters containing the subset of elements they need.=C2=A0</span></=
div><div><span><br></span></div><div><span>What if instead of trying to dec=
ide up front which subset of elements will be most useful to include in the=
 filters, and the size tradeoff, we let the full-node decide which subsets =
of elements it serves filters for?<br></span><span><br>For instance, a full=
 node would advertise that it could serve filters for the subsets 110 (txid=
+script+outpoint), 100 (txid only), 011 (</span>script+outpoint) etc. A lig=
ht client could then choose to download the minimal filter type covering it=
s needs.=C2=A0</div><div><br></div><div>The obvious benefit of this would b=
e minimal bandwidth usage for the light client, but there are also some les=
s obvious ones. We wouldn=E2=80=99t have to decide up front what each filte=
r type should contain, only the possible elements a filter can contain (mor=
e can be added later without breaking existing clients). This, I think, wou=
ld let the most served filter types grow organically, with full-node implem=
entations coming with sane defaults for served filter types (maybe even all=
 possible types as long as the number of elements is small), letting their =
operator add/remove types at will.</div><div><br></div><div>The main disadv=
antage of this as I see it, is that there=E2=80=99s an exponential blowup i=
n the number of possible filter types in the number of element types. Howev=
er, this would let us start out small with only the elements we need, and i=
n the worst case the node operators just choose to serve the subsets corres=
ponding to what now is called =E2=80=9Cregular=E2=80=9D + =E2=80=9Cextended=
=E2=80=9D filters anyway, requiring no more resources.</div><div><br></div>=
<div>This would also give us some data on what is the most widely used filt=
er types, which could be useful in making the decision on what should be pa=
rt of filters to eventually commit to in blocks.</div></div><div dir=3D"aut=
o"><div><br></div><div>- Johan</div></div><div dir=3D"auto"><div><div id=3D=
"m_7661804442707420502m_-8850601494583743300cm_replymail_content_wrap"><div=
 class=3D"m_7661804442707420502m_-8850601494583743300cm_replymail_content_1=
526889092_wrapper">On Sat, May 19, 2018 at 5:12, Olaoluwa Osuntokun via bit=
coin-dev &lt;<a href=3D"mailto:bitcoin-dev@lists.linuxfoundation.org" targe=
t=3D"_blank">bitcoin-dev@lists.<wbr>linuxfoundation.org</a>&gt; wrote:<br><=
/div></div></div></div><div dir=3D"auto"><div><div id=3D"m_7661804442707420=
502m_-8850601494583743300cm_replymail_content_wrap"><div class=3D"m_7661804=
442707420502m_-8850601494583743300cm_replymail_content_1526889092_wrapper">=
<div id=3D"m_7661804442707420502m_-8850601494583743300cm_replymail_content_=
1526889092" style=3D"overflow:visible"><blockquote style=3D"margin:0;border=
-left:#d6d6d6 1px solid;padding-left:10px"><div dir=3D"ltr"><div class=3D"g=
mail_quote"><div dir=3D"ltr">On Thu, May 17, 2018 at 2:44 PM Jim Posen via =
bitcoin-dev &lt;bitcoin- </div><blockquote class=3D"gmail_quote" style=3D"m=
argin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir=3D"l=
tr"><div class=3D"gmail_extra"><div class=3D"gmail_quote"><blockquote class=
=3D"gmail_quote" style=3D"margin:0 0 0 .8ex;border-left:1px #ccc solid;padd=
ing-left:1ex">Monitoring inputs by scriptPubkey vs input-txid also has a ma=
ssive<br>
advantage for parallel filtering:  You can usually known your pubkeys<br>
well in advance, but if you have to change what you&#39;re watching block<b=
r>
 N+1 for based on the txids that paid you in N you can&#39;t filter them<br=
>
in parallel.<br></blockquote><div><br></div></div></div></div><div dir=3D"l=
tr"><div class=3D"gmail_extra"><div class=3D"gmail_quote"><div>Yes, I&#39;l=
l grant that this is a benefit of your suggestion.</div></div></div></div><=
/blockquote><div><br></div><div>Yeah parallel filtering would be pretty nic=
e. We&#39;ve implemented a serial filtering for btcwallet [1] for the use-c=
ase of rescanning after a seed phrase import. Parallel filtering would help=
 here, but also we don&#39;t yet take advantage of batch querying for the f=
ilters themselves. This would speed up the scanning by quite a bit. </div><=
div><br></div><div>I really like the filtering model though, it really simp=
lifies the code, and we can leverage identical logic for btcd (which has RP=
Cs to fetch the filters) as well. </div><div><br></div><div>[1]: <a href=3D=
"https://github.com/Roasbeef/btcwallet/blob/master/chain/neutrino.go#L180" =
target=3D"_blank">https://github.com/Roasbeef/<wbr>btcwallet/blob/master/ch=
ain/<wbr>neutrino.go#L180</a> </div></div></div>
<br></blockquote></div></div></div></div></div><div dir=3D"auto"><div><div =
id=3D"m_7661804442707420502m_-8850601494583743300cm_replymail_content_wrap"=
><div class=3D"m_7661804442707420502m_-8850601494583743300cm_replymail_cont=
ent_1526889092_wrapper"><div id=3D"m_7661804442707420502m_-8850601494583743=
300cm_replymail_content_1526889092" style=3D"overflow:visible"><blockquote =
style=3D"margin:0;border-left:#d6d6d6 1px solid;padding-left:10px">________=
______________________<wbr>_________________
bitcoin-dev mailing list
<a href=3D"mailto:bitcoin-dev@lists.linuxfoundation.org" target=3D"_blank">=
bitcoin-dev@lists.<wbr>linuxfoundation.org</a>
<a href=3D"https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev" =
target=3D"_blank">https://lists.linuxfoundation.<wbr>org/mailman/listinfo/b=
itcoin-<wbr>dev</a>
</blockquote></div></div></div></div></div></div></div></blockquote></div><=
/div>
</blockquote></div><br></div></div></div>

--000000000000b3566d056cc7f8ba--