From: Tamas Blummer <tamas.blummer@gmail.com>
To: Olaoluwa Osuntokun <laolu32@gmail.com>,
Bitcoin Protocol Discussion
<bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] BIP 158 Flexibility and Filter Size
Date: Thu, 31 May 2018 16:27:50 +0200 [thread overview]
Message-ID: <08A283BC-13E0-4A42-A13F-3F0EAAEF5440@gmail.com> (raw)
In-Reply-To: <CAO3Pvs-YDzfRqmyJ85wTH0ciccjCvkm5stGyP_tVGGna=PMv3A@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 5046 bytes --]
I processed the historic blockchain to create a single filter populated with spent input scripts and output scripts. The Golomb parameter was P=2^20
The resulting chart shows a volatile history of same-block address re-use with a notable drops in relative filter size during the early history and in the time window where SatoshiDICE was popular, since then trending higher.
The history of only the last half year suggests a current filter size of between 2.0% - 2.5% of block sizes.
Since most outputs are spent within a short time period, but apparently not that often in same blocks, I think it was worth considering filter series that match over a windows of 2^n blocks (n=(0…10)). Applications could then bracket the
range of interest and then narrow down requesting finer filters or blocks.
Then I created 1600 random (P2SH) scripts and totaled the false positive block download data size if observing 100, 200, 400, 800, 1600 of them.
The result suggests that even for 1600 the false positive overhead is less than 0.1% of blockchain data size.
I agree with Greg that we should optimize the parameters for a small observed set as those will be running on mobile devices.
As of Pieter’s findings the simulation parameters were optimal for ca. 1000 observed scripts which is maybe to many for a “small” application.
On the other hand we do not know the needs of future popular mobile applications.
With parameters of the simulation the current minimal data burden on a mobile wallet would be ca. 0.1 GB / Month.
Simulations with other parameters could be executed using this patch branch: https://github.com/tamasblummer/rust-bitcoin-spv/tree/blockfilterstats A run takes a few hours on a fast machine with release build and local bitcoind.
The calculation can not be reduced to the recent history as the process builds in-memory utxo from genesis.
The result of executing the binary is a CSV file containing:
blocknumber, blocksize, utxo size, filter size, false positive data size for 100, false positive data size for 100, … false positive data size for 100
e.g:
524994,1112181,57166825,21556,0,0,0,0,1112181
Tamas Blummer
> On May 29, 2018, at 06:01, Olaoluwa Osuntokun via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
>
> > The additional benefit of the input script/outpoint filter is to watch for
> > unexpected spends (coins getting stolen or spent from another wallet) or
> > transactions without a unique change or output address. I think this is a
> > reasonable implementation, and it would be nice to be able to download that
> > filter without any input elements.
>
> As someone who's implemented a complete integration of the filtering
> technique into an existing wallet, and a higher application I disagree.
> There's not much gain to be had in splitting up the filters: it'll result in
> additional round trips (to fetch these distinct filter) during normal
> operation, complicate routine seed rescanning logic, and also is detrimental
> to privacy if one is fetching blocks from the same peer as they've
> downloaded the filters from.
>
> However, I'm now convinced that the savings had by including the prev output
> script (addr re-use and outputs spent in the same block as they're created)
> outweigh the additional booking keeping required in an implementation (when
> extracting the precise tx that matched) compared to using regular outpoint
> as we do currently. Combined with the recently proposed re-parametrization
> of the gcs parameters[1], the filter size should shrink by quite a bit!
>
> I'm very happy with the review the BIPs has been receiving as of late. It
> would've been nice to have this 1+ year ago when the draft was initially
> proposed, but better late that never!
>
> Based on this thread, [1], and discussions on various IRC channels, I plan
> to make the following modifications to the BIP:
>
> 1. use P=2^19 and M=784931 as gcs parameters, and also bind these to the
> filter instance, so future filter types may use distinct parameters
> 2. use the prev output script rather than the prev input script in the
> regular filter
> 3. remove the txid from the regular filter(as with some extra book-keeping
> the output script is enough)
> 4. do away with the extended filter all together, as our original use case
> for it has been nerfed as the filter size grew too large when doing
> recursive parsing. instead we watch for the outpoint being spent and
> extract the pre-image from it if it matches now
>
> The resulting changes should slash the size of the filters, yet still ensure
> that they're useful enough for our target use case.
>
> [1]: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-May/016029.html
>
> -- Laolu
> _______________________________________________
> bitcoin-dev mailing list
> bitcoin-dev@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
[-- Attachment #2.1: Type: text/html, Size: 7130 bytes --]
[-- Attachment #2.2: filtersize.png --]
[-- Type: image/png, Size: 58848 bytes --]
[-- Attachment #2.3: lasthalfyear.png --]
[-- Type: image/png, Size: 50431 bytes --]
[-- Attachment #2.4: falsepositive.png --]
[-- Type: image/png, Size: 82663 bytes --]
next prev parent reply other threads:[~2018-05-31 14:27 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-05-17 15:25 [bitcoin-dev] BIP 158 Flexibility and Filter Size Matt Corallo
2018-05-17 15:43 ` Peter Todd
2018-05-17 15:46 ` Matt Corallo
2018-05-17 16:36 ` Gregory Maxwell
2018-05-17 16:59 ` Matt Corallo
2018-05-17 18:34 ` Gregory Maxwell
2018-05-17 18:34 ` Gregory Maxwell
2018-05-17 20:19 ` Jim Posen
2018-05-17 20:45 ` Gregory Maxwell
2018-05-17 21:27 ` Jim Posen
2018-05-19 3:12 ` Olaoluwa Osuntokun
2018-05-21 8:35 ` Johan Torås Halseth
2018-05-22 1:16 ` Olaoluwa Osuntokun
2018-05-22 9:23 ` Johan Torås Halseth
2018-05-23 0:42 ` Jim Posen
2018-05-23 7:38 ` Jim Posen
2018-05-23 8:16 ` Johan Torås Halseth
2018-05-23 17:28 ` Gregory Maxwell
2018-05-24 1:04 ` Conner Fromknecht
2018-05-24 3:48 ` Jim Posen
2018-05-28 18:18 ` Tamas Blummer
2018-05-28 18:28 ` Tamas Blummer
2018-05-28 19:24 ` Gregory Maxwell
2018-05-29 2:42 ` Jim Posen
2018-05-29 3:24 ` Gregory Maxwell
2018-05-29 4:01 ` Olaoluwa Osuntokun
2018-05-31 14:27 ` Tamas Blummer [this message]
2018-06-01 2:52 ` Olaoluwa Osuntokun
2018-06-01 4:15 ` Gregory Maxwell
[not found] ` <CAAS2fgSyVi0d_ixp-auRPPzPfFeffN=hsWhWT5=EzDO3O+Ue1g@mail.gmail.com>
2018-06-02 0:01 ` Olaoluwa Osuntokun
2018-06-02 0:22 ` Gregory Maxwell
2018-06-02 2:02 ` Jim Posen
2018-06-02 12:41 ` David A. Harding
2018-06-02 22:02 ` Tamas Blummer
2018-06-03 0:28 ` Gregory Maxwell
2018-06-03 5:14 ` Tamas Blummer
2018-06-03 6:11 ` Pieter Wuille
2018-06-03 16:44 ` Tamas Blummer
2018-06-03 16:50 ` Tamas Blummer
2018-06-08 5:03 ` Olaoluwa Osuntokun
2018-06-08 16:14 ` Gregory Maxwell
2018-06-08 23:35 ` Olaoluwa Osuntokun
2018-06-09 10:34 ` David A. Harding
2018-06-12 23:51 ` Olaoluwa Osuntokun
2018-06-09 15:45 ` Gregory Maxwell
2018-06-12 23:58 ` Olaoluwa Osuntokun
2018-05-18 8:46 ` Riccardo Casatta
2018-05-19 3:08 ` Olaoluwa Osuntokun
2018-05-19 2:57 ` Olaoluwa Osuntokun
2018-05-19 3:06 ` Pieter Wuille
2018-05-22 1:15 ` Olaoluwa Osuntokun
2018-05-18 6:28 ` Karl-Johan Alm
2018-06-04 8:42 ` Riccardo Casatta
2018-06-05 1:08 ` Jim Posen
2018-06-05 4:33 ` Karl-Johan Alm
2018-06-05 17:22 ` Jim Posen
2018-06-05 17:52 ` Gregory Maxwell
2018-06-06 1:12 ` Olaoluwa Osuntokun
2018-06-06 15:14 ` Riccardo Casatta
2018-05-19 2:51 ` Olaoluwa Osuntokun
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=08A283BC-13E0-4A42-A13F-3F0EAAEF5440@gmail.com \
--to=tamas.blummer@gmail.com \
--cc=bitcoin-dev@lists.linuxfoundation.org \
--cc=laolu32@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox