public inbox for bitcoindev@googlegroups.com
 help / color / mirror / Atom feed
* [bitcoin-dev] BIP Proposal: Compact Client Side Filtering for Light Clients
@ 2017-06-01 19:01 Olaoluwa Osuntokun
  2017-06-01 21:00 ` Eric Lombrozo
                   ` (6 more replies)
  0 siblings, 7 replies; 39+ messages in thread
From: Olaoluwa Osuntokun @ 2017-06-01 19:01 UTC (permalink / raw)
  To: Arnoud Kouwenhoven - Pukaki Corp via bitcoin-dev

[-- Attachment #1: Type: text/plain, Size: 5087 bytes --]

Hi y'all,

Alex Akselrod and I would like to propose a new light client BIP for
consideration:
   * https://github.com/Roasbeef/bips/blob/master/gcs_light_client.mediawiki

This BIP proposal describes a concrete specification (along with a
reference implementations[1][2][3]) for the much discussed client-side
filtering reversal of BIP-37. The precise details are described in the
BIP, but as a summary: we've implemented a new light-client mode that uses
client-side filtering based off of Golomb-Rice coded sets. Full-nodes
maintain an additional index of the chain, and serve this compact filter
(the index) to light clients which request them. Light clients then fetch
these filters, query the locally and _maybe_ fetch the block if a relevant
item matches. The cool part is that blocks can be fetched from _any_
source, once the light client deems it necessary. Our primary motivation
for this work was enabling a light client mode for lnd[4] in order to
support a more light-weight back end paving the way for the usage of
Lightning on mobile phones and other devices. We've integrated neutrino
as a back end for lnd, and will be making the updated code public very
soon.

One specific area we'd like feedback on is the parameter selection. Unlike
BIP-37 which allows clients to dynamically tune their false positive rate,
our proposal uses a _fixed_ false-positive. Within the document, it's
currently specified as P = 1/2^20. We've done a bit of analysis and
optimization attempting to optimize the following sum:
filter_download_bandwidth + expected_block_false_positive_bandwidth. Alex
has made a JS calculator that allows y'all to explore the affect of
tweaking the false positive rate in addition to the following variables:
the number of items the wallet is scanning for, the size of the blocks,
number of blocks fetched, and the size of the filters themselves. The
calculator calculates the expected bandwidth utilization using the CDF of
the Geometric Distribution. The calculator can be found here:
https://aakselrod.github.io/gcs_calc.html. Alex also has an empirical
script he's been running on actual data, and the results seem to match up
rather nicely.

We we're excited to see that Karl Johan Alm (kallewoof) has done some
(rather extensive!) analysis of his own, focusing on a distinct encoding
type [5]. I haven't had the time yet to dig into his report yet, but I
think I've read enough to extract the key difference in our encodings: his
filters use a binomial encoding _directly_ on the filter contents, will we
instead create a Golomb-Coded set with the contents being _hashes_ (we use
siphash) of the filter items.

Using a fixed fp=20, I have some stats detailing the total index size, as
well as averages for both mainnet and testnet. For mainnet, using the
filter contents as currently described in the BIP (basic + extended), the
total size of the index comes out to 6.9GB. The break down is as follows:

    * total size:  6976047156
    * total avg:  14997.220622758816
    * total median:  3801
    * total max:  79155
    * regular size:  3117183743
    * regular avg:  6701.372750217131
    * regular median:  1734
    * regular max:  67533
    * extended size:  3858863413
    * extended avg:  8295.847872541684
    * extended median:  2041
    * extended max:  52508

In order to consider the average+median filter sizes in a world worth
larger blocks, I also ran the index for testnet:

    * total size:  2753238530
    * total avg:  5918.95736054141
    * total median:  60202
    * total max:  74983
    * regular size:  1165148878
    * regular avg:  2504.856172982827
    * regular median:  24812
    * regular max:  64554
    * extended size:  1588089652
    * extended avg:  3414.1011875585823
    * extended median:  35260
    * extended max:  41731

Finally, here are the testnet stats which take into account the increase
in the maximum filter size due to segwit's block-size increase. The max
filter sizes are a bit larger due to some of the habitual blocks I
created last year when testing segwit (transactions with 30k inputs, 30k
outputs, etc).

     * total size:  585087597
     * total avg:  520.8839608674402
     * total median:  20
     * total max:  164598
     * regular size:  299325029
     * regular avg:  266.4790836307566
     * regular median:  13
     * regular max:  164583
     * extended size:  285762568
     * extended avg:  254.4048772366836
     * extended median:  7
     * extended max:  127631

For those that are interested in the raw data, I've uploaded a CSV file
of raw data for each block (mainnet + testnet), which can be found here:
     * mainnet: (14MB):
https://www.dropbox.com/s/4yk2u8dj06njbuv/mainnet-gcs-stats.csv?dl=0
     * testnet: (25MB):
https://www.dropbox.com/s/w7dmmcbocnmjfbo/gcs-stats-testnet.csv?dl=0


We look forward to getting feedback from all of y'all!

-- Laolu


[1]: https://github.com/lightninglabs/neutrino
[2]: https://github.com/Roasbeef/btcd/tree/segwit-cbf
[3]: https://github.com/Roasbeef/btcutil/tree/gcs/gcs
[4]: https://github.com/lightningnetwork/lnd/

-- Laolu

[-- Attachment #2: Type: text/html, Size: 7088 bytes --]

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2017-11-09 23:44 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-01 19:01 [bitcoin-dev] BIP Proposal: Compact Client Side Filtering for Light Clients Olaoluwa Osuntokun
2017-06-01 21:00 ` Eric Lombrozo
2017-06-01 21:33 ` Matt Corallo
2017-06-01 22:10   ` Olaoluwa Osuntokun
2017-06-02  2:15     ` Chris
2017-06-02  2:28       ` Gregory Maxwell
2017-06-02  3:35         ` Alex Akselrod
2017-06-02 16:07         ` Chris Pacia
2017-06-02  4:49 ` Olaoluwa Osuntokun
2017-06-09  3:59   ` Olaoluwa Osuntokun
2017-11-09 23:44     ` Olaoluwa Osuntokun
2017-06-02  6:00 ` Karl Johan Alm
     [not found]   ` <CAE0pnx+RRAP269VeWAcxKbrcS9qX4LS8_6nY_js8X5NtQ22t_A@mail.gmail.com>
     [not found]     ` <CAE0pnxLKYnwHnktTqW949s1AA9uK=6WnVYWmRoau8B1SszzYEg@mail.gmail.com>
     [not found]       ` <CAE0pnxJxHYQ4+2pt3tt=1WZ0-K0vDxGB4KBXY+R=WfktMmATwA@mail.gmail.com>
2017-06-02 17:55         ` Alex Akselrod
2017-06-05  2:06           ` Karl Johan Alm
2017-06-09  3:03             ` Olaoluwa Osuntokun
2017-06-07 21:41 ` Gregory Maxwell
2017-06-09  3:42   ` Olaoluwa Osuntokun
2017-06-09  4:47     ` Olaoluwa Osuntokun
2017-06-08  9:50 ` Tomas
2017-06-09  3:50   ` Olaoluwa Osuntokun
2017-06-09  8:26     ` Tomas
2017-06-19 11:58 ` Andreas Schildbach
2017-06-19 12:26   ` bfd
2017-06-19 15:15     ` Tom Zander
2017-06-19 15:49       ` Jonas Schnelli
2017-06-19 15:59         ` Andreas Schildbach
2017-06-19 16:22           ` Jonas Schnelli
2017-06-19 16:36             ` adiabat
2017-06-19 20:49               ` Andreas Schildbach
2017-06-20  7:03             ` Eric Voskuil
2017-06-19 16:07         ` Tom Zander
2017-06-19 16:30           ` Jonas Schnelli
2017-06-19 16:38             ` Tom Zander
2017-06-19 15:43     ` Andreas Schildbach
2017-06-19 16:10       ` Jonas Schnelli
2017-06-19 22:41     ` Gregory Maxwell
2017-06-20  9:52       ` Tom Zander
2017-06-20 13:08         ` bfd
2017-06-20 17:20           ` Adam Back

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox