Thanks for sending this proposal! I look forward to having a great discussion around this.
Hi y'all,

Alex Akselrod and I would like to propose a new light client BIP for consideration:

This BIP proposal describes a concrete specification (along with reference implementations [1][2][3]) for the much-discussed client-side filtering reversal of BIP-37. The precise details are described in the BIP, but as a summary: we've implemented a new light-client mode that uses client-side filtering based on Golomb-Rice coded sets. Full nodes maintain an additional index of the chain, and serve these compact filters (the index) to light clients which request them. Light clients then fetch these filters, query them locally, and _maybe_ fetch the block if a relevant item matches. The cool part is that blocks can be fetched from _any_ source, once the light client deems it necessary. Our primary motivation for this work was enabling a light client mode for lnd [4] in order to support a more lightweight back end, paving the way for the usage of Lightning on mobile phones and other devices. We've integrated neutrino as a back end for lnd, and will be making the updated code public very soon.

One specific area we'd like feedback on is the parameter selection. Unlike BIP-37, which allows clients to dynamically tune their false-positive rate, our proposal uses a _fixed_ false-positive rate. Within the document, it's currently specified as P = 1/2^20. We've done a bit of analysis and optimization attempting to minimize the following sum: filter_download_bandwidth + expected_block_false_positive_bandwidth.

Alex has made a JS calculator that allows y'all to explore the effect of tweaking the false-positive rate in addition to the following variables: the number of items the wallet is scanning for, the size of the blocks, the number of blocks fetched, and the size of the filters themselves. The calculator computes the expected bandwidth utilization using the CDF of the Geometric Distribution. The calculator can be found here: https://aakselrod.github.io/gcs_calc.html .
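
For anyone who'd rather poke at the trade-off in a script, here is a minimal Python sketch of the sum described above. It is not the calculator's actual code. The function and parameter names are hypothetical, and it assumes that each watched item independently produces a false match against a block's filter with probability P, so the chance that a block gets fetched needlessly is the geometric CDF 1 - (1 - P)^n for n watched items. Blocks that are genuinely relevant are ignored, and the filter size is taken as an input rather than derived from P.

```python
import math


def block_false_positive_probability(fp_rate: float, watched_items: int) -> float:
    """Chance that at least one watched item falsely matches one block's filter.

    This is the geometric-distribution CDF, 1 - (1 - p)^n, evaluated with
    log1p/expm1 so that tiny rates such as 2**-20 keep their precision.
    Assumption: every query is an independent trial with probability fp_rate.
    """
    if not 0.0 <= fp_rate < 1.0:
        raise ValueError("fp_rate must be in [0, 1)")
    if watched_items < 0:
        raise ValueError("watched_items must be non-negative")
    return -math.expm1(watched_items * math.log1p(-fp_rate))


def expected_bandwidth(
    num_blocks: int,
    avg_filter_bytes: float,
    avg_block_bytes: float,
    watched_items: int,
    fp_rate: float,
) -> float:
    """filter_download_bandwidth + expected_block_false_positive_bandwidth, in bytes."""
    # Every block costs one filter download.
    filter_download = num_blocks * avg_filter_bytes
    # Expected number of blocks fetched only because of a false match.
    false_positive_blocks = num_blocks * block_false_positive_probability(
        fp_rate, watched_items
    )
    return filter_download + false_positive_blocks * avg_block_bytes


if __name__ == "__main__":
    # Hypothetical inputs, chosen only for illustration.
    total = expected_bandwidth(
        num_blocks=1000,
        avg_filter_bytes=15_000,
        avg_block_bytes=1_000_000,
        watched_items=100,
        fp_rate=2.0 ** -20,
    )
    print(f"expected download: {total / 1e6:.2f} MB")
```

Keep in mind that in practice the two terms pull against each other: a smaller P means fewer needless block fetches but larger filters, so avg_filter_bytes should be re-measured for each P you try rather than held constant.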
Alex also has an empirical script he's been running on actual data, and the results seem to match up rather nicely.

We were excited to see that Karl Johan Alm (kallewoof) has done some (rather extensive!) analysis of his own, focusing on a distinct encoding type [5]. I haven't had the time to dig into his report yet, but I think I've read enough to extract the key difference in our encodings: his filters use a binomial encoding _directly_ on the filter contents, while we instead create a Golomb-coded set with the contents being _hashes_ (we use siphash) of the filter items.

Using a fixed fp=20, I have some stats detailing the total index size, as well as averages for both mainnet and testnet. For mainnet, using the filter contents as currently described in the BIP (basic + extended), the total size of the index comes out to 6.9GB. The breakdown is as follows:

* total size: 6976047156
* total avg: 14997.220622758816
* total median: 3801
* total max: 79155
* regular size: 3117183743
* regular avg: 6701.372750217131
* regular median: 1734
* regular max: 67533
* extended size: 3858863413
* extended avg: 8295.847872541684
* extended median: 2041
* extended max: 52508

In order to consider the average+median filter sizes in a world with larger blocks, I also ran the index for testnet:

* total size: 2753238530
* total avg: 5918.95736054141
* total median: 60202
* total max: 74983
* regular size: 1165148878
* regular avg: 2504.856172982827
* regular median: 24812
* regular max: 64554
* extended size: 1588089652
* extended avg: 3414.1011875585823
* extended median: 35260
* extended max: 41731

Finally, here are the testnet stats which take into account the increase in the maximum filter size due to segwit's block-size increase.
The max filter sizes are a bit larger due to some of the habitual blocks I created last year when testing segwit (transactions with 30k inputs, 30k outputs, etc.).

* total size: 585087597
* total avg: 520.8839608674402
* total median: 20
* total max: 164598
* regular size: 299325029
* regular avg: 266.4790836307566
* regular median: 13
* regular max: 164583
* extended size: 285762568
* extended avg: 254.4048772366836
* extended median: 7
* extended max: 127631

For those that are interested in the raw data, I've uploaded a CSV file of raw data for each block (mainnet + testnet), which can be found here:

* mainnet (14MB): https://www.dropbox.com/s/4yk2u8dj06njbuv/mainnet-gcs-stats.csv?dl=0
* testnet (25MB): https://www.dropbox.com/s/w7dmmcbocnmjfbo/gcs-stats-testnet.csv?dl=0

We look forward to getting feedback from all of y'all!

-- Laolu