From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <karljohan-alm@garage.co.jp>
Received: from smtp1.linuxfoundation.org (smtp1.linux-foundation.org
	[172.17.192.35])
	by mail.linuxfoundation.org (Postfix) with ESMTPS id 38FD1901
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Mon,  5 Jun 2017 02:07:11 +0000 (UTC)
X-Greylist: from auto-whitelisted by SQLgrey-1.7.6
Received: from mo.garage.hdemail.jp (mo.garage.hdemail.jp [46.51.242.127])
	by smtp1.linuxfoundation.org (Postfix) with ESMTPS id 48BFC196
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Mon,  5 Jun 2017 02:07:10 +0000 (UTC)
Received: from ip-10-217-1-36.ap-northeast-1.compute.internal
	(localhost.localdomain [127.0.0.1])
	by mo.garage.hdemail.jp (hde-mf-postfix) with SMTP id 6F39D14C0C5
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Mon,  5 Jun 2017 11:07:06 +0900 (JST)
	(envelope-from karljohan-alm@garage.co.jp)
X-Received: from unknown (HELO mo.garage.hdemail.jp) (127.0.0.1)
	by 0 with SMTP; 5 Jun 2017 11:07:02 +0900
X-Received: from mo.garage.hdemail.jp (localhost.localdomain [127.0.0.1])
	by mo.garage.hdemail.jp (hde-ma-postfix) with ESMTP id D4A7E4C085
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Mon,  5 Jun 2017 11:07:01 +0900 (JST)
	(envelope-from karljohan-alm@garage.co.jp)
Received: from gw27.oz.hdemail.jp
	(ip-10-127-141-142.ap-northeast-1.compute.internal [10.127.141.142])
	by mo.garage.hdemail.jp (hde-mf-postfix) with ESMTP id 4738014C0C1
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Mon,  5 Jun 2017 11:07:00 +0900 (JST)
	(envelope-from karljohan-alm@garage.co.jp)
X-Durian-MailFrom: karljohan-alm@garage.co.jp
X-Durian-RcptTo: bitcoin-dev@lists.linuxfoundation.org
Received: from gw27.oz.hdemail.jp (gw27.oz.hdemail.jp [127.0.0.1])
	by gw27.oz.hdemail.jp (gw27.oz.hdemail.jp [127.0.0.1]);
	Mon, 5 Jun 2017 11:06:55 +0900
X-Received: from mail-qt0-f198.google.com (lb1.oz.lo.hdemail.jp
	[54.248.222.53])
	(using TLSv1 with cipher AES128-SHA (128/128 bits))
	(No client certificate requested)
	by gw27.oz.hdemail.jp (Postfix) with ESMTP id F41D4148C117
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Mon,  5 Jun 2017 11:06:54 +0900 (JST)
X-Received: by mail-qt0-f198.google.com with SMTP id y31so55990259qty.7
	for <bitcoin-dev@lists.linuxfoundation.org>;
	Sun, 04 Jun 2017 19:06:54 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
	d=1e100.net; s=20161025;
	h=x-gm-message-state:mime-version:in-reply-to:references:from:date
	:message-id:subject:to:cc;
	bh=AE7yjGjMeJqgLJiTA4DPZJ1lOuLnHBLRHO68oi5mjII=;
	b=MCVbP+2sxVSXtGxUbnDCGmSm2130TOL1Jwg/F1FIUCRMM88wgn7Y4basF3650UPbct
	XZ73kkBrsHH3fBtEQ6rrCQrLOeCiyvZJGbnr1WrYmE/vf+gUia8nvp/spNSP+ANQOnU3
	Y/ucr1vqVZo45/jAk5E+fOriJeLSKT+aS1+RbZq+GILZo9XFy7Ophc2J0pkAzyM0noDG
	KIm0feTWFFOt7WkSBk87wNsYqa2xzf86DlWGKlA7zVLIj2UtRdpMlDlNzee+Yy0eb+79
	iX9r6k56SMOpjGARX4ZTSxqxa87hh/fX/NjpnXKxzj8EGq8MaQAMeokKdi3uc7zdKiXf
	TC7Q==
X-Gm-Message-State: AKS2vOzzwEhw3nqGZ7fQK0joI0MwlBxx5sznlssbqhZpMcKuPSYNf0EV
	IX2lUAl1ApuDqcoJN2YxouMRWNtfmqs1tBuBipmmDbmVDHWm0i8Jqmn1EnzcwzTqXTrVO+GUp3l
	AoS9b/7fUP7n4hXu50AMaSXRH0xEu7oDPD9dPnahnhmUSRg==
X-Received: by 10.200.4.162 with SMTP id s34mr22716993qtg.35.1496628412855;
	Sun, 04 Jun 2017 19:06:52 -0700 (PDT)
X-Received: by 10.200.4.162 with SMTP id s34mr22716973qtg.35.1496628412633;
	Sun, 04 Jun 2017 19:06:52 -0700 (PDT)
MIME-Version: 1.0
X-Received: by 10.12.146.7 with HTTP; Sun, 4 Jun 2017 19:06:32 -0700 (PDT)
In-Reply-To: <CAE0pnxK5r2XfVks=emkK=v66XRN5c-Sz-Lm_dKY+6nO=kPk6Vw@mail.gmail.com>
References: <CAO3Pvs8ccTkgrecJG6KFbBW+9moHF-FTU+4qNfayeE3hM9uRrg@mail.gmail.com>
	<CALJw2w5gUgbdX7XnxPsK2FZ6PZ5cSTgmCEqiPu7-S4gwXBM-_Q@mail.gmail.com>
	<CAE0pnx+RRAP269VeWAcxKbrcS9qX4LS8_6nY_js8X5NtQ22t_A@mail.gmail.com>
	<CAE0pnxLKYnwHnktTqW949s1AA9uK=6WnVYWmRoau8B1SszzYEg@mail.gmail.com>
	<CAE0pnxJxHYQ4+2pt3tt=1WZ0-K0vDxGB4KBXY+R=WfktMmATwA@mail.gmail.com>
	<CAE0pnxK5r2XfVks=emkK=v66XRN5c-Sz-Lm_dKY+6nO=kPk6Vw@mail.gmail.com>
From: Karl Johan Alm <karljohan-alm@garage.co.jp>
Date: Mon, 5 Jun 2017 11:06:32 +0900
Message-ID: <CALJw2w6Vzq8PO3x607=ERK4XKU2vrHApqKP2rWm-sw2r1ZOJMw@mail.gmail.com>
To: Alex Akselrod <alex@akselrod.org>
Content-Type: text/plain; charset="UTF-8"
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham
	version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	smtp1.linux-foundation.org
X-Mailman-Approved-At: Mon, 05 Jun 2017 02:17:45 +0000
Cc: Bitcoin Dev <bitcoin-dev@lists.linuxfoundation.org>
Subject: Re: [bitcoin-dev] BIP Proposal: Compact Client Side Filtering for
 Light Clients
X-BeenThere: bitcoin-dev@lists.linuxfoundation.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Bitcoin Protocol Discussion <bitcoin-dev.lists.linuxfoundation.org>
List-Unsubscribe: <https://lists.linuxfoundation.org/mailman/options/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=unsubscribe>
List-Archive: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/>
List-Post: <mailto:bitcoin-dev@lists.linuxfoundation.org>
List-Help: <mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=help>
List-Subscribe: <https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev>,
	<mailto:bitcoin-dev-request@lists.linuxfoundation.org?subject=subscribe>
X-List-Received-Date: Mon, 05 Jun 2017 02:07:11 -0000

On Sat, Jun 3, 2017 at 2:55 AM, Alex Akselrod via bitcoin-dev
<bitcoin-dev@lists.linuxfoundation.org> wrote:
> Without a soft fork, this is the only way for light clients to verify that
> peers aren't lying to them. Clients can request headers (just hashes of the
> filters and the previous headers, creating a chain) and look for conflicts
> between peers. If a conflict is found at a certain block, the client can
> download the block, generate a filter, calculate the header by hashing
> together the previous header and the generated filter, and banning any peers
> that don't match. A full node could prune old filters if you wanted and
> recalculate them as necessary if you just keep the filter header chain info
> as really old filters are unlikely to be requested by correctly written
> software but you can't guarantee every client will follow best practices
> either.

Ahh, so you actually make a separate digest chain with prev hashes and
everything. Once/if committed digests are soft forked in, it seems a
bit overkill but maybe it's worth it. (I was always assuming committed
digests in coinbase would come after people started using this, and
that people could just ask a couple of random peers for the digest
hash and ensure everyone gave the same answer as the hash of the
downloaded digest..).

> The simulations are based on completely random data within given parameters.

I noticed an increase in FP hits when using real data sampled from
real scriptPubKeys and such. Address reuse and other weird stuff. See
"lies.h" in github repo for experiments and chainsim.c initial part of
main where wallets get random stuff from the chain.

> I will definitely try to reproduce my experiments with Golomb-Coded
> sets and see what I come up with. It seems like you've got a little
> less than half the size of my digests for 1-block digests but I
> haven't tried making digests for all blocks (and lots of early blocks
> are empty).
>
>
> Filters for empty blocks only take a few bytes and sometimes zero when the
> coinbase output is a burn that doesn't push any data (example will be in the
> test vectors that I'll have ready shortly).

I created digests for all blocks up until block #469805 and actually
ended up with 5.8 GB, which is 1.1 GB lower than what you have, but
may be worse perf-wise on false positive rates and such.

> How fast are these to create? Would it make sense to provide digests
> on demand in some cases, rather than keeping them around indefinitely?
>
>
> They're pretty fast and can be pruned if desired, as mentioned above, as
> long as the header chain is kept.

For comparison, creating the digests above (469805 of them) took
roughly 30 mins on my end, but using the kstats format so probably
higher on an actual node (should get around to profiling that...).