From: Peter Tschipper <peter.tschipper@gmail.com>
To: bitcoin-dev@lists.linuxfoundation.org
Subject: Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
Date: Wed, 11 Nov 2015 10:35:01 -0800
Message-ID: <56438A55.2010604@gmail.com>
In-Reply-To: <CADm_WcYAj9_r6tu8Be-U81LDwWvnv04PZJMmc-S4cY7+jxfzGw@mail.gmail.com>
Here are the latest compression results for the first 295,000
blocks, with compressionlevel=6. I think there are more than enough
datapoints for statistical significance.
The results are very similar to the previous test. Next I'll work on
comparing how much time is saved or lost when syncing the blockchain
compressed vs. uncompressed. Still, I think it's clear that serving
compressed blocks, at least historical blocks, will benefit those who
have bandwidth caps on their internet connections.
The proposal so far is fairly simple:
1) Compress blocks with some compression library: currently zlib, but I
can investigate other possibilities.
2) As a fallback, we need to advertise compression as a service. That
way we can turn off compression AND decompression completely if needed.
3) Do the compression at the datastream level in the code. CDataStream
is the obvious place.
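To make points 1) and 2) concrete, here is a minimal Python sketch rather than the actual C++ change. It assumes a one-byte flag in front of each payload and a service bit called NODE_COMPRESS; both are hypothetical names invented for this illustration and are not part of the proposal or of Bitcoin. The sender compresses with zlib at level 6 only when the peer advertises support, and falls back to the raw block if compression fails or does not shrink the data.

```python
import zlib

# Hypothetical names: NODE_COMPRESS, FLAG_RAW, FLAG_ZLIB, encode_block and
# decode_block are illustrative assumptions, not taken from the proposal.
NODE_COMPRESS = 1 << 30  # assumed service bit advertising compression support

FLAG_RAW = b"\x00"   # payload follows uncompressed
FLAG_ZLIB = b"\x01"  # payload follows as a zlib stream


def encode_block(raw: bytes, peer_services: int, level: int = 6) -> bytes:
    """Return a one-byte flag followed by the (possibly compressed) block."""
    if peer_services & NODE_COMPRESS:
        try:
            packed = zlib.compress(raw, level)
        except zlib.error:
            packed = None  # fall back to the uncompressed path
        # Only send the compressed form when it actually saves bytes.
        if packed is not None and len(packed) < len(raw):
            return FLAG_ZLIB + packed
    return FLAG_RAW + raw


def decode_block(message: bytes) -> bytes:
    """Invert encode_block; raises on an unknown flag or corrupt stream."""
    flag, payload = message[:1], message[1:]
    if flag == FLAG_ZLIB:
        return zlib.decompress(payload)
    if flag == FLAG_RAW:
        return payload
    raise ValueError("unknown payload flag")
```

A peer that does not set the assumed service bit always receives FLAG_RAW payloads, which is one way compression could be switched off entirely.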
Test Results:
range = block size range
ubytes = average size of uncompressed blocks, in bytes
cbytes = average size of compressed blocks, in bytes
ctime = average time to compress
dtime = average time to decompress
cmp_ratio% = compression ratio, as the percentage reduction in size
datapoints = number of datapoints taken
range         ubytes   cbytes   ctime   dtime  cmp_ratio%  datapoints
0-250b           215      189   0.001   0.000       12.40       91280
250-500b         438      404   0.001   0.000        7.85       13217
500-1KB          761      701   0.001   0.000        7.86       11434
1KB-10KB        4149     3547   0.001   0.000       14.51       52180
10KB-100KB     41934    32604   0.005   0.001       22.25       82890
100KB-200KB   146303   108080   0.016   0.001       26.13       29886
200KB-300KB   243299   179281   0.025   0.002       26.31       25066
300KB-400KB   344636   266177   0.036   0.003       22.77        4956
400KB-500KB   463201   356862   0.046   0.004       22.96        3167
500KB-600KB   545123   429854   0.056   0.005       21.15         366
600KB-700KB   647736   510931   0.065   0.006       21.12         254
700KB-800KB   746540   587287   0.073   0.008       21.33         294
800KB-900KB   868121   682650   0.087   0.008       21.36         199
900KB-1MB     945747   726307   0.091   0.010       23.20         304
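As a rough cross-check of the table above, the following Python sketch recomputes the percentage reduction per range from the averaged ubytes and cbytes, and estimates an overall byte-weighted saving across all datapoints. The function names are hypothetical, and it assumes cmp_ratio% means 100 * (1 - cbytes/ubytes); the recomputed per-row values can differ slightly from the reported column, presumably because the inputs here are rounded averages.

```python
# (range, ubytes, cbytes, datapoints) copied from the table above.
ROWS = [
    ("0-250b", 215, 189, 91280),
    ("250-500b", 438, 404, 13217),
    ("500-1KB", 761, 701, 11434),
    ("1KB-10KB", 4149, 3547, 52180),
    ("10KB-100KB", 41934, 32604, 82890),
    ("100KB-200KB", 146303, 108080, 29886),
    ("200KB-300KB", 243299, 179281, 25066),
    ("300KB-400KB", 344636, 266177, 4956),
    ("400KB-500KB", 463201, 356862, 3167),
    ("500KB-600KB", 545123, 429854, 366),
    ("600KB-700KB", 647736, 510931, 254),
    ("700KB-800KB", 746540, 587287, 294),
    ("800KB-900KB", 868121, 682650, 199),
    ("900KB-1MB", 945747, 726307, 304),
]


def savings_pct(ubytes: float, cbytes: float) -> float:
    """Percentage reduction in size (assumed meaning of cmp_ratio%)."""
    return 100.0 * (1.0 - cbytes / ubytes)


def overall_savings_pct(rows) -> float:
    """Byte-weighted saving: total bytes avoided over total bytes sent."""
    total_u = sum(u * n for _, u, _, n in rows)
    total_c = sum(c * n for _, _, c, n in rows)
    return savings_pct(total_u, total_c)


for name, u, c, n in ROWS:
    print(f"{name:<12} {savings_pct(u, c):6.2f}%")
print(f"{'overall':<12} {overall_savings_pct(ROWS):6.2f}%")
```

Weighting by bytes rather than by block count matters here: the smallest ranges have many datapoints but contribute very little of the total data transferred.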
On 10/11/2015 8:46 AM, Jeff Garzik via bitcoin-dev wrote:
> Comments:
>
> 1) cblock seems a reasonable way to extend the protocol. Further
> wrapping should probably be done at the stream level.
>
> 2) zlib has crappy security track record.
>
> 3) A fallback path to non-compressed is required, should compression
> fail or crash.
>
> 4) Most blocks and transactions have runs of zeroes and/or highly
> common bit-patterns, which contributes to useful compression even at
> smaller sizes. Peter Ts's most recent numbers bear this out. zlib
> has a dictionary (32K?) which works well with repeated patterns such
> as those you see with concatenated runs of transactions.
>
> 5) LZO should provide much better compression, at a cost of CPU
> performance and using a less-reviewed, less-field-tested library.
>
> On Tue, Nov 10, 2015 at 11:30 AM, Tier Nolan via bitcoin-dev
> <bitcoin-dev@lists.linuxfoundation.org> wrote:
>
> On Tue, Nov 10, 2015 at 4:11 PM, Peter Tschipper
> <peter.tschipper@gmail.com> wrote:
>
> There are better ways of sending new blocks, that's certainly
> true, but for sending historical blocks and sending transactions
> I don't think so. This PR is really designed to save
> bandwidth and is not intended to be a huge performance
> improvement in terms of time spent sending.
>
>
> If the main point is for historical data, then sticking to just
> blocks is the best plan.
>
> Since small blocks don't compress well, you could define a
> "cblocks" message that handles multiple blocks (just concatenate
> the block messages as payload before compression).
>
> The sending peer could combine blocks so that each cblock is
> compressing at least 10kB of block data (or whatever is optimal).
> It is probably worth specifying a maximum size for network buffer
> reasons (either 1MB or 1 block maximum).
>
> Similarly, transactions could be combined together and compressed
> as "ctxs". The inv messages could be modified so that you can
> request groups of 10-20 transactions. That would depend on how
> much of an improvement compressed transactions would represent.
>
> More generally, you could define a message which is a compressed
> message holder. That is probably too complex to be worth the
> effort though.
>
>
>
>>
>> On Tue, Nov 10, 2015 at 5:40 AM, Johnathan Corgan via
>> bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
>>
>> On Mon, Nov 9, 2015 at 5:58 PM, gladoscc via bitcoin-dev
>> <bitcoin-dev@lists.linuxfoundation.org> wrote:
>>
>>
>> I think 25% bandwidth savings is certainly
>> considerable, especially for people running full
>> nodes in countries like Australia where internet
>> bandwidth is lower and there are data caps.
>>
>>
>> This reinforces the idea that such trade-off decisions
>> should be local and negotiated between peers, not a
>> required feature of the network P2P.
>>
>>
>> --
>> Johnathan Corgan
>> Corgan Labs - SDR Training and Development Services
>> http://corganlabs.com
>>
>> _______________________________________________
>> bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev