* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-10 16:46 ` Jeff Garzik
@ 2015-11-10 17:09 ` Peter Tschipper
2015-11-11 18:35 ` Peter Tschipper
2015-11-28 14:48 ` [bitcoin-dev] further test results for : "Datastream Compression of Blocks and Tx's" Peter Tschipper
2 siblings, 0 replies; 21+ messages in thread
From: Peter Tschipper @ 2015-11-10 17:09 UTC (permalink / raw)
To: bitcoin-dev
[-- Attachment #1: Type: text/plain, Size: 4893 bytes --]
On 10/11/2015 8:46 AM, Jeff Garzik via bitcoin-dev wrote:
> Comments:
>
> 1) cblock seems a reasonable way to extend the protocol. Further
> wrapping should probably be done at the stream level.
agreed.
>
> 2) zlib has a crappy security track record.
>
Zlib had a bad buffer overflow bug, but that was in 2005 and it got a lot
of press at the time. It was fixed in version 1.2.3...we're on 1.2.8
now. I'm not aware of any other current issues with zlib. Do you have a
citation?
> 3) A fallback path to non-compressed is required, should compression
> fail or crash.
agreed.
>
> 4) Most blocks and transactions have runs of zeroes and/or highly
> common bit-patterns, which contributes to useful compression even at
> smaller sizes. Peter Ts's most recent numbers bear this out. zlib
> has a dictionary (32K?) which works well with repeated patterns such
> as those you see with concatenated runs of transactions.
>
> 5) LZO should provide much better compression, at a cost of CPU
> performance and using a less-reviewed, less-field-tested library.
I don't think LZO will give as good compression here but I will do some
benchmarking when I can.
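For the zlib side of that benchmark, a rough sketch follows (LZO would need a third-party binding such as python-lzo, so only zlib levels are compared; the payload is synthetic and merely illustrative):

```python
import time
import zlib

# Synthetic block-like payload: repeated tx-ish byte patterns, the kind of
# redundancy Jeff describes in point 4 above.
payload = (b"\x01\x00\x00\x00" + b"\x00" * 32 + b"deadbeef" * 8) * 200

for level in (1, 6, 9):
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    ratio = (1 - len(compressed) / len(payload)) * 100
    print(f"level={level} ratio={ratio:.2f}% time={elapsed * 1000:.3f} ms")
    assert zlib.decompress(compressed) == payload  # round trip is lossless
```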
>
>
>
>
>
> On Tue, Nov 10, 2015 at 11:30 AM, Tier Nolan via bitcoin-dev
> <bitcoin-dev@lists.linuxfoundation.org
> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>
>
>
> On Tue, Nov 10, 2015 at 4:11 PM, Peter Tschipper
> <peter.tschipper@gmail.com <mailto:peter.tschipper@gmail.com>> wrote:
>
> There are better ways of sending new blocks, that's certainly
> true but for sending historical blocks and sending transactions
> I don't think so. This PR is really designed to save
> bandwidth and not intended to be a huge performance
> improvement in terms of time spent sending.
>
>
> If the main point is for historical data, then sticking to just
> blocks is the best plan.
>
> Since small blocks don't compress well, you could define a
> "cblocks" message that handles multiple blocks (just concatenate
> the block messages as payload before compression).
>
> The sending peer could combine blocks so that each cblock is
> compressing at least 10kB of block data (or whatever is optimal).
> It is probably worth specifying a maximum size for network buffer
> reasons (either 1MB or 1 block maximum).
>
> Similarly, transactions could be combined together and compressed
> "ctxs". The inv messages could be modified so that you can
> request groups of 10-20 transactions. That would depend on how
> much of an improvement compressed transactions would represent.
>
> More generally, you could define a message which is a compressed
> message holder. That is probably too complex to be worth the
> effort though.
>
>
>
>>
>> On Tue, Nov 10, 2015 at 5:40 AM, Johnathan Corgan via
>> bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>>
>> On Mon, Nov 9, 2015 at 5:58 PM, gladoscc via bitcoin-dev
>> <bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>>
>>
>> I think 25% bandwidth savings is certainly
>> considerable, especially for people running full
>> nodes in countries like Australia where internet
>> bandwidth is lower and there are data caps.
>>
>>
>> This reinforces the idea that such trade-off decisions
>> should be local and negotiated between peers, not a
>> required feature of the network P2P.
>>
>>
>> --
>> Johnathan Corgan
>> Corgan Labs - SDR Training and Development Services
>> http://corganlabs.com
>>
>> _______________________________________________
>> bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>>
>>
>>
>>
>
>
>
>
>
>
>
[-- Attachment #2: Type: text/html, Size: 15022 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-10 16:46 ` Jeff Garzik
2015-11-10 17:09 ` Peter Tschipper
@ 2015-11-11 18:35 ` Peter Tschipper
2015-11-11 18:49 ` Marco Pontello
2015-11-28 14:48 ` [bitcoin-dev] further test results for : "Datastream Compression of Blocks and Tx's" Peter Tschipper
2 siblings, 1 reply; 21+ messages in thread
From: Peter Tschipper @ 2015-11-11 18:35 UTC (permalink / raw)
To: bitcoin-dev
[-- Attachment #1: Type: text/plain, Size: 6772 bytes --]
Here are the latest results on compression ratios for the first 295,000
blocks, compressionlevel=6. I think there are more than enough
datapoints for statistical significance.
Results are very similar to the previous test. I'll work on getting a
comparison of how much time is saved (or lost) when syncing the
blockchain, compressed vs uncompressed. Still, I think it's clear that
serving up compressed blocks, at least historical blocks, will be of
benefit for those that have bandwidth caps on their internet
connections.
The proposal, so far, is fairly simple:
1) compress blocks with some compression library: currently zlib but I
can investigate other possibilities
2) As a fallback we need to advertise compression as a service. That
way we can turn off compression AND decompression completely if needed.
3) Do the compression at the datastream level in the code. CDataStream
is the obvious place.
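A minimal sketch of items 1-3 (function names are hypothetical; in the real code the change would live at the CDataStream level, and the service-bit negotiation of item 2 is not shown):

```python
import zlib

COMPRESSION_LEVEL = 6  # the level used for the test results below

def maybe_compress(payload: bytes) -> tuple[bool, bytes]:
    """Compress a serialized message; fall back to the raw bytes if
    compression fails or would expand the payload (item 3)."""
    try:
        out = zlib.compress(payload, COMPRESSION_LEVEL)
    except zlib.error:
        return False, payload
    if len(out) >= len(payload):  # tiny payloads often don't shrink
        return False, payload
    return True, out

def maybe_decompress(compressed: bool, data: bytes) -> bytes:
    return zlib.decompress(data) if compressed else data
```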
Test Results:
range = block size range
ubytes = average size of uncompressed blocks
cbytes = average size of compressed blocks
ctime = average time to compress
dtime = average time to decompress
cmp_ratio% = compression ratio
datapoints = number of datapoints taken
range        ubytes  cbytes  ctime  dtime  cmp_ratio%  datapoints
0-250b          215     189  0.001  0.000      12.40        91280
250-500b        438     404  0.001  0.000       7.85        13217
500-1KB         761     701  0.001  0.000       7.86        11434
1KB-10KB       4149    3547  0.001  0.000      14.51        52180
10KB-100KB    41934   32604  0.005  0.001      22.25        82890
100KB-200KB  146303  108080  0.016  0.001      26.13        29886
200KB-300KB  243299  179281  0.025  0.002      26.31        25066
300KB-400KB  344636  266177  0.036  0.003      22.77         4956
400KB-500KB  463201  356862  0.046  0.004      22.96         3167
500KB-600KB  545123  429854  0.056  0.005      21.15          366
600KB-700KB  647736  510931  0.065  0.006      21.12          254
700KB-800KB  746540  587287  0.073  0.008      21.33          294
800KB-900KB  868121  682650  0.087  0.008      21.36          199
900KB-1MB    945747  726307  0.091  0.010      23.20          304
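As a sanity check on the table, the cmp_ratio% column matches (1 - cbytes/ubytes) * 100 for the larger buckets (the smallest buckets differ slightly, presumably because the averaged byte counts are rounded):

```python
def cmp_ratio(ubytes: float, cbytes: float) -> float:
    """Percent of bandwidth saved: (1 - compressed/uncompressed) * 100."""
    return (1 - cbytes / ubytes) * 100

# Two rows from the table above:
print(round(cmp_ratio(41934, 32604), 2))    # 22.25 (10KB-100KB bucket)
print(round(cmp_ratio(146303, 108080), 2))  # 26.13 (100KB-200KB bucket)
```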
On 10/11/2015 8:46 AM, Jeff Garzik via bitcoin-dev wrote:
> Comments:
>
> 1) cblock seems a reasonable way to extend the protocol. Further
> wrapping should probably be done at the stream level.
>
> 2) zlib has a crappy security track record.
>
> 3) A fallback path to non-compressed is required, should compression
> fail or crash.
>
> 4) Most blocks and transactions have runs of zeroes and/or highly
> common bit-patterns, which contributes to useful compression even at
> smaller sizes. Peter Ts's most recent numbers bear this out. zlib
> has a dictionary (32K?) which works well with repeated patterns such
> as those you see with concatenated runs of transactions.
>
> 5) LZO should provide much better compression, at a cost of CPU
> performance and using a less-reviewed, less-field-tested library.
>
>
>
>
>
> On Tue, Nov 10, 2015 at 11:30 AM, Tier Nolan via bitcoin-dev
> <bitcoin-dev@lists.linuxfoundation.org
> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>
>
>
> On Tue, Nov 10, 2015 at 4:11 PM, Peter Tschipper
> <peter.tschipper@gmail.com <mailto:peter.tschipper@gmail.com>> wrote:
>
> There are better ways of sending new blocks, that's certainly
> true but for sending historical blocks and sending transactions
> I don't think so. This PR is really designed to save
> bandwidth and not intended to be a huge performance
> improvement in terms of time spent sending.
>
>
> If the main point is for historical data, then sticking to just
> blocks is the best plan.
>
> Since small blocks don't compress well, you could define a
> "cblocks" message that handles multiple blocks (just concatenate
> the block messages as payload before compression).
>
> The sending peer could combine blocks so that each cblock is
> compressing at least 10kB of block data (or whatever is optimal).
> It is probably worth specifying a maximum size for network buffer
> reasons (either 1MB or 1 block maximum).
>
> Similarly, transactions could be combined together and compressed
> "ctxs". The inv messages could be modified so that you can
> request groups of 10-20 transactions. That would depend on how
> much of an improvement compressed transactions would represent.
>
> More generally, you could define a message which is a compressed
> message holder. That is probably too complex to be worth the
> effort though.
>
>
>
>>
>> On Tue, Nov 10, 2015 at 5:40 AM, Johnathan Corgan via
>> bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>>
>> On Mon, Nov 9, 2015 at 5:58 PM, gladoscc via bitcoin-dev
>> <bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>>
>>
>> I think 25% bandwidth savings is certainly
>> considerable, especially for people running full
>> nodes in countries like Australia where internet
>> bandwidth is lower and there are data caps.
>>
>>
>> This reinforces the idea that such trade-off decisions
>> should be local and negotiated between peers, not a
>> required feature of the network P2P.
>>
>>
>> --
>> Johnathan Corgan
>> Corgan Labs - SDR Training and Development Services
>> http://corganlabs.com
>>
>>
>>
>>
>>
>
>
>
>
>
>
>
[-- Attachment #2: Type: text/html, Size: 16946 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-11 18:35 ` Peter Tschipper
@ 2015-11-11 18:49 ` Marco Pontello
2015-11-11 19:05 ` Jonathan Toomim
2015-11-11 19:11 ` [bitcoin-dev] request BIP number for: "Support for Datastream Compression" Peter Tschipper
0 siblings, 2 replies; 21+ messages in thread
From: Marco Pontello @ 2015-11-11 18:49 UTC (permalink / raw)
To: Peter Tschipper; +Cc: Bitcoin Dev
[-- Attachment #1: Type: text/plain, Size: 7383 bytes --]
A random thought: isn't most communication over a data link already
compressed at some point?
When I used a modem, we had the V.42bis protocol. Now nearly all ADSL
connections using PPPoE surely are, and so on.
I'm not sure another layer of generic, data-agnostic compression will
really give us a practical real-life advantage over that.
Something that could take advantage of special knowledge of the specific
data, instead, would be an entirely different matter.
Just my 2c.
On Wed, Nov 11, 2015 at 7:35 PM, Peter Tschipper via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:
> Here are the latest results on compression ratios for the first 295,000
> blocks, compressionlevel=6. I think there are more than enough datapoints
> for statistical significance.
>
> Results are very similar to the previous test. I'll work on getting a
> comparison of how much time is saved (or lost) when syncing the
> blockchain, compressed vs uncompressed. Still, I think it's clear that
> serving up compressed blocks, at least historical blocks, will be of
> benefit for those that have bandwidth caps on their internet
> connections.
>
> The proposal, so far is fairly simple:
> 1) compress blocks with some compression library: currently zlib but I can
> investigate other possibilities
> 2) As a fallback we need to advertise compression as a service. That way
> we can turn off compression AND decompression completely if needed.
> 3) Do the compression at the datastream level in the code. CDataStream is
> the obvious place.
>
>
> Test Results:
>
> range = block size range
> ubytes = average size of uncompressed blocks
> cbytes = average size of compressed blocks
> ctime = average time to compress
> dtime = average time to decompress
> cmp_ratio% = compression ratio
> datapoints = number of datapoints taken
>
> range        ubytes  cbytes  ctime  dtime  cmp_ratio%  datapoints
> 0-250b          215     189  0.001  0.000      12.40        91280
> 250-500b        438     404  0.001  0.000       7.85        13217
> 500-1KB         761     701  0.001  0.000       7.86        11434
> 1KB-10KB       4149    3547  0.001  0.000      14.51        52180
> 10KB-100KB    41934   32604  0.005  0.001      22.25        82890
> 100KB-200KB  146303  108080  0.016  0.001      26.13        29886
> 200KB-300KB  243299  179281  0.025  0.002      26.31        25066
> 300KB-400KB  344636  266177  0.036  0.003      22.77         4956
> 400KB-500KB  463201  356862  0.046  0.004      22.96         3167
> 500KB-600KB  545123  429854  0.056  0.005      21.15          366
> 600KB-700KB  647736  510931  0.065  0.006      21.12          254
> 700KB-800KB  746540  587287  0.073  0.008      21.33          294
> 800KB-900KB  868121  682650  0.087  0.008      21.36          199
> 900KB-1MB    945747  726307  0.091  0.010      23.20          304
>
> On 10/11/2015 8:46 AM, Jeff Garzik via bitcoin-dev wrote:
>
> Comments:
>
> 1) cblock seems a reasonable way to extend the protocol. Further wrapping
> should probably be done at the stream level.
>
> 2) zlib has a crappy security track record.
>
> 3) A fallback path to non-compressed is required, should compression fail
> or crash.
>
> 4) Most blocks and transactions have runs of zeroes and/or highly common
> bit-patterns, which contributes to useful compression even at smaller
> sizes. Peter Ts's most recent numbers bear this out. zlib has a
> dictionary (32K?) which works well with repeated patterns such as those you
> see with concatenated runs of transactions.
>
> 5) LZO should provide much better compression, at a cost of CPU
> performance and using a less-reviewed, less-field-tested library.
>
>
>
>
>
> On Tue, Nov 10, 2015 at 11:30 AM, Tier Nolan via bitcoin-dev <
> <bitcoin-dev@lists.linuxfoundation.org>
> bitcoin-dev@lists.linuxfoundation.org> wrote:
>
>>
>>
>> On Tue, Nov 10, 2015 at 4:11 PM, Peter Tschipper <
>> <peter.tschipper@gmail.com>peter.tschipper@gmail.com> wrote:
>>
>>> There are better ways of sending new blocks, that's certainly true but
>>> for sending historical blocks and sending transactions I don't think so.
>>> This PR is really designed to save bandwidth and not intended to be a huge
>>> performance improvement in terms of time spent sending.
>>>
>>
>> If the main point is for historical data, then sticking to just blocks is
>> the best plan.
>>
>> Since small blocks don't compress well, you could define a "cblocks"
>> message that handles multiple blocks (just concatenate the block messages
>> as payload before compression).
>>
>> The sending peer could combine blocks so that each cblock is compressing
>> at least 10kB of block data (or whatever is optimal). It is probably worth
>> specifying a maximum size for network buffer reasons (either 1MB or 1 block
>> maximum).
>>
>> Similarly, transactions could be combined together and compressed
>> "ctxs". The inv messages could be modified so that you can request groups
>> of 10-20 transactions. That would depend on how much of an improvement
>> compressed transactions would represent.
>>
>> More generally, you could define a message which is a compressed message
>> holder. That is probably too complex to be worth the effort though.
>>
>>
>>
>>>
>>> On Tue, Nov 10, 2015 at 5:40 AM, Johnathan Corgan via bitcoin-dev <
>>> <bitcoin-dev@lists.linuxfoundation.org>
>>> bitcoin-dev@lists.linuxfoundation.org> wrote:
>>>
>>>> On Mon, Nov 9, 2015 at 5:58 PM, gladoscc via bitcoin-dev <
>>>> <bitcoin-dev@lists.linuxfoundation.org>
>>>> bitcoin-dev@lists.linuxfoundation.org> wrote:
>>>>
>>>>
>>>>> I think 25% bandwidth savings is certainly considerable, especially
>>>>> for people running full nodes in countries like Australia where internet
>>>>> bandwidth is lower and there are data caps.
>>>>>
>>>>
>>>> This reinforces the idea that such trade-off decisions should be
>>>> local and negotiated between peers, not a required feature of the network
>>>> P2P.
>>>>
>>>>
>>>> --
>>>> Johnathan Corgan
>>>> Corgan Labs - SDR Training and Development Services
>>>> <http://corganlabs.com>http://corganlabs.com
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>
>
>
>
>
--
Try the Online TrID File Identifier
http://mark0.net/onlinetrid.aspx
[-- Attachment #2: Type: text/html, Size: 17296 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-11 18:49 ` Marco Pontello
@ 2015-11-11 19:05 ` Jonathan Toomim
2015-11-13 21:58 ` [bitcoin-dev] Block Compression (Datastream Compression) test results using the PR#6973 compression prototype Peter Tschipper
2015-11-11 19:11 ` [bitcoin-dev] request BIP number for: "Support for Datastream Compression" Peter Tschipper
1 sibling, 1 reply; 21+ messages in thread
From: Jonathan Toomim @ 2015-11-11 19:05 UTC (permalink / raw)
To: Bitcoin Dev
[-- Attachment #1: Type: text/plain, Size: 870 bytes --]
Data compression adds latency and reduces predictability, so engineers have decided to leave compression to the application layer rather than the transport layer or lower, letting the application designer decide what tradeoffs to make.
On Nov 11, 2015, at 10:49 AM, Marco Pontello via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
> A random thought: isn't most communication over a data link already compressed at some point?
> When I used a modem, we had the V.42bis protocol. Now nearly all ADSL connections using PPPoE surely are, and so on.
> I'm not sure another layer of generic, data-agnostic compression will really give us a practical real-life advantage over that.
>
> Something that could take advantage of special knowledge of the specific data, instead, would be an entirely different matter.
>
> Just my 2c.
[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 496 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* [bitcoin-dev] Block Compression (Datastream Compression) test results using the PR#6973 compression prototype
2015-11-11 19:05 ` Jonathan Toomim
@ 2015-11-13 21:58 ` Peter Tschipper
2015-11-18 14:00 ` [bitcoin-dev] More findings: " Peter Tschipper
0 siblings, 1 reply; 21+ messages in thread
From: Peter Tschipper @ 2015-11-13 21:58 UTC (permalink / raw)
To: bitcoin-dev
[-- Attachment #1: Type: text/plain, Size: 1931 bytes --]
Some further Block Compression tests results that compare performance
when network latency is added to the mix.
Running two nodes on Windows 7, compressionlevel=6, syncing the first
200000 blocks from one node to the other. Running on a high-speed wireless
LAN with no connections to the outside world.
Network latency was added by using Netbalancer to induce the 30ms and
60ms latencies.
From the data, not only are bandwidth savings seen but also a small
performance savings as well. However, the overall value in compressing
blocks appears to be in terms of saving bandwidth.
I was also surprised to see that there was no real difference in
performance when no latency was present; apparently the time it takes to
compress is about equal to the performance savings in such a situation.
The following results compare the tests in terms of how long it takes to
sync the blockchain, compressed vs uncompressed and with varying latencies.
uncmp = uncompressed
cmp = compressed
num blocks  uncmp   cmp     uncmp 30ms  cmp 30ms  uncmp 60ms  cmp 60ms
sync'd      (secs)  (secs)  (secs)      (secs)    (secs)      (secs)
 10000         264     269         265       257         274       275
 20000         482     492         479       467         499       497
 30000         703     717         693       676         724       724
 40000         918     939         902       886         947       944
 50000        1140    1157        1114      1094        1171      1167
 60000        1362    1380        1329      1310        1400      1395
 70000        1583    1597        1547      1526        1637      1627
 80000        1810    1817        1767      1745        1872      1862
 90000        2031    2036        1985      1958        2109      2098
100000        2257    2260        2223      2184        2385      2355
110000        2553    2486        2478      2422        2755      2696
120000        2800    2724        2849      2771        3345      3254
130000        3078    2994        3356      3257        4125      4006
140000        3442    3365        3979      3870        5032      4904
150000        3803    3729        4586      4464        5928      5797
160000        4148    4075        5168      5034        6801      6661
170000        4509    4479        5768      5619        7711      7557
180000        4947    4924        6389      6227        8653      8479
190000        5858    5855        7302      7107        9768      9566
200000        6980    6969        8469      8220       10944     10724
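Taking the final (200000-block) row, the relative sync-time saved by compression at each latency can be computed directly; a quick sketch:

```python
# (uncompressed secs, compressed secs) for the 200000-block row above.
rows = {"0ms": (6980, 6969), "30ms": (8469, 8220), "60ms": (10944, 10724)}

for latency, (uncmp, cmp_) in rows.items():
    saved = (uncmp - cmp_) / uncmp * 100
    print(f"{latency}: {saved:.2f}% faster with compression")
```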
[-- Attachment #2: Type: text/html, Size: 10768 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* [bitcoin-dev] More findings: Block Compression (Datastream Compression) test results using the PR#6973 compression prototype
2015-11-13 21:58 ` [bitcoin-dev] Block Compression (Datastream Compression) test results using the PR#6973 compression prototype Peter Tschipper
@ 2015-11-18 14:00 ` Peter Tschipper
0 siblings, 0 replies; 21+ messages in thread
From: Peter Tschipper @ 2015-11-18 14:00 UTC (permalink / raw)
To: bitcoin-dev
[-- Attachment #1: Type: text/plain, Size: 5198 bytes --]
Hi all,
I'm still doing a little more investigation before opening up a formal
BIP PR, but getting close. Here are some more findings.
After moving the compression from main.cpp to streams.h (CDataStream) it
was a simple matter to add compression to transactions as well. Results
as follows:
range = transaction size range
ubytes = average size of uncompressed transactions
cbytes = average size of compressed transactions
cmp_ratio% = compression ratio
datapoints = number of datapoints taken
range       ubytes  cbytes  cmp_ratio%  datapoints
0-250b         220     227       -3.16       23780
250-500b       356     354        0.68       20882
500-600        534     505        5.29        2772
600-700        653     608        6.95        1853
700-800        757     649       14.22         578
800-900        822     758        7.77         661
900-1KB        954     862        9.69         906
1KB-10KB      2698    2222       17.64        3370
10KB-100KB   15463   12092       21.80       15429
A couple of obvious observations. Transactions don't compress well
below 500 bytes but do very well beyond 1KB, where there are a great
deal of large, spam-type transactions. However, most transactions
happen to be in the < 500 byte range. So the next step was to apply
bundling, i.e. creating a "blob" of those smaller transactions, if and
only if there are multiple txs in the getdata receive queue for a
peer. Doing that yields some very good compression ratios. Some
examples follow:
The best one I've seen so far was the following, where 175 transactions
were bundled into one blob before being compressed. That yielded a 20%
compression ratio, but that doesn't take into account the savings from
the 174 unneeded message headers (24 bytes each) as well as 174 TCP
ACKs of 52 bytes each, which yields an additional 76*174=13224 bytes,
making the overall bandwidth savings 32% in this particular case.
*2015-11-18 01:09:09.002061 compressed blob from 79890 to 67426 txcount:175*
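The arithmetic behind that 32% figure can be reproduced from the numbers in the log line and the per-message overheads quoted above:

```python
# Figures from the 175-tx blob above.
uncompressed, compressed, txcount = 79890, 67426, 175
per_tx_overhead = 24 + 52  # message header + TCP ACK, in bytes

saved_by_compression = uncompressed - compressed          # 12464
saved_overhead = per_tx_overhead * (txcount - 1)          # 76 * 174 = 13224
total_pct = (saved_by_compression + saved_overhead) / uncompressed * 100
print(f"overall bandwidth savings: {total_pct:.0f}%")     # ~32%
```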
To be sure, this was an extreme example. Most transaction blobs were in
the 2 to 10 transaction range, such as the following:
*2015-11-17 21:08:28.469313 compressed blob from 3199 to 2876 txcount:10*
But even here the savings are 10%, far better than the "nothing" we
would get without bundling; add to that the 76-byte savings for each of
the 9 bundled transactions and we have a total 20% savings in bandwidth
for transactions that otherwise would not be compressible.
The same bundling was applied to blocks and very good compression ratios
are seen when sync'ing the blockchain.
Overall, the bundling or "blobbing" of txs and blocks seems to be a
good way to improve bandwidth use, and there is also a scalability
benefit: when the system is busy, transactions are bundled more often,
compressed, and sent faster, keeping the message queue and network
chatter to a minimum.
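The bundling step can be sketched as follows (names and the 10kB threshold are illustrative; a real wire format would also need per-tx length prefixes so the receiver can split a decompressed blob back into transactions):

```python
import zlib

BLOB_TARGET = 10_000  # bundle queued txs until roughly 10kB, then compress

def make_blobs(queued_txs: list[bytes]) -> list[bytes]:
    """Concatenate small queued transactions into blobs and compress each."""
    blobs, current = [], b""
    for tx in queued_txs:
        current += tx
        if len(current) >= BLOB_TARGET:
            blobs.append(zlib.compress(current, 6))
            current = b""
    if current:  # flush the remainder
        blobs.append(zlib.compress(current, 6))
    return blobs
```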
I think I have enough information to put together a formal BIP with the
exception of which compression library to implement. These tests were
done using ZLib but I'll also be running tests in the coming days with
LZO (Jeff Garzik's suggestion) and perhaps Snappy. If there are any
other libraries that people would like me to get results for please let
me know and I'll pick maybe the top 2 or 3 and get results back to the
group.
On 13/11/2015 1:58 PM, Peter Tschipper wrote:
> Some further Block Compression tests results that compare performance
> when network latency is added to the mix.
>
> Running two nodes, windows 7, compressionlevel=6, syncing the first
> 200000 blocks from one node to the other. Running on a high-speed
> wireless LAN with no connections to the outside world.
> Network latency was added by using Netbalancer to induce the 30ms and
> 60ms latencies.
>
> From the data, not only are bandwidth savings seen but also a small
> performance savings as well. However, the overall value in
> compressing blocks appears to be in terms of saving bandwidth.
>
> I was also surprised to see that there was no real difference in
> performance when no latency was present; apparently the time it takes
> to compress is about equal to the performance savings in such a situation.
>
>
> The following results compare the tests in terms of how long it takes
> to sync the blockchain, compressed vs uncompressed and with varying
> latencies.
> uncmp = uncompressed
> cmp = compressed
>
> num blocks  uncmp   cmp     uncmp 30ms  cmp 30ms  uncmp 60ms  cmp 60ms
> sync'd      (secs)  (secs)  (secs)      (secs)    (secs)      (secs)
>  10000         264     269         265       257         274       275
>  20000         482     492         479       467         499       497
>  30000         703     717         693       676         724       724
>  40000         918     939         902       886         947       944
>  50000        1140    1157        1114      1094        1171      1167
>  60000        1362    1380        1329      1310        1400      1395
>  70000        1583    1597        1547      1526        1637      1627
>  80000        1810    1817        1767      1745        1872      1862
>  90000        2031    2036        1985      1958        2109      2098
> 100000        2257    2260        2223      2184        2385      2355
> 110000        2553    2486        2478      2422        2755      2696
> 120000        2800    2724        2849      2771        3345      3254
> 130000        3078    2994        3356      3257        4125      4006
> 140000        3442    3365        3979      3870        5032      4904
> 150000        3803    3729        4586      4464        5928      5797
> 160000        4148    4075        5168      5034        6801      6661
> 170000        4509    4479        5768      5619        7711      7557
> 180000        4947    4924        6389      6227        8653      8479
> 190000        5858    5855        7302      7107        9768      9566
> 200000        6980    6969        8469      8220       10944     10724
>
>
[-- Attachment #2: Type: text/html, Size: 18590 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-11 18:49 ` Marco Pontello
2015-11-11 19:05 ` Jonathan Toomim
@ 2015-11-11 19:11 ` Peter Tschipper
1 sibling, 0 replies; 21+ messages in thread
From: Peter Tschipper @ 2015-11-11 19:11 UTC (permalink / raw)
To: Marco Pontello; +Cc: Bitcoin Dev
[-- Attachment #1: Type: text/plain, Size: 9113 bytes --]
If that were true then we wouldn't need to gzip large files before
sending them over the internet. Data compression generally helps
transmission speed as long as the amount of compression is high enough
and the time it takes is low enough to make it worthwhile. On a
corporate LAN it's generally not worthwhile unless you're dealing with
very large files, but over a corporate WAN or the internet where network
latency can be high, it is IMO a worthwhile endeavor.
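The trade-off can be made concrete: compression wins whenever compression time plus the transfer time of the compressed payload is below the transfer time of the raw payload. With illustrative numbers only (a 1 MB block, ~22% compression as in the earlier results, 50 ms to compress):

```python
def transfer_secs(size_bytes: int, bandwidth_bps: float,
                  compress_secs: float = 0.0) -> float:
    """Total time to (optionally compress and) send size_bytes."""
    return compress_secs + size_bytes * 8 / bandwidth_bps

raw, compressed, ctime = 1_000_000, 780_000, 0.05

for bw in (10_000_000, 1_000_000_000):  # 10 Mbps WAN vs 1 Gbps LAN
    wins = transfer_secs(compressed, bw, ctime) < transfer_secs(raw, bw)
    print(f"{bw} bps: compression {'wins' if wins else 'loses'}")
```

With these numbers compression wins on the slow link and loses on the fast one, matching the LAN-vs-WAN intuition above.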
On 11/11/2015 10:49 AM, Marco Pontello wrote:
> A random thought: isn't most communication over a data link already
> compressed at some point?
> When I used a modem, we had the V.42bis protocol. Now nearly all ADSL
> connections using PPPoE surely are, and so on.
> I'm not sure another layer of generic, data-agnostic compression will
> really give us a practical real-life advantage over that.
>
> Something that could take advantage of special knowledge of the
> specific data, instead, would be an entirely different matter.
>
> Just my 2c.
>
> On Wed, Nov 11, 2015 at 7:35 PM, Peter Tschipper via bitcoin-dev
> <bitcoin-dev@lists.linuxfoundation.org
> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>
> Here are the latest results on compression ratios for the first
> 295,000 blocks, compressionlevel=6. I think there are more than
> enough datapoints for statistical significance.
>
> Results are very similar to the previous test. I'll work on
> getting a comparison of how much time is saved (or lost) when
> syncing the blockchain, compressed vs uncompressed. Still, I
> think it's clear that serving up
> compressed blocks, at least historical blocks, will be of benefit
> for those that have bandwidth caps on their internet connections.
>
> The proposal, so far is fairly simple:
> 1) compress blocks with some compression library: currently zlib
> but I can investigate other possibilities
> 2) As a fallback we need to advertise compression as a service.
> That way we can turn off compression AND decompression completely
> if needed.
> 3) Do the compression at the datastream level in the code.
> CDataStream is the obvious place.
>
>
> Test Results:
>
> range = block size range
> ubytes = average size of uncompressed blocks
> cbytes = average size of compressed blocks
> ctime = average time to compress
> dtime = average time to decompress
> cmp_ratio% = compression ratio
> datapoints = number of datapoints taken
>
> range        ubytes  cbytes  ctime  dtime  cmp_ratio%  datapoints
> 0-250b          215     189  0.001  0.000      12.40        91280
> 250-500b        438     404  0.001  0.000       7.85        13217
> 500-1KB         761     701  0.001  0.000       7.86        11434
> 1KB-10KB       4149    3547  0.001  0.000      14.51        52180
> 10KB-100KB    41934   32604  0.005  0.001      22.25        82890
> 100KB-200KB  146303  108080  0.016  0.001      26.13        29886
> 200KB-300KB  243299  179281  0.025  0.002      26.31        25066
> 300KB-400KB  344636  266177  0.036  0.003      22.77         4956
> 400KB-500KB  463201  356862  0.046  0.004      22.96         3167
> 500KB-600KB  545123  429854  0.056  0.005      21.15          366
> 600KB-700KB  647736  510931  0.065  0.006      21.12          254
> 700KB-800KB  746540  587287  0.073  0.008      21.33          294
> 800KB-900KB  868121  682650  0.087  0.008      21.36          199
> 900KB-1MB    945747  726307  0.091  0.010      23.20          304
>
> On 10/11/2015 8:46 AM, Jeff Garzik via bitcoin-dev wrote:
>> Comments:
>>
>> 1) cblock seems a reasonable way to extend the protocol. Further
>> wrapping should probably be done at the stream level.
>>
>> 2) zlib has crappy security track record.
>>
>> 3) A fallback path to non-compressed is required, should
>> compression fail or crash.
>>
>> 4) Most blocks and transactions have runs of zeroes and/or highly
>> common bit-patterns, which contributes to useful compression even
>> at smaller sizes. Peter Ts's most recent numbers bear this out.
>> zlib has a dictionary (32K?) which works well with repeated
>> patterns such as those you see with concatenated runs of
>> transactions.
>>
>> 5) LZO should provide much better compression, at a cost of CPU
>> performance and using a less-reviewed, less-field-tested library.
>>
>>
>>
>>
>>
>> On Tue, Nov 10, 2015 at 11:30 AM, Tier Nolan via bitcoin-dev
>> <bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>>
>>
>>
>> On Tue, Nov 10, 2015 at 4:11 PM, Peter Tschipper
>> <peter.tschipper@gmail.com
>> <mailto:peter.tschipper@gmail.com>> wrote:
>>
>> There are better ways of sending new blocks, that's
>> certainly true, but for sending historical blocks and
>> sending transactions I don't think so. This PR is really
>> designed to save bandwidth and not intended to be a huge
>> performance improvement in terms of time spent sending.
>>
>>
>> If the main point is for historical data, then sticking to
>> just blocks is the best plan.
>>
>> Since small blocks don't compress well, you could define a
>> "cblocks" message that handles multiple blocks (just
>> concatenate the block messages as payload before compression).
>>
>> The sending peer could combine blocks so that each cblock is
>> compressing at least 10kB of block data (or whatever is
>> optimal). It is probably worth specifying a maximum size for
>> network buffer reasons (either 1MB or 1 block maximum).
>>
>> Similarly, transactions could be combined together and
>> compressed "ctxs". The inv messages could be modified so
>> that you can request groups of 10-20 transactions. That
>> would depend on how much of an improvement compressed
>> transactions would represent.
>>
>> More generally, you could define a message which is a
>> compressed message holder. That is probably too complex
>> to be worth the effort though.
>>
>>
>>
>>>
>>> On Tue, Nov 10, 2015 at 5:40 AM, Johnathan Corgan via
>>> bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org
>>> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>>>
>>> On Mon, Nov 9, 2015 at 5:58 PM, gladoscc via
>>> bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org
>>> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>>>
>>>
>>> I think 25% bandwidth savings is certainly
>>> considerable, especially for people running full
>>> nodes in countries like Australia where internet
>>> bandwidth is lower and there are data caps.
>>>
>>>
>>> This reinforces the idea that such trade-off
>>> decisions should be local and negotiated between
>>> peers, not a required feature of the network P2P.
>>>
>>>
>>> --
>>> Johnathan Corgan
>>> Corgan Labs - SDR Training and Development Services
>>> http://corganlabs.com
>>>
>>> _______________________________________________
>>> bitcoin-dev mailing list
>>> bitcoin-dev@lists.linuxfoundation.org
>>> <mailto:bitcoin-dev@lists.linuxfoundation.org>
>>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>>
>>
>
>
>
>
>
>
> --
> Try the Online TrID File Identifier
> http://mark0.net/onlinetrid.aspx
[-- Attachment #2: Type: text/html, Size: 25638 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] further test results for : "Datastream Compression of Blocks and Tx's"
2015-11-10 16:46 ` Jeff Garzik
2015-11-10 17:09 ` Peter Tschipper
2015-11-11 18:35 ` Peter Tschipper
@ 2015-11-28 14:48 ` Peter Tschipper
2015-11-29 0:30 ` Jonathan Toomim
2 siblings, 1 reply; 21+ messages in thread
From: Peter Tschipper @ 2015-11-28 14:48 UTC (permalink / raw)
To: bitcoin-dev
[-- Attachment #1: Type: text/plain, Size: 8938 bytes --]
Hi All,
Here are some final results of testing with the reference implementation
for compressing blocks and transactions. This implementation also
concatenates blocks and transactions when possible, so you'll see data
sizes in the 1-2MB range.
Results below show the time it takes to sync the first part of the
blockchain, comparing Zlib to the LZOx library (LZOf was also tried
but wasn't found to be as good as LZOx). The following shows tests run
with and without latency. With latency on the network, syncing with
any of the compression libraries was much faster than syncing
uncompressed.
I don't think it's entirely obvious which is better, Zlib or LZO.
Although I prefer the higher compression of Zlib, overall I would have
to give the edge to LZO. At its lowest compression setting, LZO is the
fastest, most scalable option, a boost for users that want performance
over compression; at the high end, LZOx-999 provides decent compression
that approaches Zlib (although at a higher CPU cost), good for those
that want to save more bandwidth.
Results in seconds, with 60ms of latency:
Uncompressed    Zlib-1    Zlib-6    LZOx-1    LZOx-999
         219       299       296       294         291
         432       568       565       558         548
         652       835       836       819         811
         866      1106      1107      1081        1071
        1082      1372      1381      1341        1333
        1309      1644      1654      1605        1600
        1535      1917      1936      1873        1875
        1762      2191      2210      2141        2141
        1992      2463      2486      2411        2411
        2257      2748      2780      2694        2697
        2627      3034      3076      2970        2983
        3226      3416      3397      3266        3302
        4010      3983      3773      3625        3703
        4914      4503      4292      4127        4287
        5806      4928      4719      4529        4821
        6674      5249      5164      4840        5314
        7563      5603      5669      5289        6002
        8477      6054      6268      5858        6638
        9843      7085      7278      6868        7679
       11338      8215      8433      8044        8795
These results are from testing on a high-speed wireless LAN (very small latency).
Results in seconds:
Num blocks sync'd  Uncompressed  Zlib-1  Zlib-6  LZOx-1  LZOx-999
            10000           255     232     233     231       257
            20000           464     414     420     407       453
            30000           677     594     611     585       650
            40000           887     782     795     760       849
            50000          1099     961     977     933      1048
            60000          1310    1145    1167    1110      1259
            70000          1512    1330    1362    1291      1470
            80000          1714    1519    1552    1469      1679
            90000          1917    1707    1747    1650      1882
           100000          2122    1905    1950    1843      2111
           110000          2333    2107    2151    2038      2329
           120000          2560    2333    2376    2256      2580
           130000          2835    2656    2679    2558      2921
           140000          3274    3259    3161    3051      3466
           150000          3662    3793    3547    3440      3919
           160000          4040    4172    3937    3767      4416
           170000          4425    4625    4379    4215      4958
           180000          4860    5149    4895    4781      5560
           190000          5855    6160    5898    5805      6557
           200000          7004    7234    7051    6983      7770
The following shows the compression ratio achieved for various sizes of
data. Zlib is the clear winner for compressibility, with LZOx-999 coming
close but at a cost.
range         Zlib-1 cmp%   Zlib-6 cmp%   LZOx-1 cmp%   LZOx-999 cmp%
0-250b              12.44         12.86         10.79           14.34
250-500b            19.33         12.97         10.34           11.11
600-700             16.72           n/a         12.91           17.25
700-800              6.37          7.65          4.83            8.07
900-1KB              6.54          6.95          5.64            7.9
1KB-10KB            25.08         25.65         21.21           22.65
10KB-100KB          19.77         21.57         14.37           19.02
100KB-200KB         21.49         23.56         15.37           21.55
200KB-300KB         23.66         24.18         16.91           22.76
300KB-400KB         23.4          23.7          16.5            21.38
400KB-500KB         24.6          24.85         17.56           22.43
500KB-600KB         25.51         26.55         18.51           23.4
600KB-700KB         27.25         28.41         19.91           25.46
700KB-800KB         27.58         29.18         20.26           27.17
800KB-900KB         27            29.11         20              27.4
900KB-1MB           28.19         29.38         21.15           26.43
1MB-2MB             27.41         29.46         21.33           27.73
The following shows the time in seconds to compress data of various
sizes. LZO1x is the fastest, and as file sizes increase, LZO1x's time
hardly increases at all. It's interesting to note that as compression
ratios increase, LZOx-999 performs much worse than Zlib. So LZO is
faster on the low end and slower (5 to 6 times slower) on the high end.
range         Zlib-1 (s)   Zlib-6 (s)   LZOx-1 (s)   LZOx-999 (s)
0-250b             0.001        0            0             0
250-500b           0            0            0             0.001
500-1KB            0            0            0             0.001
1KB-10KB           0.001        0.001        0             0.002
10KB-100KB         0.004        0.006        0.001         0.017
100KB-200KB        0.012        0.017        0.002         0.054
200KB-300KB        0.018        0.024        0.003         0.087
300KB-400KB        0.022        0.03         0.003         0.121
400KB-500KB        0.027        0.037        0.004         0.151
500KB-600KB        0.031        0.044        0.004         0.184
600KB-700KB        0.035        0.051        0.006         0.211
700KB-800KB        0.039        0.057        0.006         0.243
800KB-900KB        0.045        0.064        0.006         0.27
900KB-1MB          0.049        0.072        0.006         0.307
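For reference, figures like the cmp% and timing columns above can be
gathered with a short sketch along these lines (Python's standard-library
zlib only; the LZO columns would need a third-party binding, so they are
omitted here, and the sample data is a made-up stand-in, not real block
data):

```python
import time
import zlib

def measure(data: bytes, level: int):
    """Return (size reduction %, compress time, decompress time) for one zlib level."""
    t0 = time.perf_counter()
    compressed = zlib.compress(data, level)
    t1 = time.perf_counter()
    restored = zlib.decompress(compressed)
    t2 = time.perf_counter()
    assert restored == data  # round trip must be lossless
    reduction = 100.0 * (1.0 - len(compressed) / len(data))
    return reduction, t1 - t0, t2 - t1

# Hypothetical stand-in for ~100KB of concatenated serialized blocks.
sample = bytes(50_000) + bytes(range(256)) * 200

for level in (1, 6):
    cmp_pct, ctime, dtime = measure(sample, level)
    print(f"zlib-{level}: cmp%={cmp_pct:.2f} ctime={ctime:.3f}s dtime={dtime:.3f}s")
```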
On 10/11/2015 8:46 AM, Jeff Garzik via bitcoin-dev wrote:
> Comments:
>
> 1) cblock seems a reasonable way to extend the protocol. Further
> wrapping should probably be done at the stream level.
>
> 2) zlib has crappy security track record.
>
> 3) A fallback path to non-compressed is required, should compression
> fail or crash.
>
> 4) Most blocks and transactions have runs of zeroes and/or highly
> common bit-patterns, which contributes to useful compression even at
> smaller sizes. Peter Ts's most recent numbers bear this out. zlib
> has a dictionary (32K?) which works well with repeated patterns such
> as those you see with concatenated runs of transactions.
>
> 5) LZO should provide much better compression, at a cost of CPU
> performance and using a less-reviewed, less-field-tested library.
>
>
>
>
>
> On Tue, Nov 10, 2015 at 11:30 AM, Tier Nolan via bitcoin-dev
> <bitcoin-dev@lists.linuxfoundation.org
> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>
>
>
> On Tue, Nov 10, 2015 at 4:11 PM, Peter Tschipper
> <peter.tschipper@gmail.com <mailto:peter.tschipper@gmail.com>> wrote:
>
> There are better ways of sending new blocks, that's certainly
> true, but for sending historical blocks and sending transactions
> I don't think so. This PR is really designed to save
> bandwidth and not intended to be a huge performance
> improvement in terms of time spent sending.
>
>
> If the main point is for historical data, then sticking to just
> blocks is the best plan.
>
> Since small blocks don't compress well, you could define a
> "cblocks" message that handles multiple blocks (just concatenate
> the block messages as payload before compression).
>
> The sending peer could combine blocks so that each cblock is
> compressing at least 10kB of block data (or whatever is optimal).
> It is probably worth specifying a maximum size for network buffer
> reasons (either 1MB or 1 block maximum).
>
> Similarly, transactions could be combined together and compressed
> "ctxs". The inv messages could be modified so that you can
> request groups of 10-20 transactions. That would depend on how
> much of an improvement compressed transactions would represent.
>
> More generally, you could define a message which is a compressed
> message holder. That is probably too complex to be worth the
> effort though.
>
>
>
>>
>> On Tue, Nov 10, 2015 at 5:40 AM, Johnathan Corgan via
>> bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>>
>> On Mon, Nov 9, 2015 at 5:58 PM, gladoscc via bitcoin-dev
>> <bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>>
>>
>> I think 25% bandwidth savings is certainly
>> considerable, especially for people running full
>> nodes in countries like Australia where internet
>> bandwidth is lower and there are data caps.
>>
>>
>> This reinforces the idea that such trade-off decisions
>> should be local and negotiated between peers, not a
>> required feature of the network P2P.
>>
>>
>> --
>> Johnathan Corgan
>> Corgan Labs - SDR Training and Development Services
>> http://corganlabs.com
>>
>>
>>
>>
>>
>
>
>
>
>
>
>
[-- Attachment #2: Type: text/html, Size: 47442 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] further test results for : "Datastream Compression of Blocks and Tx's"
2015-11-28 14:48 ` [bitcoin-dev] further test results for : "Datastream Compression of Blocks and Tx's" Peter Tschipper
@ 2015-11-29 0:30 ` Jonathan Toomim
2015-11-29 5:15 ` Peter Tschipper
0 siblings, 1 reply; 21+ messages in thread
From: Jonathan Toomim @ 2015-11-29 0:30 UTC (permalink / raw)
To: Peter Tschipper; +Cc: bitcoin-dev
[-- Attachment #1.1: Type: text/plain, Size: 758 bytes --]
It appears you're using the term "compression ratio" to mean "size reduction". A compression ratio is the ratio (compressed / uncompressed). A 1 kB file compressed with a 10% compression ratio would be 0.1 kB. It seems you're using (1 - compressed/uncompressed), meaning that the compressed file would be 0.9 kB.
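A quick arithmetic illustration of the two terms (a toy example with
hypothetical data, not figures from the tables):

```python
import zlib

data = bytes(1000)  # a highly compressible 1 kB example
compressed = zlib.compress(data, 6)

ratio = len(compressed) / len(data)  # "compression ratio" in the strict sense
reduction = 1.0 - ratio              # "size reduction", the cmp% used in the tables
print(f"ratio = {ratio:.3f}, size reduction = {reduction:.1%}")
```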
On Nov 28, 2015, at 6:48 AM, Peter Tschipper via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
> The following shows the compression ratio achieved for various sizes of data. Zlib is the clear
> winner for compressibility, with LZOx-999 coming close but at a cost.
>
> range Zlib-1 cmp%
> Zlib-6 cmp% LZOx-1 cmp% LZOx-999 cmp%
> 0-250b 12.44 12.86 10.79 14.34
> 250-500b 19.33 12.97 10.34 11.11
>
>
>
>
>
[-- Attachment #1.2: Type: text/html, Size: 3157 bytes --]
[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 496 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] further test results for : "Datastream Compression of Blocks and Tx's"
2015-11-29 0:30 ` Jonathan Toomim
@ 2015-11-29 5:15 ` Peter Tschipper
0 siblings, 0 replies; 21+ messages in thread
From: Peter Tschipper @ 2015-11-29 5:15 UTC (permalink / raw)
To: Bitcoin Dev
[-- Attachment #1: Type: text/plain, Size: 959 bytes --]
Yes, you're right, it's just the percentage compressed (size reduction).
On 28/11/2015 4:30 PM, Jonathan Toomim wrote:
> It appears you're using the term "compression ratio" to mean "size
> reduction". A compression ratio is the ratio (compressed /
> uncompressed). A 1 kB file compressed with a 10% compression ratio
> would be 0.1 kB. It seems you're using (1 - compressed/uncompressed),
> meaning that the compressed file would be 0.9 kB.
>
> On Nov 28, 2015, at 6:48 AM, Peter Tschipper via bitcoin-dev
> <bitcoin-dev@lists.linuxfoundation.org
> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>
>> The following shows the compression ratio achieved for various sizes
>> of data. Zlib is the clear
>> winner for compressibility, with LZOx-999 coming close but at a cost.
>>
>> range Zlib-1 cmp%
>> Zlib-6 cmp% LZOx-1 cmp% LZOx-999 cmp%
>> 0-250b 12.44 12.86 10.79 14.34
>> 250-500b 19.33 12.97 10.34 11.11
>>
>>
>>
>>
>>
>>
>
[-- Attachment #2: Type: text/html, Size: 4850 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread