* [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
@ 2015-11-09 19:18 Peter Tschipper
2015-11-09 20:41 ` Johnathan Corgan
2015-11-09 21:04 ` Bob McElrath
0 siblings, 2 replies; 21+ messages in thread
From: Peter Tschipper @ 2015-11-09 19:18 UTC (permalink / raw)
To: Bitcoin Dev
This is my first time through this process so please bear with me.
I opened PR #6973 this morning for zlib compression of blocks during block
relay, and @sipa asked that it have a BIP associated with it. The idea is
simple: compress the datastream before sending, initially for blocks only,
though in principle it could be done for transactions as well. Initial
results show an average of 20% block compression, with a full block taking
90 milliseconds to compress (on a very slow laptop). The savings will be
mostly in bandwidth, but I would expect a small performance gain during
the transmission of blocks, particularly where network latency is higher.
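A minimal Python sketch of how the compression ratio and compression time described above could be measured. This is not the code from the PR: the `measure` helper and the stand-in payloads are assumptions for illustration, and real serialized blocks would have to be substituted to get meaningful numbers.

```python
import os
import time
import zlib


def measure(payload: bytes, level: int = 6) -> tuple:
    """Return (percent saved, seconds spent) for one zlib.compress call."""
    start = time.perf_counter()
    compressed = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    saved = 100.0 * (1 - len(compressed) / len(payload))
    return saved, elapsed


# Stand-in data only; real serialized blocks would be loaded here instead.
random_like = os.urandom(100_000)                    # behaves like hashes and signatures
structured = bytes(32) + b"\x76\xa9\x14" * 10_000    # zero runs and repeated bytes

for name, data in (("random", random_like), ("structured", structured)):
    saved, elapsed = measure(data)
    print(f"{name}: {saved:.1f}% saved in {elapsed * 1000:.2f} ms")
```

Random bytes should show essentially no savings while the repetitive payload shrinks a lot; a real block sits somewhere in between, which is why measuring on actual chain data matters.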
I think the BIP title, if accepted, should be the more generic "Support
for Datastream Compression" rather than the PR title of "Zlib
Compression for block relay", since it could also be used for
transactions at a later time.
Thanks for your time...
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-09 19:18 [bitcoin-dev] request BIP number for: "Support for Datastream Compression" Peter Tschipper
@ 2015-11-09 20:41 ` Johnathan Corgan
2015-11-09 21:04 ` Bob McElrath
1 sibling, 0 replies; 21+ messages in thread
From: Johnathan Corgan @ 2015-11-09 20:41 UTC (permalink / raw)
To: Peter Tschipper; +Cc: Bitcoin Dev
On Mon, Nov 9, 2015 at 11:18 AM, Peter Tschipper via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:
> I opened a PR #6973 this morning for Zlib Block Compression for block
> relay and at the request of @sipa this should have a BIP associated
> with it. The idea is simple, to compress the datastream before
> sending, initially for blocks only but it could theoretically be done
> for transactions as well. Initial results show an average of 20% block
> compression and taking 90 milliseconds for a full block (on a very slow
> laptop) to compress. The savings will be mostly in terms of less
> bandwidth used, but I would expect there to be a small performance gain
> during the transmission of the blocks particularly where network latency
> is higher.
>
The trade-off decisions among bandwidth savings, CPU performance, and
latency are local, and I think it shouldn't be assumed that any particular
node will want to support it. I recommend that if P2P message compression
is implemented, it should be negotiated via the services field at
connection time.
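A minimal Python sketch of what negotiating compression through the services field could look like. `NODE_NETWORK` is the existing service bit; `NODE_COMPRESSION` and its bit position are hypothetical and chosen only for illustration, since no such bit has been assigned.

```python
NODE_NETWORK = 1 << 0        # existing service bit: node can serve full blocks
NODE_COMPRESSION = 1 << 30   # hypothetical bit, picked only for this sketch


def compression_enabled(local_services: int, remote_services: int) -> bool:
    """Compress only when both peers advertised the (hypothetical) bit."""
    return bool(local_services & remote_services & NODE_COMPRESSION)


ours = NODE_NETWORK | NODE_COMPRESSION
print(compression_enabled(ours, NODE_NETWORK | NODE_COMPRESSION))  # True
print(compression_enabled(ours, NODE_NETWORK))                      # False
```

With this shape a node that does not want to spend CPU on compression simply never sets the bit, and its peers keep sending plain messages.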
--
Johnathan Corgan
Corgan Labs - SDR Training and Development Services
http://corganlabs.com
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-09 19:18 [bitcoin-dev] request BIP number for: "Support for Datastream Compression" Peter Tschipper
2015-11-09 20:41 ` Johnathan Corgan
@ 2015-11-09 21:04 ` Bob McElrath
2015-11-10 1:58 ` gladoscc
1 sibling, 1 reply; 21+ messages in thread
From: Bob McElrath @ 2015-11-09 21:04 UTC (permalink / raw)
To: Peter Tschipper; +Cc: Bitcoin Dev
I would expect that since a block contains mostly hashes and crypto signatures,
it would be almost totally incompressible. I just calculated compression ratios:
zlib -15% (file is LARGER)
gzip 28%
bzip2 25%
So zlib compression is right out. How much is ~25% bandwidth savings worth to
people? This seems not worth it to me. :-/
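One way to reproduce a comparison like this is sketched below in Python using only standard-library codecs. The `compare` helper and the placeholder payload are assumptions; the raw bytes of a serialized block would need to be substituted. Note that the zlib and gzip formats both wrap a DEFLATE stream and differ only in header and trailer, so at the same level their results would be expected to be close to each other.

```python
import bz2
import gzip
import zlib


def percent_saved(original: bytes, compressed: bytes) -> float:
    """Positive means the output is smaller; negative means it grew."""
    return 100.0 * (1 - len(compressed) / len(original))


def compare(payload: bytes) -> dict:
    """Compress one payload with three standard-library codecs."""
    return {
        "zlib": percent_saved(payload, zlib.compress(payload, 9)),
        "gzip": percent_saved(payload, gzip.compress(payload, compresslevel=9)),
        "bzip2": percent_saved(payload, bz2.compress(payload, 9)),
    }


# Placeholder payload; substitute the raw bytes of a serialized block.
sample = bytes(range(256)) * 400
for codec, saved in compare(sample).items():
    print(f"{codec}: {saved:.1f}%")
```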
--
Cheers, Bob McElrath
"For every complex problem, there is a solution that is simple, neat, and wrong."
-- H. L. Mencken
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-09 21:04 ` Bob McElrath
@ 2015-11-10 1:58 ` gladoscc
2015-11-10 5:40 ` Johnathan Corgan
0 siblings, 1 reply; 21+ messages in thread
From: gladoscc @ 2015-11-10 1:58 UTC (permalink / raw)
To: Bob McElrath; +Cc: Bitcoin Dev
I think 25% bandwidth savings is certainly considerable, especially for
people running full nodes in countries like Australia where internet
bandwidth is lower and there are data caps.
I absolutely would not dismiss 25% compression. gzip and bzip2 compression
are relatively standard, and I'd put the point at which the savings justify
the implementation complexity somewhere around 5-10%.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-10 1:58 ` gladoscc
@ 2015-11-10 5:40 ` Johnathan Corgan
2015-11-10 9:44 ` Tier Nolan
0 siblings, 1 reply; 21+ messages in thread
From: Johnathan Corgan @ 2015-11-10 5:40 UTC (permalink / raw)
To: gladoscc; +Cc: Bitcoin Dev
On Mon, Nov 9, 2015 at 5:58 PM, gladoscc via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:
> I think 25% bandwidth savings is certainly considerable, especially for
> people running full nodes in countries like Australia where internet
> bandwidth is lower and there are data caps.
>
This reinforces the idea that such trade-off decisions should be local
and negotiated between peers, not a required feature of the P2P network.
--
Johnathan Corgan
Corgan Labs - SDR Training and Development Services
http://corganlabs.com
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-10 5:40 ` Johnathan Corgan
@ 2015-11-10 9:44 ` Tier Nolan
[not found] ` <5642172C.701@gmail.com>
0 siblings, 1 reply; 21+ messages in thread
From: Tier Nolan @ 2015-11-10 9:44 UTC (permalink / raw)
Cc: Bitcoin Dev
The network protocol is not quite consensus critical, but it is important.
Two implementations of the decompressor might not be bug for bug
compatible. This (potentially) means that a block could be designed that
won't decode properly for some version of the client but would work for
another. This would fork the network.
A "raw" network library is unlikely to have the same problem.
Rather than just compress the stream, you could compress block
messages only. A new "cblock" message could be created that is a
compressed block. This shouldn't reduce efficiency by much.
If a client fails to decode a cblock, then it can ask for the block to be
re-sent as a standard "block" message.
This means that it is a pure performance improvement. If problems occur,
then the client can just switch back to uncompressed mode for that block.
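A minimal Python sketch of the fallback described above, under stated assumptions: `decode_cblock` and the `request_plain_block` callback are hypothetical names, and the callback stands in for whatever would ask the peer to re-send a standard "block" message.

```python
import zlib
from typing import Callable


def decode_cblock(payload: bytes, request_plain_block: Callable[[], bytes]) -> bytes:
    """Return the raw block, falling back to an uncompressed re-request."""
    try:
        return zlib.decompress(payload)
    except zlib.error:
        # Decompression failed: ask for the block as a standard "block" message.
        return request_plain_block()


raw_block = b"\x01\x00\x00\x00" + bytes(76)   # stand-in bytes, not a valid block
good = zlib.compress(raw_block)
corrupt = b"\x00" + good[1:]                  # damaged zlib header

print(decode_cblock(good, lambda: raw_block) == raw_block)     # True
print(decode_cblock(corrupt, lambda: raw_block) == raw_block)  # True
```

A real implementation would also want to cap the decompressed size so that a malicious payload cannot exhaust memory; that is left out here to keep the sketch short.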
You should look into the block relay system. This gives a larger
improvement than simply compressing the stream. The main benefit is
latency, but because a node has normally already received the transactions
that are later included in a block, the full block doesn't have to be sent,
which gives a potential 50% compression ratio.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
[not found] ` <5642172C.701@gmail.com>
@ 2015-11-10 16:17 ` Peter Tschipper
2015-11-10 16:21 ` Jonathan Toomim
2015-11-10 16:30 ` Tier Nolan
1 sibling, 1 reply; 21+ messages in thread
From: Peter Tschipper @ 2015-11-10 16:17 UTC (permalink / raw)
To: Bitcoin Dev
On 10/11/2015 8:11 AM, Peter Tschipper wrote:
> On 10/11/2015 1:44 AM, Tier Nolan via bitcoin-dev wrote:
>> The network protocol is not quite consensus critical, but it is
>> important.
>>
>> Two implementations of the decompressor might not be bug for bug
>> compatible. This (potentially) means that a block could be designed
>> that won't decode properly for some version of the client but would
>> work for another. This would fork the network.
>>
>> A "raw" network library is unlikely to have the same problem.
>>
>> Rather than just compress the stream, you could compress block
>> messages only. A new "cblock" message could be created that is a
>> compressed block. This shouldn't reduce efficiency by much.
>>
> I chose the more generic datastream compression so that in the future we
> could possibly apply it to transactions, but currently all that is
> planned is to compress blocks. That was really my only original intent
> until I saw that there might be some bandwidth savings for transactions
> as well.
>
> The compression could, however, be applied to any datastream but is not
> *forced*. Basically it would just be a method call in CDataStream, so
> we could do ss.compress and ss.decompress and apply that to blocks, and
> possibly transactions if worthwhile, and only IF compression is turned
> on. But there is no intent to apply this to every type of message,
> since most would be too small to benefit from compression.
>
> Here are some results of using the code in the PR to
> compress/decompress blocks using zlib compression level = 6. This
> data was taken from the first 275K blocks in the mainnet blockchain.
> Clearly once we get past 10KB we get pretty decent compression but
> even below that there is some benefit. I'm still collecting data and
> will get the same for the whole blockchain.
>
> range = block size range
> ubytes = average size of uncompressed blocks
> cbytes = average size of compressed blocks
> ctime = average time to compress (seconds)
> dtime = average time to decompress (seconds)
> cmp_ratio% = compression ratio
> datapoints = number of datapoints taken
>
> range        ubytes  cbytes  ctime  dtime  cmp_ratio%  datapoints
> 0-250b          215     189  0.001  0.000       12.41       79498
> 250-500b        440     405  0.001  0.000        7.82       11903
> 500-1KB         762     702  0.001  0.000        7.83       10448
> 1KB-10KB       4166    3561  0.001  0.000       14.51       50572
> 10KB-100KB    40820   31597  0.005  0.001       22.59       75555
> 100KB-200KB  146238  106320  0.015  0.001       27.30       25024
> 200KB-300KB  242913  175482  0.025  0.002       27.76       20450
> 300KB-400KB  343430  251760  0.034  0.003       26.69        2069
> 400KB-500KB  457448  343495  0.045  0.004       24.91        1889
> 500KB-600KB  540736  424255  0.056  0.007       21.54          90
> 600KB-700KB  647851  506888  0.063  0.007       21.76          59
> 700KB-800KB  749513  586551  0.073  0.007       21.74          48
> 800KB-900KB  859439  652166  0.086  0.008       24.12          39
> 900KB-1MB    952333  725191  0.089  0.009       23.85          78
>
>> If a client fails to decode a cblock, then it can ask for the block
>> to be re-sent as a standard "block" message.
> interesting idea.
>>
>> This means that it is a pure performance improvement. If problems
>> occur, then the client can just switch back to uncompressed mode for
>> that block.
>>
>> You should look into the block relay system. This gives a larger
>> improvement than simply compressing the stream. The main benefit is
>> latency but it means that actual blocks don't have to be sent, so
>> gives a potential 50% compression ratio. Normally, a node receives
>> all the transactions and then those transactions are included later
>> in the block.
>>
> There are better ways of sending new blocks, that's certainly true, but
> for sending historical blocks and sending transactions I don't think
> so. This PR is really designed to save bandwidth and not intended to
> be a huge performance improvement in terms of time spent sending.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-10 16:17 ` Peter Tschipper
@ 2015-11-10 16:21 ` Jonathan Toomim
0 siblings, 0 replies; 21+ messages in thread
From: Jonathan Toomim @ 2015-11-10 16:21 UTC (permalink / raw)
To: Peter Tschipper; +Cc: Bitcoin Dev
Quick observation: block transmission would be compress-once, send-multiple-times, which makes the tradeoff a little better.
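A minimal Python sketch of the compress-once, send-multiple-times idea. The `CompressedBlockCache` class is a hypothetical helper, not part of any existing code; it keeps the compressed payload keyed by block hash so the CPU cost is paid once per block rather than once per peer.

```python
import zlib


class CompressedBlockCache:
    """Cache compressed payloads keyed by block hash (hypothetical helper)."""

    def __init__(self) -> None:
        self._cache = {}
        self.compress_calls = 0

    def get(self, block_hash: bytes, raw_block: bytes) -> bytes:
        # Compress only the first time this block hash is requested.
        if block_hash not in self._cache:
            self._cache[block_hash] = zlib.compress(raw_block, 6)
            self.compress_calls += 1
        return self._cache[block_hash]


cache = CompressedBlockCache()
block = bytes(1000)                  # stand-in bytes, not a valid block
for _peer in range(8):               # eight peers request the same block
    payload = cache.get(b"hash-1", block)
print(cache.compress_calls)          # 1
```

The cache here grows without bound; a real node would need an eviction policy or would store the compressed form on disk.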
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
[not found] ` <5642172C.701@gmail.com>
2015-11-10 16:17 ` Peter Tschipper
@ 2015-11-10 16:30 ` Tier Nolan
2015-11-10 16:46 ` Jeff Garzik
[not found] ` <56421F1E.4050302@gmail.com>
1 sibling, 2 replies; 21+ messages in thread
From: Tier Nolan @ 2015-11-10 16:30 UTC (permalink / raw)
To: Bitcoin Dev
On Tue, Nov 10, 2015 at 4:11 PM, Peter Tschipper <peter.tschipper@gmail.com>
wrote:
> There are better ways of sending new blocks, that's certainly true but for
> sending historical blocks and sending transactions I don't think so. This
> PR is really designed to save bandwidth and not intended to be a huge
> performance improvement in terms of time spent sending.
>
If the main point is for historical data, then sticking to just blocks is
the best plan.
Since small blocks don't compress well, you could define a "cblocks"
message that handles multiple blocks (just concatenate the block messages
as payload before compression).
The sending peer could combine blocks so that each cblock is compressing at
least 10kB of block data (or whatever is optimal). It is probably worth
specifying a maximum size for network buffer reasons (either 1MB or 1 block
maximum).
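A minimal Python sketch of the batching idea above, under stated assumptions: `batch_blocks`, `MIN_BATCH` and `MAX_BATCH` are hypothetical names, the thresholds are the 10kB and 1MB figures mentioned above, and the receiver is assumed to be able to split the concatenated payload because each serialized block message is self-delimiting.

```python
import zlib
from typing import Iterable, Iterator, List

MIN_BATCH = 10_000      # flush once at least ~10kB of block data is pending
MAX_BATCH = 1_000_000   # never let a multi-block batch exceed ~1MB uncompressed


def batch_blocks(blocks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield compressed batches of concatenated raw blocks."""
    pending: List[bytes] = []
    size = 0
    for block in blocks:
        # Flush first if adding this block would exceed the cap.
        if pending and size + len(block) > MAX_BATCH:
            yield zlib.compress(b"".join(pending))
            pending, size = [], 0
        pending.append(block)
        size += len(block)
        if size >= MIN_BATCH:
            yield zlib.compress(b"".join(pending))
            pending, size = [], 0
    if pending:
        # Whatever is left goes out as a final, possibly small, batch.
        yield zlib.compress(b"".join(pending))


small_blocks = [bytes([i]) * 3000 for i in range(7)]   # stand-in data
batches = list(batch_blocks(small_blocks))
print(len(batches))   # 2
```

Seven 3000-byte stand-in blocks produce one batch of four blocks (12000 bytes) and a final batch of three, so small historical blocks are compressed together rather than one at a time.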
Similarly, transactions could be combined together and compressed into "ctxs"
messages. The inv messages could be modified so that you can request groups
of 10-20 transactions. That would depend on how much of an improvement
compressed transactions would represent.
More generally, you could define a message which is a compressed message
holder. That is probably too complex to be worth the effort though.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-10 16:30 ` Tier Nolan
@ 2015-11-10 16:46 ` Jeff Garzik
2015-11-10 17:09 ` Peter Tschipper
` (2 more replies)
[not found] ` <56421F1E.4050302@gmail.com>
1 sibling, 3 replies; 21+ messages in thread
From: Jeff Garzik @ 2015-11-10 16:46 UTC (permalink / raw)
To: Tier Nolan; +Cc: Bitcoin Dev
Comments:
1) cblock seems a reasonable way to extend the protocol. Further wrapping
should probably be done at the stream level.
2) zlib has a crappy security track record.
3) A fallback path to non-compressed is required, should compression fail
or crash.
4) Most blocks and transactions have runs of zeroes and/or highly common
bit-patterns, which contributes to useful compression even at smaller
sizes. Peter Ts's most recent numbers bear this out. zlib has a
dictionary (32K?) which works well with repeated patterns such as those you
see with concatenated runs of transactions.
5) LZO should provide much better compression, at a cost of CPU performance
and using a less-reviewed, less-field-tested library.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
[not found] ` <56421F1E.4050302@gmail.com>
@ 2015-11-10 16:46 ` Peter Tschipper
0 siblings, 0 replies; 21+ messages in thread
From: Peter Tschipper @ 2015-11-10 16:46 UTC (permalink / raw)
To: Bitcoin Dev
On 10/11/2015 8:45 AM, Peter Tschipper wrote:
> On 10/11/2015 8:30 AM, Tier Nolan via bitcoin-dev wrote:
>>
>>
>> On Tue, Nov 10, 2015 at 4:11 PM, Peter Tschipper
>> <peter.tschipper@gmail.com> wrote:
>>
>> There are better ways of sending new blocks, that's certainly
>> true but for sending historical blocks and sending transactions I
>> don't think so. This PR is really designed to save bandwidth and
>> not intended to be a huge performance improvement in terms of
>> time spent sending.
>>
>>
>> If the main point is for historical data, then sticking to just
>> blocks is the best plan.
>>
> at the beginning yes.
>> Since small blocks don't compress well, you could define a "cblocks"
>> message that handles multiple blocks (just concatenate the block
>> messages as payload before compression).
>>
> Small blocks are rare these days (though there are plenty of historical
> ones), but they still get 10% compression, which is not bad and I think
> worthwhile, and the time it takes to compress small blocks is less than
> a millisecond, so there is no loss in time. Still, you have a good
> point, and it's something worth doing after getting compression to work.
> I think it's wise to keep it simple at first and build on the success
> later.
>> The sending peer could combine blocks so that each cblock is
>> compressing at least 10kB of block data (or whatever is optimal). It
>> is probably worth specifying a maximum size for network buffer
>> reasons (either 1MB or 1 block maximum).
> Good idea. Same answer as above.
>> Similarly, transactions could be combined together and compressed
>> "ctxs". The inv messages could be modified so that you can request
>> groups of 10-20 transactions. That would depend on how much of an
>> improvement compressed transactions would represent.
>>
> Good idea. Same answer as above.
>> More generally, you could define a message which is a compressed
>> message holder. That is probably too complex to be worth the effort
>> though.
> That's actually pretty easy to do and part of the plan. Sending a
> cmp_block rather than a block makes it all easier to implement. It's
> just a matter of doing pnode->pushmessage("cmp_block",
> compressed_block); and handling the "cmp_block" command string at the
> other end.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-10 16:46 ` Jeff Garzik
@ 2015-11-10 17:09 ` Peter Tschipper
2015-11-11 18:35 ` Peter Tschipper
2015-11-28 14:48 ` [bitcoin-dev] further test results for : "Datastream Compression of Blocks and Tx's" Peter Tschipper
2 siblings, 0 replies; 21+ messages in thread
From: Peter Tschipper @ 2015-11-10 17:09 UTC (permalink / raw)
To: bitcoin-dev
On 10/11/2015 8:46 AM, Jeff Garzik via bitcoin-dev wrote:
> Comments:
>
> 1) cblock seems a reasonable way to extend the protocol. Further
> wrapping should probably be done at the stream level.
agreed.
>
> 2) zlib has crappy security track record.
>
Zlib had a bad buffer overflow bug, but that was in 2005 and it got a lot
of press at the time. It was fixed in version 1.2.3; we're on 1.2.8
now. I'm not aware of any other current issues with zlib. Do you have a
citation?
> 3) A fallback path to non-compressed is required, should compression
> fail or crash.
agreed.
>
> 4) Most blocks and transactions have runs of zeroes and/or highly
> common bit-patterns, which contributes to useful compression even at
> smaller sizes. Peter Ts's most recent numbers bear this out. zlib
> has a dictionary (32K?) which works well with repeated patterns such
> as those you see with concatenated runs of transactions.
>
> 5) LZO should provide much better compression, at a cost of CPU
> performance and using a less-reviewed, less-field-tested library.
I don't think LZO will give as good compression here but I will do some
benchmarking when I can.
>
>
>
>
>
> On Tue, Nov 10, 2015 at 11:30 AM, Tier Nolan via bitcoin-dev
> <bitcoin-dev@lists.linuxfoundation.org
> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>
>
>
> On Tue, Nov 10, 2015 at 4:11 PM, Peter Tschipper
> <peter.tschipper@gmail.com <mailto:peter.tschipper@gmail.com>> wrote:
>
> There are better ways of sending new blocks, that's certainly
> true but for sending historical blocks and seding transactions
> I don't think so. This PR is really designed to save
> bandwidth and not intended to be a huge performance
> improvement in terms of time spent sending.
>
>
> If the main point is for historical data, then sticking to just
> blocks is the best plan.
>
> Since small blocks don't compress well, you could define a
> "cblocks" message that handles multiple blocks (just concatenate
> the block messages as payload before compression).
>
> The sending peer could combine blocks so that each cblock is
> compressing at least 10kB of block data (or whatever is optimal).
> It is probably worth specifying a maximum size for network buffer
> reasons (either 1MB or 1 block maximum).
>
> Similarly, transactions could be combined together and compressed
> "ctxs". The inv messages could be modified so that you can
> request groups of 10-20 transactions. That would depend on how
> much of an improvement compressed transactions would represent.
>
> More generally, you could define a message which is a compressed
> message holder. That is probably too complex to be worth the
> effort though.
>
>
>
>>
>> On Tue, Nov 10, 2015 at 5:40 AM, Johnathan Corgan via
>> bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>>
>> On Mon, Nov 9, 2015 at 5:58 PM, gladoscc via bitcoin-dev
>> <bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>>
>>
>> I think 25% bandwidth savings is certainly
>> considerable, especially for people running full
>> nodes in countries like Australia where internet
>> bandwidth is lower and there are data caps.
>>
>>
>> This reinforces the idea that such trade-off decisions
>> should be local and negotiated between peers, not a
>> required feature of the network P2P.
>>
>>
>> --
>> Johnathan Corgan
>> Corgan Labs - SDR Training and Development Services
>> http://corganlabs.com
>>
>> _______________________________________________
>> bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
[-- Attachment #2: Type: text/html, Size: 15022 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-10 16:46 ` Jeff Garzik
2015-11-10 17:09 ` Peter Tschipper
@ 2015-11-11 18:35 ` Peter Tschipper
2015-11-11 18:49 ` Marco Pontello
2015-11-28 14:48 ` [bitcoin-dev] further test results for : "Datastream Compression of Blocks and Tx's" Peter Tschipper
2 siblings, 1 reply; 21+ messages in thread
From: Peter Tschipper @ 2015-11-11 18:35 UTC (permalink / raw)
To: bitcoin-dev
[-- Attachment #1: Type: text/plain, Size: 6772 bytes --]
Here are the latest results on compression ratios for the first 295,000
blocks, compressionlevel=6. I think there are more than enough
datapoints for statistical significance.
Results are very similar to the previous test. I'll work on getting a
comparison of how much time is saved or lost when syncing the
blockchain, compressed vs uncompressed. Still, I think it's clear that
serving up compressed blocks, at least historical blocks, will benefit
those who have bandwidth caps on their internet connections.
The proposal so far is fairly simple:
1) Compress blocks with some compression library: currently zlib, but I
can investigate other possibilities.
2) As a fallback we need to advertise compression as a service. That
way we can turn off compression AND decompression completely if needed.
3) Do the compression at the datastream level in the code. CDataStream
is the obvious place.
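A minimal sketch of points 1 and 2, written in Python rather than the C++ of the PR so it can be run standalone. The one-byte RAW/ZLIB marker and the peer_supports_compression flag are hypothetical illustrations and are not the wire format or service bit used by PR #6973. The payload is sent uncompressed whenever the peer has not advertised support, compression fails, or compression does not shrink the data.

```python
import zlib

# Hypothetical one-byte markers, for illustration only.
RAW = b"\x00"
ZLIB = b"\x01"


def pack(payload: bytes, peer_supports_compression: bool, level: int = 6) -> bytes:
    """Compress payload only if the peer advertised support and it helps."""
    if peer_supports_compression:
        try:
            packed = zlib.compress(payload, level)
            if len(packed) < len(payload):
                return ZLIB + packed
        except zlib.error:
            pass  # fall through to the uncompressed path
    return RAW + payload


def unpack(message: bytes) -> bytes:
    """Inverse of pack(); raises ValueError on an unknown marker."""
    marker, body = message[:1], message[1:]
    if marker == ZLIB:
        return zlib.decompress(body)
    if marker == RAW:
        return body
    raise ValueError("unknown compression marker")


if __name__ == "__main__":
    block = b"\x00" * 1000 + b"abc" * 100  # stand-in for serialized block bytes
    wire = pack(block, peer_supports_compression=True)
    print(len(block), "->", len(wire), "round trip ok:", unpack(wire) == block)
```

Keeping the decision per message means a node can switch compression off entirely and still interoperate, which is the point of the fallback.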
Test Results:
range = block size range
ubytes = average size of uncompressed blocks
cbytes = average size of compressed blocks
ctime = average time to compress (seconds)
dtime = average time to decompress (seconds)
cmp_ratio% = compression ratio
datapoints = number of datapoints taken
range         ubytes   cbytes   ctime   dtime   cmp_ratio%   datapoints
0-250b           215      189   0.001   0.000        12.40        91280
250-500b         438      404   0.001   0.000         7.85        13217
500b-1KB         761      701   0.001   0.000         7.86        11434
1KB-10KB        4149     3547   0.001   0.000        14.51        52180
10KB-100KB     41934    32604   0.005   0.001        22.25        82890
100KB-200KB   146303   108080   0.016   0.001        26.13        29886
200KB-300KB   243299   179281   0.025   0.002        26.31        25066
300KB-400KB   344636   266177   0.036   0.003        22.77         4956
400KB-500KB   463201   356862   0.046   0.004        22.96         3167
500KB-600KB   545123   429854   0.056   0.005        21.15          366
600KB-700KB   647736   510931   0.065   0.006        21.12          254
700KB-800KB   746540   587287   0.073   0.008        21.33          294
800KB-900KB   868121   682650   0.087   0.008        21.36          199
900KB-1MB     945747   726307   0.091   0.010        23.20          304
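For readers checking the table, cmp_ratio% appears to be the percentage of bytes saved, 100 * (1 - cbytes / ubytes). That reading is an inference from the numbers, not something stated in the post. The short sketch below recomputes it for three rows. The smallest ranges will not match to the last digit, presumably because ubytes and cbytes are themselves rounded averages.

```python
def cmp_ratio(ubytes: float, cbytes: float) -> float:
    """Percentage of bytes saved: 100 * (1 - compressed / uncompressed)."""
    return 100.0 * (1.0 - cbytes / ubytes)


# (range, ubytes, cbytes) copied from three rows of the table above
rows = [
    ("10KB-100KB", 41934, 32604),
    ("200KB-300KB", 243299, 179281),
    ("900KB-1MB", 945747, 726307),
]

if __name__ == "__main__":
    for label, ubytes, cbytes in rows:
        print(f"{label:12s} {cmp_ratio(ubytes, cbytes):6.2f}%")
```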
On 10/11/2015 8:46 AM, Jeff Garzik via bitcoin-dev wrote:
> Comments:
>
> 1) cblock seems a reasonable way to extend the protocol. Further
> wrapping should probably be done at the stream level.
>
> 2) zlib has crappy security track record.
>
> 3) A fallback path to non-compressed is required, should compression
> fail or crash.
>
> 4) Most blocks and transactions have runs of zeroes and/or highly
> common bit-patterns, which contributes to useful compression even at
> smaller sizes. Peter Ts's most recent numbers bear this out. zlib
> has a dictionary (32K?) which works well with repeated patterns such
> as those you see with concatenated runs of transactions.
>
> 5) LZO should provide much better compression, at a cost of CPU
> performance and using a less-reviewed, less-field-tested library.
>
[-- Attachment #2: Type: text/html, Size: 16946 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-11 18:35 ` Peter Tschipper
@ 2015-11-11 18:49 ` Marco Pontello
2015-11-11 19:05 ` Jonathan Toomim
2015-11-11 19:11 ` [bitcoin-dev] request BIP number for: "Support for Datastream Compression" Peter Tschipper
0 siblings, 2 replies; 21+ messages in thread
From: Marco Pontello @ 2015-11-11 18:49 UTC (permalink / raw)
To: Peter Tschipper; +Cc: Bitcoin Dev
[-- Attachment #1: Type: text/plain, Size: 7383 bytes --]
A random thought: isn't most communication over a data link already
compressed at some point?
When I used a modem, we had the V.42bis protocol. Now, nearly all ADSL
connections using PPPoE surely are. And so on.
I'm not sure another level of generic, data-agnostic compression
will really give us a real-life practical advantage over that.
Something that could take advantage of special knowledge of the specific
data, instead, would be an entirely different matter.
Just my 2c.
--
Try the Online TrID File Identifier
http://mark0.net/onlinetrid.aspx
[-- Attachment #2: Type: text/html, Size: 17296 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-11 18:49 ` Marco Pontello
@ 2015-11-11 19:05 ` Jonathan Toomim
2015-11-13 21:58 ` [bitcoin-dev] Block Compression (Datastream Compression) test results using the PR#6973 compression prototype Peter Tschipper
2015-11-11 19:11 ` [bitcoin-dev] request BIP number for: "Support for Datastream Compression" Peter Tschipper
1 sibling, 1 reply; 21+ messages in thread
From: Jonathan Toomim @ 2015-11-11 19:05 UTC (permalink / raw)
To: Bitcoin Dev
[-- Attachment #1: Type: text/plain, Size: 870 bytes --]
Data compression adds latency and reduces predictability, so engineers have decided to leave compression to the application layer rather than the transport layer or lower, letting the application designer decide what tradeoffs to make.
On Nov 11, 2015, at 10:49 AM, Marco Pontello via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
> A random thought: isn't most communication over a data link already compressed at some point?
> When I used a modem, we had the V.42bis protocol. Now, nearly all ADSL connections using PPPoE surely are. And so on.
> I'm not sure another level of generic, data-agnostic compression will really give us a real-life practical advantage over that.
>
> Something that could take advantage of special knowledge of the specific data, instead, would be an entirely different matter.
>
> Just my 2c.
[-- Attachment #2: Message signed with OpenPGP using GPGMail --]
[-- Type: application/pgp-signature, Size: 496 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [bitcoin-dev] request BIP number for: "Support for Datastream Compression"
2015-11-11 18:49 ` Marco Pontello
2015-11-11 19:05 ` Jonathan Toomim
@ 2015-11-11 19:11 ` Peter Tschipper
1 sibling, 0 replies; 21+ messages in thread
From: Peter Tschipper @ 2015-11-11 19:11 UTC (permalink / raw)
To: Marco Pontello; +Cc: Bitcoin Dev
[-- Attachment #1: Type: text/plain, Size: 9113 bytes --]
If that were true then we wouldn't need to gzip large files before
sending them over the internet. Data compression generally helps
transmission speed as long as the amount of compression is high enough
and the time it takes is low enough to make it worthwhile. On a
corporate LAN it's generally not worthwhile unless you're dealing with
very large files, but over a corporate WAN or the internet, where network
latency can be high, it is IMO a worthwhile endeavor.
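The "high enough compression, low enough time" condition can be made concrete with a deliberately simple model, sketched below. It assumes a single transfer over a link of fixed throughput, no pipelining of compression with sending, and identical round-trip costs either way. Under those assumptions compressing first wins when (ubytes - cbytes) / bandwidth > ctime + dtime. The figures come from the 900KB-1MB row of the results posted earlier in this thread, with ctime and dtime assumed to be in seconds.

```python
def break_even_bandwidth(ubytes: float, cbytes: float,
                         ctime: float, dtime: float) -> float:
    """Link speed in bytes/s below which compressing first is faster overall."""
    return (ubytes - cbytes) / (ctime + dtime)


def time_uncompressed(ubytes: float, bandwidth: float) -> float:
    """Seconds to push the raw bytes through the link."""
    return ubytes / bandwidth


def time_compressed(cbytes: float, ctime: float, dtime: float,
                    bandwidth: float) -> float:
    """Seconds to compress, send the smaller payload, then decompress."""
    return ctime + cbytes / bandwidth + dtime


if __name__ == "__main__":
    # 900KB-1MB row: ubytes, cbytes, ctime, dtime (times assumed in seconds)
    ubytes, cbytes, ctime, dtime = 945747, 726307, 0.091, 0.010
    limit = break_even_bandwidth(ubytes, cbytes, ctime, dtime)
    print(f"break-even link speed: {limit * 8 / 1e6:.1f} Mbit/s")
    for bw in (1e6, 10e6):  # bytes per second
        print(f"{bw:>10.0f} B/s  raw {time_uncompressed(ubytes, bw):.3f}s"
              f"  compressed {time_compressed(cbytes, ctime, dtime, bw):.3f}s")
```

The model says nothing about bandwidth caps, where the byte savings matter regardless of speed.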
On 11/11/2015 10:49 AM, Marco Pontello wrote:
> A random thought: isn't most communication over a data link already
> compressed at some point?
> When I used a modem, we had the V.42bis protocol. Now, nearly all ADSL
> connections using PPPoE surely are. And so on.
> I'm not sure another level of generic, data-agnostic compression
> will really give us a real-life practical advantage over that.
>
> Something that could take advantage of special knowledge of the
> specific data, instead, would be an entirely different matter.
>
> Just my 2c.
[-- Attachment #2: Type: text/html, Size: 25638 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* [bitcoin-dev] Block Compression (Datastream Compression) test results using the PR#6973 compression prototype
2015-11-11 19:05 ` Jonathan Toomim
@ 2015-11-13 21:58 ` Peter Tschipper
2015-11-18 14:00 ` [bitcoin-dev] More findings: " Peter Tschipper
0 siblings, 1 reply; 21+ messages in thread
From: Peter Tschipper @ 2015-11-13 21:58 UTC (permalink / raw)
To: bitcoin-dev
[-- Attachment #1: Type: text/plain, Size: 1931 bytes --]
Some further block compression test results that compare performance
when network latency is added to the mix.
Running two nodes, Windows 7, compressionlevel=6, syncing the first
200000 blocks from one node to another. Running on a high-speed wireless
LAN with no connections to the outside world.
Network latency was added by using NetBalancer to induce the 30ms and
60ms latencies.
The data show not only bandwidth savings but also a small performance
gain. However, the overall value of compressing blocks appears to be in
saving bandwidth.
I was also surprised to see that there was no real difference in
performance when no latency was present; apparently the time it takes to
compress is about equal to the transmission time saved in such a situation.
The following results compare the tests in terms of how long it takes to
sync the blockchain, compressed vs uncompressed and with varying latencies.
uncmp = uncompressed
cmp = compressed
num blocks   uncmp    cmp   uncmp 30ms   cmp 30ms   uncmp 60ms   cmp 60ms
sync'd      (secs) (secs)       (secs)     (secs)       (secs)     (secs)
10000          264    269          265        257          274        275
20000          482    492          479        467          499        497
30000          703    717          693        676          724        724
40000          918    939          902        886          947        944
50000         1140   1157         1114       1094         1171       1167
60000         1362   1380         1329       1310         1400       1395
70000         1583   1597         1547       1526         1637       1627
80000         1810   1817         1767       1745         1872       1862
90000         2031   2036         1985       1958         2109       2098
100000        2257   2260         2223       2184         2385       2355
110000        2553   2486         2478       2422         2755       2696
120000        2800   2724         2849       2771         3345       3254
130000        3078   2994         3356       3257         4125       4006
140000        3442   3365         3979       3870         5032       4904
150000        3803   3729         4586       4464         5928       5797
160000        4148   4075         5168       5034         6801       6661
170000        4509   4479         5768       5619         7711       7557
180000        4947   4924         6389       6227         8653       8479
190000        5858   5855         7302       7107         9768       9566
200000        6980   6969         8469       8220        10944      10724
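To put the "small performance savings" in numbers, the sketch below takes the final (200000-block) row of the table and expresses the difference as a percentage of the uncompressed sync time. The helper name time_saved_pct is just an illustrative choice.

```python
def time_saved_pct(uncmp_secs: float, cmp_secs: float) -> float:
    """Sync time saved by compression, as a percentage of the uncompressed time."""
    return 100.0 * (uncmp_secs - cmp_secs) / uncmp_secs


# (added latency, uncompressed secs, compressed secs) from the 200000-block row
final_row = [
    ("none", 6980, 6969),
    ("30ms", 8469, 8220),
    ("60ms", 10944, 10724),
]

if __name__ == "__main__":
    for latency, uncmp, cmp_ in final_row:
        print(f"latency {latency:>4s}: {uncmp - cmp_:4d}s saved "
              f"({time_saved_pct(uncmp, cmp_):.2f}%)")
```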
[-- Attachment #2: Type: text/html, Size: 10768 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* [bitcoin-dev] More findings: Block Compression (Datastream Compression) test results using the PR#6973 compression prototype
2015-11-13 21:58 ` [bitcoin-dev] Block Compression (Datastream Compression) test results using the PR#6973 compression prototype Peter Tschipper
@ 2015-11-18 14:00 ` Peter Tschipper
0 siblings, 0 replies; 21+ messages in thread
From: Peter Tschipper @ 2015-11-18 14:00 UTC (permalink / raw)
To: bitcoin-dev
[-- Attachment #1: Type: text/plain, Size: 5198 bytes --]
Hi all,
I'm still doing a little more investigation before opening up a formal
BIP PR, but getting close. Here are some more findings.
After moving the compression from main.cpp to streams.h (CDataStream) it
was a simple matter to add compression to transactions as well. Results
as follows:
range = transaction size range
ubytes = average size of uncompressed transactions
cbytes = average size of compressed transactions
cmp_ratio% = compression ratio
datapoints = number of datapoints taken
range         ubytes   cbytes   cmp_ratio%   datapoints
0-250b           220      227        -3.16        23780
250-500b         356      354         0.68        20882
500-600b         534      505         5.29         2772
600-700b         653      608         6.95         1853
700-800b         757      649        14.22          578
800-900b         822      758         7.77          661
900b-1KB         954      862         9.69          906
1KB-10KB        2698     2222        17.64         3370
10KB-100KB     15463    12092         21.8        15429
A couple of obvious observations. Transactions don't compress well
below 500 bytes but do very well beyond 1KB, where there are a great
many of those large spam-type transactions. However, most transactions
happen to be in the < 500 byte range. So the next step was to apply
bundling, i.e. creating a "blob" for those smaller transactions, if
and only if there are multiple tx's in the getdata receive queue for a
peer. Doing that yields some very good compression ratios. Some
examples follow.
The best one I've seen so far was the following, where 175 transactions
were bundled into one blob before being compressed. That yielded a 20%
compression ratio, but that doesn't take into account the savings from
the unneeded 174 message headers (24 bytes each) as well as 174 TCP
ACKs of 52 bytes each, which yields an additional 76*174=13224 bytes,
making the overall bandwidth savings 32% in this particular case.
*2015-11-18 01:09:09.002061 compressed blob from 79890 to 67426 txcount:175*
To be sure, this was an extreme example. Most transaction blobs were in
the 2 to 10 transaction range, such as the following:
*2015-11-17 21:08:28.469313 compressed blob from 3199 to 2876 txcount:10*
But even here the savings are 10%, far better than the "nothing" we
would get without bundling. Add to that the 76 bytes * 9 transactions
of overhead saved and we have a total 20% savings in bandwidth for
transactions that otherwise would not be compressible.
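The overhead arithmetic for the two log lines can be reproduced with the sketch below. It uses the 24-byte message header and 52-byte TCP ACK figures given above and assumes one header and one ACK are avoided for every transaction after the first in a bundle. The resulting percentage depends on the denominator chosen, so the sketch reports savings relative to the raw payload only and leaves other conventions to the reader.

```python
MSG_HEADER = 24  # bytes per message header, as stated above
TCP_ACK = 52     # bytes per TCP ACK, as stated above


def bundle_savings(raw_bytes: int, compressed_bytes: int, txcount: int):
    """Return (payload_saved, overhead_saved) in bytes for one bundle."""
    payload_saved = raw_bytes - compressed_bytes
    overhead_saved = (txcount - 1) * (MSG_HEADER + TCP_ACK)
    return payload_saved, overhead_saved


if __name__ == "__main__":
    # (raw bytes, compressed bytes, txcount) from the two log lines above
    for raw, packed, txcount in [(79890, 67426, 175), (3199, 2876, 10)]:
        payload, overhead = bundle_savings(raw, packed, txcount)
        print(f"txcount {txcount:3d}: payload saved {payload} B, "
              f"overhead saved {overhead} B, "
              f"total vs raw payload {100.0 * (payload + overhead) / raw:.1f}%")
```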
The same bundling was applied to blocks, and very good compression ratios
are seen when sync'ing the blockchain.
Overall, the bundling or blobbing of tx's and blocks seems to be a good
idea for improving bandwidth use. There is also a scalability factor
here: when the system is busy, transactions are bundled more often,
compressed, and sent faster, keeping message queues and network chatter
to a minimum.
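A rough illustration of why bundling helps, assuming nothing about the PR's actual implementation: small messages compressed one at a time each pay the zlib framing cost and cannot share repeated byte patterns, while a concatenated blob can. The fake_tx helper below builds synthetic 70-byte stand-ins with a common prefix and suffix. It is not real Bitcoin transaction serialization, and a real blob would also need length framing so the receiver can split it again.

```python
import hashlib
import zlib


def fake_tx(i: int) -> bytes:
    """Synthetic 70-byte stand-in for a small transaction (illustrative only)."""
    digest = hashlib.sha256(str(i).encode()).digest()  # 32 pseudo-random bytes
    return (b"\x01\x00\x00\x00\x01" + digest + b"\xff\xff\xff\xff"
            + b"\x76\xa9\x14" + digest[:20] + b"\x88\xac" + b"\x00" * 4)


def individual_size(msgs, level: int = 6) -> int:
    """Total bytes when every message is compressed on its own."""
    return sum(len(zlib.compress(m, level)) for m in msgs)


def bundled_size(msgs, level: int = 6) -> int:
    """Bytes when all messages are concatenated into one blob, then compressed."""
    return len(zlib.compress(b"".join(msgs), level))


if __name__ == "__main__":
    txs = [fake_tx(i) for i in range(10)]
    raw = sum(len(t) for t in txs)
    print("raw:", raw, "individual:", individual_size(txs),
          "bundled:", bundled_size(txs))
```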
I think I have enough information to put together a formal BIP, with the
exception of which compression library to implement. These tests were
done using zlib, but I'll also be running tests in the coming days with
LZO (Jeff Garzik's suggestion) and perhaps Snappy. If there are any
other libraries that people would like me to get results for, please let
me know and I'll pick maybe the top 2 or 3 and get results back to the
group.
On 13/11/2015 1:58 PM, Peter Tschipper wrote:
> Some further Block Compression tests results that compare performance
> when network latency is added to the mix.
>
> Running two nodes, windows 7, compressionlevel=6, syncing the first
> 200000 blocks from one node to another. Running on a highspeed
> wireless LAN with no connections to the outside world.
> Network latency was added by using Netbalancer to induce the 30ms and
> 60ms latencies.
>
> From the data not only are bandwidth savings seen but also a small
> performance savings as well. However, the overall value in
> compressing blocks appears to be in terms of saving bandwidth.
>
> I was also surprised to see that there was no real difference in
> performance when no latency was present; apparently the time it takes
> to compress is about equal to the performance savings in such a situation.
>
>
> The following results compare the tests in terms of how long it takes
> to sync the blockchain, compressed vs uncompressed and with varying
> latencies.
> uncmp = uncompressed
> cmp = compressed
>
> num blocks  uncmp   cmp     uncmp 30ms  cmp 30ms  uncmp 60ms  cmp 60ms
> sync'd      (secs)  (secs)  (secs)      (secs)    (secs)      (secs)
> 10000       264     269     265         257       274         275
> 20000       482     492     479         467       499         497
> 30000       703     717     693         676       724         724
> 40000       918     939     902         886       947         944
> 50000       1140    1157    1114        1094      1171        1167
> 60000       1362    1380    1329        1310      1400        1395
> 70000       1583    1597    1547        1526      1637        1627
> 80000       1810    1817    1767        1745      1872        1862
> 90000       2031    2036    1985        1958      2109        2098
> 100000      2257    2260    2223        2184      2385        2355
> 110000      2553    2486    2478        2422      2755        2696
> 120000      2800    2724    2849        2771      3345        3254
> 130000      3078    2994    3356        3257      4125        4006
> 140000      3442    3365    3979        3870      5032        4904
> 150000      3803    3729    4586        4464      5928        5797
> 160000      4148    4075    5168        5034      6801        6661
> 170000      4509    4479    5768        5619      7711        7557
> 180000      4947    4924    6389        6227      8653        8479
> 190000      5858    5855    7302        7107      9768        9566
> 200000      6980    6969    8469        8220      10944       10724
>
>
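As a rough worked example of the "small performance savings" in the quoted
table, the sketch below takes the 200000-block row and expresses the time
saved by compression as a percentage of the uncompressed sync time. The
function name is illustrative only.

```python
# Sync times in seconds at 200000 blocks, copied from the last row of the
# quoted table: (uncompressed, compressed) for each latency setting.
final_row = {
    "no added latency": (6980, 6969),
    "30ms": (8469, 8220),
    "60ms": (10944, 10724),
}


def time_saved_pct(uncompressed_secs: float, compressed_secs: float) -> float:
    """Percentage of sync time saved, relative to the uncompressed run."""
    return 100.0 * (uncompressed_secs - compressed_secs) / uncompressed_secs


for label, (uncmp, cmp_) in final_row.items():
    print(f"{label}: {time_saved_pct(uncmp, cmp_):.2f}% less time with compression")
```

For this row that works out to about 0.16% with no added latency, 2.94% at
30ms and 2.01% at 60ms.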
* Re: [bitcoin-dev] further test results for : "Datastream Compression of Blocks and Tx's"
2015-11-10 16:46 ` Jeff Garzik
2015-11-10 17:09 ` Peter Tschipper
2015-11-11 18:35 ` Peter Tschipper
@ 2015-11-28 14:48 ` Peter Tschipper
2015-11-29 0:30 ` Jonathan Toomim
2 siblings, 1 reply; 21+ messages in thread
From: Peter Tschipper @ 2015-11-28 14:48 UTC (permalink / raw)
To: bitcoin-dev
Hi All,
Here are some final results of testing with the reference implementation
for compressing blocks and transactions. This implementation also
concatenates blocks and transactions when possible so you'll see data
sizes in the 1-2MB ranges.
Results below show the time it takes to sync the first part of the
blockchain, comparing Zlib to the LZOx library. (LZOf was also tried
but wasn't found to be as good as LZOx). The following shows tests run
with and without latency. With latency on the network, all compression
libraries performed much better than without compression.
I don't think it's entirely obvious which is better, Zlib or LZO.
Although I prefer the higher compression of Zlib, overall I would have
to give the edge to LZO. At its lowest compression setting, LZO is the
fastest and most scalable option, which will be a boost in performance
for users that want performance over compression. At the high end, LZO
provides decent compression which approaches Zlib (although at a higher
cost), which is good for those that want to save more bandwidth.
Uncompressed 60ms  Zlib-1 (60ms)  Zlib-6 (60ms)  LZOx-1 (60ms)  LZOx-999 (60ms)
219                299            296            294            291
432                568            565            558            548
652                835            836            819            811
866                1106           1107           1081           1071
1082               1372           1381           1341           1333
1309               1644           1654           1605           1600
1535               1917           1936           1873           1875
1762               2191           2210           2141           2141
1992               2463           2486           2411           2411
2257               2748           2780           2694           2697
2627               3034           3076           2970           2983
3226               3416           3397           3266           3302
4010               3983           3773           3625           3703
4914               4503           4292           4127           4287
5806               4928           4719           4529           4821
6674               5249           5164           4840           5314
7563               5603           5669           5289           6002
8477               6054           6268           5858           6638
9843               7085           7278           6868           7679
11338              8215           8433           8044           8795
These results are from testing on a highspeed wireless LAN (very small
latency). Results are in seconds.
Num blocks sync'd  Uncompressed  Zlib-1  Zlib-6  LZOx-1  LZOx-999
10000              255           232     233     231     257
20000              464           414     420     407     453
30000              677           594     611     585     650
40000              887           782     795     760     849
50000              1099          961     977     933     1048
60000              1310          1145    1167    1110    1259
70000              1512          1330    1362    1291    1470
80000              1714          1519    1552    1469    1679
90000              1917          1707    1747    1650    1882
100000             2122          1905    1950    1843    2111
110000             2333          2107    2151    2038    2329
120000             2560          2333    2376    2256    2580
130000             2835          2656    2679    2558    2921
140000             3274          3259    3161    3051    3466
150000             3662          3793    3547    3440    3919
160000             4040          4172    3937    3767    4416
170000             4425          4625    4379    4215    4958
180000             4860          5149    4895    4781    5560
190000             5855          6160    5898    5805    6557
200000             7004          7234    7051    6983    7770
The following shows the compression ratio achieved for various sizes of
data. Zlib is the clear winner for compressibility, with LZOx-999 coming
close but at a cost.
range        Zlib-1 cmp%  Zlib-6 cmp%  LZOx-1 cmp%  LZOx-999 cmp%
0-250b       12.44        12.86        10.79        14.34
250-500b     19.33        12.97        10.34        11.11
600-700      16.72        n/a          12.91        17.25
700-800      6.37         7.65         4.83         8.07
900-1KB      6.54         6.95         5.64         7.9
1KB-10KB     25.08        25.65        21.21        22.65
10KB-100KB   19.77        21.57        14.37        19.02
100KB-200KB  21.49        23.56        15.37        21.55
200KB-300KB  23.66        24.18        16.91        22.76
300KB-400KB  23.4         23.7         16.5         21.38
400KB-500KB  24.6         24.85        17.56        22.43
500KB-600KB  25.51        26.55        18.51        23.4
600KB-700KB  27.25        28.41        19.91        25.46
700KB-800KB  27.58        29.18        20.26        27.17
800KB-900KB  27           29.11        20           27.4
900KB-1MB    28.19        29.38        21.15        26.43
1MB-2MB      27.41        29.46        21.33        27.73
The following shows the time in seconds to compress data of various
sizes. LZO1x is the fastest, and as file sizes increase, LZO1x time
hardly increases at all. It's interesting to note that as compression
ratios increase, LZOx-999 performs much worse than Zlib. So LZO is
faster on the low end and slower (5 to 6 times slower) on the high end.
range        Zlib-1  Zlib-6  LZOx-1  LZOx-999  (seconds)
0-250b       0.001   0       0       0
250-500b     0       0       0       0.001
500-1KB      0       0       0       0.001
1KB-10KB     0.001   0.001   0       0.002
10KB-100KB   0.004   0.006   0.001   0.017
100KB-200KB  0.012   0.017   0.002   0.054
200KB-300KB  0.018   0.024   0.003   0.087
300KB-400KB  0.022   0.03    0.003   0.121
400KB-500KB  0.027   0.037   0.004   0.151
500KB-600KB  0.031   0.044   0.004   0.184
600KB-700KB  0.035   0.051   0.006   0.211
700KB-800KB  0.039   0.057   0.006   0.243
800KB-900KB  0.045   0.064   0.006   0.27
900KB-1MB    0.049   0.072   0.006   0.307
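For anyone wanting to reproduce this kind of measurement, here is a minimal
sketch using Python's standard zlib module at levels 1 and 6. It is not the
test harness used for the tables above: the sample data is synthetic, the
helper name is made up, and LZO is left out because it is not in the Python
standard library. Timings will vary by machine.

```python
import time
import zlib


def measure(data: bytes, level: int):
    """Return (size reduction in percent, seconds taken) for zlib at this level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    assert zlib.decompress(compressed) == data  # round-trip check
    reduction = 100.0 * (1 - len(compressed) / len(data))
    return reduction, elapsed


# Synthetic, fairly repetitive sample data; real block data will behave differently.
sample = b"".join(i.to_bytes(4, "little") + b"\x00" * 12 for i in range(20000))

for level in (1, 6):
    reduction, secs = measure(sample, level)
    print(f"zlib level {level}: {reduction:.2f}% smaller in {secs:.4f}s")
```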
On 10/11/2015 8:46 AM, Jeff Garzik via bitcoin-dev wrote:
> Comments:
>
> 1) cblock seems a reasonable way to extend the protocol. Further
> wrapping should probably be done at the stream level.
>
> 2) zlib has crappy security track record.
>
> 3) A fallback path to non-compressed is required, should compression
> fail or crash.
>
> 4) Most blocks and transactions have runs of zeroes and/or highly
> common bit-patterns, which contributes to useful compression even at
> smaller sizes. Peter Ts's most recent numbers bear this out. zlib
> has a dictionary (32K?) which works well with repeated patterns such
> as those you see with concatenated runs of transactions.
>
> 5) LZO should provide much better compression, at a cost of CPU
> performance and using a less-reviewed, less-field-tested library.
>
>
>
>
>
> On Tue, Nov 10, 2015 at 11:30 AM, Tier Nolan via bitcoin-dev
> <bitcoin-dev@lists.linuxfoundation.org
> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>
>
>
> On Tue, Nov 10, 2015 at 4:11 PM, Peter Tschipper
> <peter.tschipper@gmail.com <mailto:peter.tschipper@gmail.com>> wrote:
>
> There are better ways of sending new blocks, that's certainly
> true but for sending historical blocks and sending transactions
> I don't think so. This PR is really designed to save
> bandwidth and not intended to be a huge performance
> improvement in terms of time spent sending.
>
>
> If the main point is for historical data, then sticking to just
> blocks is the best plan.
>
> Since small blocks don't compress well, you could define a
> "cblocks" message that handles multiple blocks (just concatenate
> the block messages as payload before compression).
>
> The sending peer could combine blocks so that each cblock is
> compressing at least 10kB of block data (or whatever is optimal).
> It is probably worth specifying a maximum size for network buffer
> reasons (either 1MB or 1 block maximum).
>
> Similarly, transactions could be combined together and compressed
> "ctxs". The inv messages could be modified so that you can
> request groups of 10-20 transactions. That would depend on how
> much of an improvement compressed transactions would represent.
>
> More generally, you could define a message which is a compressed
> message holder. That is probably too complex to be worth the
> effort though.
>
>
>
>>
>> On Tue, Nov 10, 2015 at 5:40 AM, Johnathan Corgan via
>> bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>>
>> On Mon, Nov 9, 2015 at 5:58 PM, gladoscc via bitcoin-dev
>> <bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>>
>>
>> I think 25% bandwidth savings is certainly
>> considerable, especially for people running full
>> nodes in countries like Australia where internet
>> bandwidth is lower and there are data caps.
>>
>>
>> This reinforces the idea that such trade-off decisions
>> should be local and negotiated between peers, not a
>> required feature of the network P2P.
>>
>>
>> --
>> Johnathan Corgan
>> Corgan Labs - SDR Training and Development Services
>> http://corganlabs.com
>>
>> _______________________________________________
>> bitcoin-dev mailing list
>> bitcoin-dev@lists.linuxfoundation.org
>> <mailto:bitcoin-dev@lists.linuxfoundation.org>
>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev
* Re: [bitcoin-dev] further test results for : "Datastream Compression of Blocks and Tx's"
2015-11-28 14:48 ` [bitcoin-dev] further test results for : "Datastream Compression of Blocks and Tx's" Peter Tschipper
@ 2015-11-29 0:30 ` Jonathan Toomim
2015-11-29 5:15 ` Peter Tschipper
0 siblings, 1 reply; 21+ messages in thread
From: Jonathan Toomim @ 2015-11-29 0:30 UTC (permalink / raw)
To: Peter Tschipper; +Cc: bitcoin-dev
It appears you're using the term "compression ratio" to mean "size reduction". A compression ratio is the ratio (compressed / uncompressed). A 1 kB file compressed with a 10% compression ratio would be 0.1 kB. It seems you're using (1 - compressed/uncompressed), meaning that the compressed file would be 0.9 kB.
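The distinction can be written out as two small functions. This is only a
sketch of the two definitions as stated in this message; the function names
are illustrative.

```python
def compression_ratio(uncompressed: int, compressed: int) -> float:
    """compressed / uncompressed, the definition used in this message."""
    return compressed / uncompressed


def size_reduction(uncompressed: int, compressed: int) -> float:
    """1 - compressed / uncompressed, i.e. the fraction of bytes saved."""
    return 1 - compressed / uncompressed


# A 1000-byte file that shrinks to 900 bytes has a 90% compression ratio
# under this definition, but only a 10% size reduction.
print(round(compression_ratio(1000, 900), 2))
print(round(size_reduction(1000, 900), 2))
```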
On Nov 28, 2015, at 6:48 AM, Peter Tschipper via bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
> The following shows the compression ratio achieved for various sizes of data. Zlib is the clear
> winner for compressibility, with LZOx-999 coming close but at a cost.
>
> range     Zlib-1 cmp%  Zlib-6 cmp%  LZOx-1 cmp%  LZOx-999 cmp%
> 0-250b    12.44        12.86        10.79        14.34
> 250-500b  19.33        12.97        10.34        11.11
>
>
>
>
>
* Re: [bitcoin-dev] further test results for : "Datastream Compression of Blocks and Tx's"
2015-11-29 0:30 ` Jonathan Toomim
@ 2015-11-29 5:15 ` Peter Tschipper
0 siblings, 0 replies; 21+ messages in thread
From: Peter Tschipper @ 2015-11-29 5:15 UTC (permalink / raw)
To: Bitcoin Dev
yes, you're right, it's just the percentage compressed (size reduction)
On 28/11/2015 4:30 PM, Jonathan Toomim wrote:
> It appears you're using the term "compression ratio" to mean "size
> reduction". A compression ratio is the ratio (compressed /
> uncompressed). A 1 kB file compressed with a 10% compression ratio
> would be 0.1 kB. It seems you're using (1 - compressed/uncompressed),
> meaning that the compressed file would be 0.9 kB.
>
> On Nov 28, 2015, at 6:48 AM, Peter Tschipper via bitcoin-dev
> <bitcoin-dev@lists.linuxfoundation.org
> <mailto:bitcoin-dev@lists.linuxfoundation.org>> wrote:
>
>> The following shows the compression ratio achieved for various sizes
>> of data. Zlib is the clear
>> winner for compressibility, with LZOx-999 coming close but at a cost.
>>
>> range     Zlib-1 cmp%  Zlib-6 cmp%  LZOx-1 cmp%  LZOx-999 cmp%
>> 0-250b    12.44        12.86        10.79        14.34
>> 250-500b  19.33        12.97        10.34        11.11
>>
>>
>>
>>
>>
>>
>
Thread overview: 21+ messages
2015-11-09 19:18 [bitcoin-dev] request BIP number for: "Support for Datastream Compression" Peter Tschipper
2015-11-09 20:41 ` Johnathan Corgan
2015-11-09 21:04 ` Bob McElrath
2015-11-10 1:58 ` gladoscc
2015-11-10 5:40 ` Johnathan Corgan
2015-11-10 9:44 ` Tier Nolan
[not found] ` <5642172C.701@gmail.com>
2015-11-10 16:17 ` Peter Tschipper
2015-11-10 16:21 ` Jonathan Toomim
2015-11-10 16:30 ` Tier Nolan
2015-11-10 16:46 ` Jeff Garzik
2015-11-10 17:09 ` Peter Tschipper
2015-11-11 18:35 ` Peter Tschipper
2015-11-11 18:49 ` Marco Pontello
2015-11-11 19:05 ` Jonathan Toomim
2015-11-13 21:58 ` [bitcoin-dev] Block Compression (Datastream Compression) test results using the PR#6973 compression prototype Peter Tschipper
2015-11-18 14:00 ` [bitcoin-dev] More findings: " Peter Tschipper
2015-11-11 19:11 ` [bitcoin-dev] request BIP number for: "Support for Datastream Compression" Peter Tschipper
2015-11-28 14:48 ` [bitcoin-dev] further test results for : "Datastream Compression of Blocks and Tx's" Peter Tschipper
2015-11-29 0:30 ` Jonathan Toomim
2015-11-29 5:15 ` Peter Tschipper
[not found] ` <56421F1E.4050302@gmail.com>
2015-11-10 16:46 ` [bitcoin-dev] request BIP number for: "Support for Datastream Compression" Peter Tschipper