So, a Bitcoin-specific
compressor can perhaps do significantly better, but is
it a good idea? Let's argue both sides.
Cons:
Bitcoin-specific compressors will be closely tied to
the contents of messages, which might make it
difficult to change the wire format later on --
changes to the wire format may need corresponding
changes to the compressor. If the compressor cannot
be implemented cleanly, then the protocol-agnostic,
off-the-shelf compressors have a maintainability
edge, which comes at the expense of the compression
ratio.
Another argument is
that compression algorithms of any kind should be
tested thoroughly before inclusion, and brand new
code may lack the maturity required. While this
argument has some merit, all outputs are verified
separately later on during processing, so
compression/decompression errors can potentially be
detected. If the compressor/decompressor can be
structured in a way that isolates bitcoind from
failure (e.g. as a separate process for starters),
this concern can be remedied.
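To make the isolation argument concrete, here is a
minimal sketch of a decompression wrapper that treats
every failure as a recoverable event rather than a
crash. It assumes zlib as a stand-in for whatever
compressor is eventually chosen; the names pack and
unpack and the frame layout are hypothetical and not
part of the Bitcoin protocol. The 4-byte double-SHA256
checksum mirrors the one already carried in Bitcoin
P2P message headers.

```python
import hashlib
import zlib


def checksum4(payload: bytes) -> bytes:
    """First 4 bytes of double SHA-256 over the uncompressed payload."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def pack(payload: bytes) -> bytes:
    """Hypothetical frame: 4-byte checksum of the original + compressed body."""
    return checksum4(payload) + zlib.compress(payload)


def unpack(frame: bytes):
    """Return the original payload, or None if anything looks wrong.

    The caller can react to None by re-requesting the message
    uncompressed, so a compressor bug never reaches later processing.
    """
    if len(frame) < 4:
        return None
    expected, body = frame[:4], frame[4:]
    try:
        payload = zlib.decompress(body)
    except zlib.error:
        # Corrupt or truncated compressed stream.
        return None
    if checksum4(payload) != expected:
        # Decompressed cleanly, but to the wrong bytes.
        return None
    return payload
```

A caller would send pack(msg) and, on the receiving
side, fall back to the uncompressed path whenever
unpack returns None. Running the compressor in a
separate process, as suggested above, would add
protection against crashes that a try/except cannot
catch.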
Pros:
The nature of LZ
compressors leads me to believe that much higher
compression ratios are possible by building a
custom, Bitcoin-aware compressor. If I had to guess,
I would venture that compression ratios of 2X or
more are possible in some cases. In some sense, the
"O(1) block propagation" idea that Gavin proposed a
while ago can be seen as an extreme example of a
Bitcoin-specific compressor, albeit one that
constrains the order of transactions in a block.
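A rough way to see why domain knowledge helps an LZ
compressor is zlib's preset dictionary feature: if
both sides already agree on bytes that tend to recur
across messages, the compressor can refer back to them
even in a message that is too short to contain any
repetition of its own. The sketch below is only an
illustration under that assumption. The dictionary and
the message are synthetic, not real Bitcoin data, and
a true Bitcoin-aware compressor would go well beyond a
shared dictionary.

```python
import zlib


def compress_plain(data: bytes) -> bytes:
    """Generic compression with no shared knowledge."""
    return zlib.compress(data, 9)


def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    """Compression primed with a dictionary both peers already hold."""
    c = zlib.compressobj(level=9, zdict=zdict)
    return c.compress(data) + c.flush()


def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    """Inverse of compress_with_dict; needs the same dictionary."""
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(blob) + d.flush()


# Synthetic stand-in for bytes that recur across many messages.
SHARED = bytes(range(64))

# A short message: the recurring bytes appear once, followed by
# bytes unique to this message.
message = SHARED + bytes(range(200, 232))

plain = compress_plain(message)
primed = compress_with_dict(message, SHARED)

print(len(message), len(plain), len(primed))
```

On its own the message has nothing for LZ to match
against, so the generic path gains little, while the
primed path can replace the recurring bytes with a
single back-reference into the dictionary. The cost is
exactly the coupling discussed under Cons: the
dictionary has to track the wire format.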
Compression can buy
us some additional throughput at zero cost, modulo
code complexity.
Given the amount of
acrimonious debate over the block size we have all
had to endure, it seems criminal to leave
potentially free improvements on the table. Even if
the resulting code is deemed too complex
to include in the production client right now, it
would be good to understand the potential for
improvement.
How to Do It
If we want to
compress Bitcoin, a programming
challenge/contest would be one of the best ways to
find the best possible, Bitcoin-specific compressor.
This is the kind of self-contained exercise that
bright young hackers love to tackle. It would bring
new programmers into the ecosystem, and many of us would
love to discover the limits of compressibility for
Bitcoin bits on a wire. And the results would be
interesting even if the final compression engine is
not enabled by default, or not even merged.