public inbox for bitcoindev@googlegroups.com
* [Bitcoin-development] var_int ambiguous serialization consequences
@ 2015-02-01  9:33 Tamas Blummer
  2015-02-01 10:44 ` Wladimir
  2015-02-01 15:00 ` Pieter Wuille
  0 siblings, 2 replies; 4+ messages in thread
From: Tamas Blummer @ 2015-02-01  9:33 UTC (permalink / raw)
  To: Bitcoin Dev



I wonder about the consequences if var_int is used in its longer-than-necessary forms (e.g. encoding 1 as 0xfd0100 instead of 0x01).

This already matters when applying a size limit to a block, since the transaction count is a var_int but is not part of the hashed header or the merkle tree.

It could also be used to create variants of the same transaction message by altering the representation of the txIn and txOut counts. These variants would remain valid provided the signatures validate against the shortest form, since that is the form created when re-serializing for signature hashing. An implementation that indexes its mempool by raw message hashes could be tricked into believing that a differently encoded version of the same transaction is a real double spend. One could also mine a valid block containing transactions whose hash changes when they are parsed and re-serialized in the regular way. An SPV client could be confused by such a transaction, as it would be present in the merkle tree proof with a different hash than the one the client gets for the tx from its own serialization or from the raw message.

Tamas Blummer
Bits of Proof



* Re: [Bitcoin-development] var_int ambiguous serialization consequences
  2015-02-01  9:33 [Bitcoin-development] var_int ambiguous serialization consequences Tamas Blummer
@ 2015-02-01 10:44 ` Wladimir
  2015-02-01 11:42   ` Tamas Blummer
  2015-02-01 15:00 ` Pieter Wuille
  1 sibling, 1 reply; 4+ messages in thread
From: Wladimir @ 2015-02-01 10:44 UTC (permalink / raw)
  To: Tamas Blummer; +Cc: Bitcoin Dev


On Sun, 1 Feb 2015, Tamas Blummer wrote:

> I wonder about the consequences if var_int is used in its longer-than-necessary forms (e.g. encoding 1 as 0xfd0100 instead of 0x01).

In serialize.h lingo you are talking about CompactSize, not VarInt.

CompactSizes indeed have redundancy in their representation, i.e. the same 
number can be represented as up to four different byte sequences.
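
To make the redundancy concrete, here is a minimal Python sketch that
lists every byte sequence a lenient parser would decode as a given
number. The function name is hypothetical and not taken from serialize.h;
the prefix bytes 0xfd / 0xfe / 0xff and the little-endian layout follow
the wiki's description of var_int.

```python
import struct
from typing import List


def compact_size_encodings(n: int) -> List[bytes]:
    """Return all CompactSize byte sequences that decode to n.

    The first entry is always the shortest (canonical) form.
    """
    if not 0 <= n <= 0xFFFFFFFFFFFFFFFF:
        raise ValueError("value does not fit in 64 bits")
    encodings = []
    if n < 0xFD:
        encodings.append(bytes([n]))                       # 1 byte
    if n <= 0xFFFF:
        encodings.append(b"\xfd" + struct.pack("<H", n))   # 3 bytes
    if n <= 0xFFFFFFFF:
        encodings.append(b"\xfe" + struct.pack("<I", n))   # 5 bytes
    encodings.append(b"\xff" + struct.pack("<Q", n))       # 9 bytes
    return encodings


if __name__ == "__main__":
    for encoding in compact_size_encodings(1):
        print(encoding.hex())
```

Small values such as 1 have all four forms; a value of 253 or more
cannot use the single-byte form, so it has at most three.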

VARINTs have a different format that (AFAIK) isn't used anywhere in 
the block chain. See WriteVarInt / ReadVarInt. These were designed to 
not have any redundancy in their representation.
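
For comparison, the following is a rough Python sketch of the base-128
scheme as I understand WriteVarInt / ReadVarInt; the Python names and
error handling are my own assumptions, not the serialize.h code. The
point to notice is the "minus one" applied for every continuation byte,
which is what leaves no second encoding for any number.

```python
def write_varint(n: int) -> bytes:
    """Encode a non-negative integer, most significant group first."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    groups = []
    while True:
        # Every group except the last one emitted carries the 0x80 flag.
        groups.append((n & 0x7F) | (0x80 if groups else 0x00))
        if n <= 0x7F:
            break
        n = (n >> 7) - 1  # the offset that removes redundant encodings
    return bytes(reversed(groups))


def read_varint(data: bytes) -> int:
    """Decode one VarInt from the start of data."""
    n = 0
    for byte in data:
        n = (n << 7) | (byte & 0x7F)
        if byte & 0x80:
            n += 1  # undo the offset applied by the encoder
        else:
            return n
    raise ValueError("truncated VarInt")


if __name__ == "__main__":
    for value in (0, 127, 128, 255, 256, 16384, 65535):
        print(value, write_varint(value).hex())
```

With this scheme the two-byte sequence 0x80 0x00 decodes to 128 rather
than to a padded 0, so there is no "longer than necessary" form to
reject.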

> This already matters when applying a size limit to a block, since the transaction count is a var_int but is not part of the hashed header or the
> merkle tree.

Are you sure that this is a current concern? Non-canonical CompactSizes 
are forbidden - in serialize.h this is flagged in ReadCompactSize.
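
A strict reader of this kind can be sketched in Python as follows. This
is an illustration of the idea, not the ReadCompactSize source: the
function name, return shape, and error messages are assumptions, and any
other checks the real code performs are left out.

```python
import struct

# prefix byte -> (payload width, struct format, smallest value allowed)
_WIDE_FORMS = {
    0xFD: (2, "<H", 0xFD),
    0xFE: (4, "<I", 0x10000),
    0xFF: (8, "<Q", 0x100000000),
}


def read_compact_size(data: bytes, offset: int = 0):
    """Return (value, next_offset); reject non-shortest encodings."""
    if offset >= len(data):
        raise ValueError("truncated CompactSize")
    first = data[offset]
    offset += 1
    if first < 0xFD:
        return first, offset
    width, fmt, minimum = _WIDE_FORMS[first]
    chunk = data[offset:offset + width]
    if len(chunk) != width:
        raise ValueError("truncated CompactSize")
    (value,) = struct.unpack(fmt, chunk)
    if value < minimum:
        # The value would have fit in a shorter form.
        raise ValueError("non-canonical CompactSize")
    return value, offset + width


if __name__ == "__main__":
    print(read_compact_size(b"\x01"))
    try:
        read_compact_size(b"\xfd\x01\x00")
    except ValueError as err:
        print("rejected:", err)
```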

Wladimir





* Re: [Bitcoin-development] var_int ambiguous serialization consequences
  2015-02-01 10:44 ` Wladimir
@ 2015-02-01 11:42   ` Tamas Blummer
  0 siblings, 0 replies; 4+ messages in thread
From: Tamas Blummer @ 2015-02-01 11:42 UTC (permalink / raw)
  To: Wladimir; +Cc: Bitcoin Dev



Thanks for the clarification. Yes, I referred to CompactSize using the lingo of https://en.bitcoin.it/wiki/Protocol_documentation

I am not sure whether it is a current concern. Apparently an exception is thrown if a non-canonical CompactSize in a transaction is parsed.
Is it ensured that transactions are always parsed before computing their hash?

Tamas Blummer




* Re: [Bitcoin-development] var_int ambiguous serialization consequences
  2015-02-01  9:33 [Bitcoin-development] var_int ambiguous serialization consequences Tamas Blummer
  2015-02-01 10:44 ` Wladimir
@ 2015-02-01 15:00 ` Pieter Wuille
  1 sibling, 0 replies; 4+ messages in thread
From: Pieter Wuille @ 2015-02-01 15:00 UTC (permalink / raw)
  To: Tamas Blummer; +Cc: Bitcoin Dev


Hashes are always computed by reserializing data structures, never by
hashing wire data directly. This has been the case in every version of the
reference client's code that I know of.
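
The effect can be shown with a toy message format in Python: a
CompactSize count followed by that many one-byte items. Everything here
(the toy format, the function names, the lenient reader) is a
hypothetical stand-in for real transaction parsing; only the
double-SHA256 and the CompactSize layout are meant to match Bitcoin.

```python
import hashlib
import struct


def sha256d(payload: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()


def write_compact_size(n: int) -> bytes:
    """Always emit the shortest form."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)
    return b"\xff" + struct.pack("<Q", n)


def read_compact_size_lenient(data: bytes, offset: int = 0):
    """Accept any form, including non-shortest ones."""
    first = data[offset]
    if first < 0xFD:
        return first, offset + 1
    width, fmt = {0xFD: (2, "<H"), 0xFE: (4, "<I"), 0xFF: (8, "<Q")}[first]
    (value,) = struct.unpack_from(fmt, data, offset + 1)
    return value, offset + 1 + width


def parse_toy(data: bytes) -> list:
    count, pos = read_compact_size_lenient(data)
    items = list(data[pos:pos + count])
    if len(items) != count:
        raise ValueError("truncated toy message")
    return items


def serialize_toy(items: list) -> bytes:
    return write_compact_size(len(items)) + bytes(items)


def hash_by_reserializing(wire: bytes) -> bytes:
    return sha256d(serialize_toy(parse_toy(wire)))


if __name__ == "__main__":
    shortest = b"\x01\xaa"        # count 1 encoded as 0x01
    padded = b"\xfd\x01\x00\xaa"  # count 1 encoded as 0xfd0100
    print(sha256d(shortest) == sha256d(padded))
    print(hash_by_reserializing(shortest) == hash_by_reserializing(padded))
```

Hashing the raw bytes gives two different identifiers for the same
logical message, while hashing the re-serialized structure gives one.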

This even meant that, for example, a block of 999999 bytes with a non-shortest
length for the transaction count could be over the maximum block size, but
still be valid.
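
The arithmetic behind this can be sketched in Python. It rests on my
reading of the example: 999999 is taken to be the size of the block when
re-serialized in shortest form, the transaction count is assumed to be
below 253 (so its shortest form is one byte), and the size limit is the
1,000,000-byte limit in force at the time. The function name is
hypothetical.

```python
MAX_BLOCK_SIZE = 1_000_000  # block size limit at the time of this thread

# Total length in bytes of each CompactSize form (prefix + payload).
COMPACT_SIZE_FORM_LENGTHS = (1, 3, 5, 9)


def wire_size(reserialized_size: int, count_form_length: int) -> int:
    """Size on the wire when the one-byte count is swapped for a longer form."""
    return reserialized_size - 1 + count_form_length


if __name__ == "__main__":
    reserialized_size = 999_999
    for form_length in COMPACT_SIZE_FORM_LENGTHS:
        size = wire_size(reserialized_size, form_length)
        print(form_length, size, size > MAX_BLOCK_SIZE)
```

Under these assumptions every non-shortest form pushes the wire size past
the limit, while the re-serialized size stays at 999999.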

As Wladimir says, more recently we switched to just failing to deserialize
(by throwing an exception) whenever a non-shortest form is used.
