Date: Wed, 28 Nov 2012 03:33:06 -0500
From: Peter Todd
To: Gavin Andresen
Message-ID: <20121128083306.GA13919@savin>
Cc: Bitcoin Dev
Subject: Re: [Bitcoin-development] Payment Protocol Proposal: Invoices/Payments/Receipts

On Mon, Nov 26, 2012 at 05:37:31PM -0500, Gavin Andresen wrote:
> Why not JSON?
> -------------
>
> Invoice, Payment and Receipt messages could all be JSON-encoded. And
> the Javascript Object Signing and Encryption (JOSE) working group at
> the IETF has a draft specification for signing JSON data.
>
> But the spec is non-trivial. Signing JSON data is troublesome because
> JSON can encode the same data in multiple ways (whitespace is
> insignificant, characters in strings can be represented escaped or
> un-escaped, etc.), and the standards committee identified at least one
> security-related issue that will require special JSON parsers for
> handling JSON-Web-Signed (JWS) data (duplicate keys must be rejected
> by the parser, which is more strict than the JSON spec requires).
>
> A binary message format has none of those complicating issues.
> Which encoding format to pick is largely a matter of taste, but Protocol
> Buffers is a simple, robust, multi-programming-language,
> well-documented, easy-to-work-with, extensible format.

I'm not sure this is actually as much of an advantage as you'd expect. I
looked into Google Protocol Buffers a while back for a timestamping
project, and unfortunately there are many ways in which the actual binary
encoding of a message can differ even if the meaning of the message is
the same, just like JSON.

First of all, while fields *should* be written in sequential order,
parsers are also required to accept the fields in any order. There is
also a repeated fields feature where the fields can either be serialized
as one packed key-list pair, or as multiple key-value(s) pairs; in the
latter case the payloads are concatenated.

The general case of how to handle a duplicated field that isn't supposed
to be repeated seems to be undefined in the standard. Yet at the same
time the standard mentions creating messages by concatenating two
messages together. Presumably parsers treat a duplicated non-repeated
field as an error, but I wouldn't be surprised if that isn't always true.

Implementations differ as well. The current Java and C++ implementations
write unknown fields in arbitrary order after the sequentially-ordered
known fields, while on the other hand the Python implementation simply
drops unknown fields entirely. As far as I know no implementation
preserves order for unknown fields.

Finally, while not a Protocol Buffers specific problem, UTF-8 encoded
text isn't guaranteed to survive a UTF-8 -> UTF-x -> UTF-8 round trip.
Multiple code point sequences can be semantically identical, so you can
expect some software to convert one to the other. Similarly, lots of
languages internally store Unicode strings by converting to something
like UTF-16.
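
To make the code point concern concrete, here is a small sketch using only
Python's standard unicodedata module. The sample character and the variable
names are my own choices for illustration, not taken from any real message.
It shows two spellings of the same visible text that encode to different
UTF-8 bytes, and that normalizing both to a single form (NFKD here) makes
the bytes agree again.

```python
import unicodedata

# Two spellings of the same visible text, "e with acute accent".
composed = "\u00e9"      # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# The UTF-8 bytes differ, so a hash or signature over them differs too.
composed_bytes = composed.encode("utf-8")
decomposed_bytes = decomposed.encode("utf-8")
bytes_differ = composed_bytes != decomposed_bytes

# Software that silently converts one spelling into the other (simulated
# here with NFC) changes the bytes without changing the visible text.
converted = unicodedata.normalize("NFC", decomposed)
conversion_changed_bytes = converted.encode("utf-8") != decomposed_bytes

# Normalizing both spellings to one agreed form removes the ambiguity,
# and applying the normalization a second time changes nothing.
nfkd_composed = unicodedata.normalize("NFKD", composed)
nfkd_decomposed = unicodedata.normalize("NFKD", decomposed)
normalized_agree = nfkd_composed.encode("utf-8") == nfkd_decomposed.encode("utf-8")
is_idempotent = unicodedata.normalize("NFKD", nfkd_composed) == nfkd_composed
```

Both sides of an exchange would have to agree on the same form (and, in
principle, the same Unicode version) for this to help.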
One solution is to use one of the normalization forms such as NFKD - an
idempotent transformation - although I wouldn't be surprised if
normalization itself is complex enough that implementation bugs exist,
not to mention the fact that the normalization forms have gone through
several versions.

I think the best way(1) to handle (most of) the above is to simply treat
the binary message as immutable and never re-serialize a deserialized
message, but if you're willing to do that, just using JSON isn't
unreasonable either.

1) Of course I went off and created Yet Another Binary Serialization for
my project, but I'm young and foolish...

-- 
'peter'[:-1]@petertodd.org
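
As a rough illustration of the field-ordering point and of the "never
re-serialize" approach, here is a minimal sketch in Python. It does not use
the real Protocol Buffers library: decode_fields is a toy decoder I wrote for
this example that only understands varint and length-delimited fields, the
field numbers and payload are made up, and SHA-256 merely stands in for
whatever hash or signature scheme would actually be used. ReceivedMessage is
likewise a hypothetical name.

```python
import hashlib

def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7

def decode_fields(buf):
    """Toy decoder for wire types 0 (varint) and 2 (length-delimited) only.

    Returns {field_number: [values in order of appearance]}.
    """
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            value = bytes(buf[pos:pos + length])
            pos += length
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
        fields.setdefault(field_number, []).append(value)
    return fields

# Made-up message: field 1 = 150 (varint), field 2 = b"hi" (length-delimited).
in_order = bytes([0x08, 0x96, 0x01, 0x12, 0x02, 0x68, 0x69])
reordered = bytes([0x12, 0x02, 0x68, 0x69, 0x08, 0x96, 0x01])

# Same decoded meaning, different bytes, therefore different digests.
same_meaning = decode_fields(in_order) == decode_fields(reordered)
same_digest = hashlib.sha256(in_order).digest() == hashlib.sha256(reordered).digest()

class ReceivedMessage:
    """Keep the exact bytes received; hash those, never a re-encoding."""

    def __init__(self, raw):
        self.raw = bytes(raw)                  # immutable copy of the wire bytes
        self.fields = decode_fields(self.raw)  # parsed view, for reading only

    def digest(self):
        return hashlib.sha256(self.raw).hexdigest()
```

The parsed view is only ever read; anything that needs to be hashed, signed,
or forwarded uses the stored raw bytes, so differences between encoders never
come into play.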