From: Andreas Schildbach <andreas@schildbach.de>
To: bitcoin-development@lists.sourceforge.net
Subject: Re: [Bitcoin-development] BIP 38 NFC normalisation issue
Date: Thu, 17 Jul 2014 13:27:57 +0200
Message-ID: <lq8bvt$l2d$1@ger.gmane.org>
In-Reply-To: <CANEZrP2=e-JSRjuRgyeGNd2-fvXxEi5t4PAS3BrT-Y7SieywdQ@mail.gmail.com>

Here is a good article that helped me with what's going wrong:

http://www.oracle.com/technetwork/articles/javase/supplementary-142654.html

Basically, Java is stuck at 16 bits per char due to legacy reasons. They
admit that for a new language, they would probably use 32 (or 24?) bits
per char.

\u literals express UTF-16 code units, so each one takes exactly four
hex digits (16 bits). I learned that for codepoint 0x010400, I could
write "\uD801\uDC00", which is the UTF-16 surrogate pair for that
codepoint.
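
The arithmetic behind that pair can be sketched in a few lines of Python.
This is only an illustration of the standard UTF-16 surrogate rule; the
function name to_surrogate_pair is made up for this example and does not
come from bitcoinj or any other library.

```python
def to_surrogate_pair(code_point):
    """Split a supplementary code point (U+10000..U+10FFFF) into two UTF-16 code units."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # upper 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # lower 10 bits go into the low surrogate
    return high, low

high, low = to_surrogate_pair(0x10400)
print(f"\\u{high:04X}\\u{low:04X}")  # \uD801\uDC00
```

The same rule gives \uD83D\uDCA9 for U+1F4A9, the other supplementary
character in the test passphrase.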

Other languages have literals for codepoints, e.g. Python has
u"\U00010400" and HTML has &#x10400;. Unfortunately, Java is missing such
a construct (at least in Java 6).
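
Because a \u escape consumes exactly four hex digits, a longer escape such
as \u010400 does not fail; it silently turns into U+0104 followed by the
two literal characters "00". The Python sketch below is a rough emulation
of that rule, not the Java compiler's actual algorithm, and the helper
name java_style_unescape is hypothetical.

```python
import re

def java_style_unescape(literal):
    """Emulate four-hex-digit \\uXXXX escapes; any extra digits stay as plain text."""
    units = re.sub(r"\\u([0-9A-Fa-f]{4})",
                   lambda m: chr(int(m.group(1), 16)),
                   literal)
    # Round-trip through UTF-16 so that a surrogate pair becomes one code point.
    return units.encode("utf-16-be", "surrogatepass").decode("utf-16-be")

broken = java_style_unescape(r"\u010400")      # U+0104 followed by "00"
fixed = java_style_unescape(r"\uD801\uDC00")   # U+10400

print(broken.encode("utf-8").hex())  # c4843030
print(fixed.encode("utf-8").hex())   # f0909080
```

Only the second form yields the f0909080 bytes that appear in the expected
passphrase encoding.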


On 07/17/2014 12:59 PM, Mike Hearn wrote:
> Glad we got to the bottom of that. That's quite a nasty
> compiler/language bug I must say. Not even a warning. Still, Python
> crashes when trying to print the name of a null character. It wouldn't
> surprise me if there are other weird issues lurking. Would definitely
> sleep better with a more restricted character set.
> 
> On 17 Jul 2014 00:04, "Andreas Schildbach" <andreas@schildbach.de> wrote:
> 
>     Please excuse me. I had a more thorough look at the original problem and
>     found that the only problem with the original test case was that you
>     cannot specify codepoints from the SMP using \u in Java. I always tried
>     \u010400 but that doesn't work.
> 
>     Here is a fix for bitcoinj. The test now passes.
> 
>     https://github.com/bitcoinj/bitcoinj/pull/143
> 
>     We can (and probably should) still filter control chars; I'll
>     have another look at that now.
> 
> 
>     On 07/16/2014 11:06 PM, Aaron Voisine wrote:
>     > If I first remove \u0000, so the non-normalized passphrase is
>     > "\u03D2\u0301\U00010400\U0001F4A9", and then NFC normalize it, it
>     > becomes "\u03D3\U00010400\U0001F4A9"
>     >
>     > UTF-8 encoded this is: 0xcf93f0909080f09f92a9 (not the same as what
>     > you got, Andreas!)
>     >
>     > Encoding private key:
>     > 5Jajm8eQ22H3pGWLEVCXyvND8dQZhiQhoLJNKjYXk9roUFTMSZ4
>     > with this passphrase, I get a BIP38 key of:
>     > 6PRW5o9FMb4hAYRQPmgcvVDTyDtr6R17VMXGLmvKjKVpGkYhBJ4uYuR9wZ
>     >
>     > I recommend rather than simply removing control characters from the
>     > password that instead the spec require that passwords containing
>     > control characters are invalid. We don't want people trying to be
>     > clever and putting them in thinking they are adding to the password
>     > entropy.
>     >
>     > Also, for UI compatibility across many platforms, I'm in favor of
>     > disallowing any character below U+0020 (space).
>     >
>     > I can submit a PR once we figure out why Andreas's passphrase was
>     > different than what I got.
>     >
>     > Aaron Voisine
>     > breadwallet.com
>     >
>     >
>     > On Wed, Jul 16, 2014 at 4:04 AM, Andreas Schildbach
>     > <andreas@schildbach.de> wrote:
>     >> Damn, I just realized that I implement only the decoding side of
>     >> BIP38. So I cannot propose a complete test vector. Here is what I have:
>     >>
>     >>
>     >> Passphrase: ϓ␀𐐀💩 (\u03D2\u0301\u0000\U00010400\U0001F4A9; GREEK
>     >> UPSILON WITH HOOK, COMBINING ACUTE ACCENT, NULL, DESERET CAPITAL
>     >> LETTER LONG I, PILE OF POO)
>     >>
>     >> Passphrase bytes after removing ISO control characters and NFC
>     >> normalization: 0xcf933034303066346139
>     >>
>     >> Bitcoin Address: 16ktGzmfrurhbhi6JGqsMWf7TyqK9HNAeF
>     >>
>     >> Unencrypted private key (WIF):
>     >> 5Jajm8eQ22H3pGWLEVCXyvND8dQZhiQhoLJNKjYXk9roUFTMSZ4
>     >>
>     >>
>     >> Can someone calculate the encrypted key from it (using whatever
>     >> implementation) and I will verify it decodes properly in bitcoinj?
>     >>
>     >>
>     >>
>     >> On 07/16/2014 12:46 PM, Andreas Schildbach wrote:
>     >>> I will change the bitcoinj implementation and propose a new test
>     >>> vector.
>     >>>
>     >>>
>     >>>
>     >>> On 07/16/2014 11:29 AM, Mike Hearn wrote:
>     >>>> Yes sorry, you're right, the issue starts with the null code point.
>     >>>> Python seems to have problems starting there too. It might work if we
>     >>>> took that out.
>     >>>>
>     >>>>
>     >>>> On Wed, Jul 16, 2014 at 11:17 AM, Andreas Schildbach
>     >>>> <andreas@schildbach.de> wrote:
>     >>>>
>     >>>>     Guys, you are always talking about the Unicode astral plane, but in
>     >>>>     fact it's a plain old (ASCII) control character where this problem
>     >>>>     starts and likely ends: \u0000.
>     >>>>
>     >>>>     Let's ban/filter ISO control characters and be done with it. Most
>     >>>>     control characters will never be enterable by any keyboard into a
>     >>>>     password field. Of course I assume that Character.isISOControl()
>     >>>>     works consistently across platforms.
>     >>>>
>     >>>>     http://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#isISOControl%28char%29
>     >>>>
>     >>>>
>     >>>>     On 07/16/2014 12:23 AM, Aaron Voisine wrote:
>     >>>>     > If the user creates a password on an iOS device with an astral
>     >>>>     > character and then can't enter that password on a JVM wallet, that
>     >>>>     > sucks. If JVMs really can't support Unicode NFC then that's a strong
>     >>>>     > case to limit the spec to the subset of Unicode that all popular
>     >>>>     > platforms can support, but it sounds like it might just be a JVM
>     >>>>     > string library bug that could hopefully be reported and fixed. I get
>     >>>>     > the same result as in the test case using Apple's
>     >>>>     > CFStringNormalize(passphrase, kCFStringNormalizationFormC);
>     >>>>     >
>     >>>>     > Aaron Voisine
>     >>>>     > breadwallet.com
>     >>>>     >
>     >>>>     >
>     >>>>     > On Tue, Jul 15, 2014 at 11:20 AM, Mike Hearn <mike@plan99.net> wrote:
>     >>>>     >> Yes, we know, Andreas' code is indeed doing normalisation.
>     >>>>     >>
>     >>>>     >> However it appears the output bytes end up being different. What I
>     >>>>     >> get back is:
>     >>>>     >>
>     >>>>     >> cf930001303430300166346139
>     >>>>     >>
>     >>>>     >> vs
>     >>>>     >>
>     >>>>     >> cf9300f0909080f09f92a9
>     >>>>     >>
>     >>>>     >> from the spec.
>     >>>>     >>
>     >>>>     >> I'm not sure why. It appears this is due to the character from the
>     >>>>     >> astral planes. Java is old and uses 16 bit characters internally - it
>     >>>>     >> wouldn't surprise me if there's some weirdness that means it
>     >>>>     >> doesn't/won't support this kind of thing.
>     >>>>     >>
>     >>>>     >> I recommend instead that any implementation that wishes to be
>     >>>>     >> compatible with JVM based wallets (I suspect Android is the same) just
>     >>>>     >> refuse any passphrase that includes characters outside the BMP. At
>     >>>>     >> least unless someone can find a fix. I somehow doubt this will really
>     >>>>     >> hurt anyone.
>     >>>>     >>
>     >>>>     >>
>     >>>>    
>     ------------------------------------------------------------------------------
>     >>>>     >> Want fast and easy access to all the code in your
>     enterprise?
>     >>>>     Index and
>     >>>>     >> search up to 200,000 lines of code with a free copy of
>     Black Duck
>     >>>>     >> Code Sight - the same software that powers the world's
>     largest code
>     >>>>     >> search on Ohloh, the Black Duck Open Hub! Try it now.
>     >>>>     >> http://p.sf.net/sfu/bds
>     >>>>     >> _______________________________________________
>     >>>>     >> Bitcoin-development mailing list
>     >>>>     >> Bitcoin-development@lists.sourceforge.net
>     <mailto:Bitcoin-development@lists.sourceforge.net>
>     >>>>     <mailto:Bitcoin-development@lists.sourceforge.net
>     <mailto:Bitcoin-development@lists.sourceforge.net>>
>     >>>>     >>
>     https://lists.sourceforge.net/lists/listinfo/bitcoin-development
> 
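
As a cross-check of the byte strings quoted above, here is a minimal Python
sketch that NFC-normalizes a passphrase and UTF-8 encodes it. Rejecting
control characters (instead of stripping them) follows Aaron's proposal in
the thread and is not taken from the BIP 38 text; the function name
prepare_passphrase is an assumption made for this example.

```python
import unicodedata

def prepare_passphrase(passphrase):
    """Reject control characters, then NFC-normalize and encode as UTF-8."""
    for ch in passphrase:
        # General category "Cc" covers U+0000-U+001F and U+007F-U+009F,
        # the same ranges Java documents for Character.isISOControl().
        if unicodedata.category(ch) == "Cc":
            raise ValueError("control character U+%04X not allowed" % ord(ch))
    return unicodedata.normalize("NFC", passphrase).encode("utf-8")

# Aaron's passphrase: the test vector with \u0000 already taken out.
raw = "\u03D2\u0301\U00010400\U0001F4A9"
print(prepare_passphrase(raw).hex())  # cf93f0909080f09f92a9
```

The output matches the bytes Aaron reports. Passing the original test
passphrase, which still contains \u0000, raises ValueError under this
sketch.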






