[bitcoin-dev] BIP- & SLIP-0039 -- better multi-language support

* [bitcoin-dev]  BIP- & SLIP-0039 -- better multi-language support
       [not found] <mailman.2299.1542895684.19477.bitcoin-dev@lists.linuxfoundation.org>
@ 2018-11-22 17:25 ` Weiji Guo
  0 siblings, 0 replies; 9+ messages in thread
From: Weiji Guo @ 2018-11-22 17:25 UTC (permalink / raw)
  To: bitcoin-dev

Hi Everyone,

Thank you very much, on this Thanksgiving Day, for the detailed and
well-thought-out responses. :)

Steven Hatzakis via bitcoin-dev <bitcoin-dev at lists.linuxfoundation.org>
wrote:

> *Option 2*: Perhaps a revision is needed to how the BIP39 seed is
> generated in the first place, such as by hashing the entropy instead of the
> words. Any thoughts on how viable that could be where the initial entropy
> is fed into the PBKDF2 function and not the words?

If we go in this direction, I'd suggest that we bring Shamir's Secret
Sharing (SSS) into the picture. Trezor's SLIP-0039 proposal is great and
already covers many security aspects. However, it does not allow any
language other than English, and the Trezor team has clearly stated that no
other language will be supported.

I, however, really want to keep the design language independent. So in the
revision, I'd like to see a language id (one allocated to each language that
has a defined wordlist) in the SSS share, alongside the share id, threshold,
index, share value, checksum, etc.

Regarding the checksum scheme, SLIP-0039 proposes a 3-word Reed-Solomon
design. It has very good error-detection capability but is not very good at
providing hints for error recovery. The Trezor team opposes the idea of
giving users hints on how to fix an error. This could lead to difficulties
for some vendors and, with small probability, to confusion for users (when
there is a 2-word error).

I do agree with the Trezor team that it should be the user's responsibility
to recover from a detected error. However, there is a better way than
relying solely on the checksum. Since our revision can support mnemonics in
multiple languages simultaneously, why don't we use two languages, or one
language plus numbers, to check each other? In Steven's example (language
id, share id, etc. skipped) we could record an SSS share (assuming, just for
the sake of example, that it is one of the shares) like:

> *In English*: minimum fee sure ticket faculty banana gate purse caught
> valley globe shift
> *In Spanish*: mercado faja soledad tarea evadir aries gafas peine búho
> tumor gerente reja

Or

> *In English*: minimum fee sure ticket faculty banana gate purse caught
> valley globe shift
>
> Word Indexes: 1128, 676, 1744, 1805, 653, 145, 770, 1396, 291, 1927,
> 794, 1582

Then the software will have to verify the checksum and also check that the
two encodings match each other. For example, the index of "minimum" in the
English wordlist should equal the index of "mercado" in the Spanish
wordlist, or should equal 1128.

If any error is detected, combining the checksum value with the
dual-encoding information makes it much easier to figure out which word was
handprinted incorrectly.
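
A minimal sketch of this cross-check, assuming a share is recorded both as
English words and as decimal indexes. The lookup table holds only the twelve
words from Steven's example (a real implementation would load the full
2048-word list), and the function name find_mismatches is hypothetical.

```python
# Indexes of the twelve example words, taken from the example above.
# A real implementation would load the full 2048-word BIP-0039 list instead.
ENGLISH_INDEX = {
    "minimum": 1128, "fee": 676, "sure": 1744, "ticket": 1805,
    "faculty": 653, "banana": 145, "gate": 770, "purse": 1396,
    "caught": 291, "valley": 1927, "globe": 794, "shift": 1582,
}


def find_mismatches(words, numbers, index_of=ENGLISH_INDEX):
    """Return the 0-based positions where the two encodings disagree.

    A word that is missing from the wordlist (e.g. a misspelling) counts
    as a mismatch, because it cannot map to the recorded number.
    """
    if len(words) != len(numbers):
        raise ValueError("both encodings must have the same length")
    return [
        pos
        for pos, (word, number) in enumerate(zip(words, numbers))
        if index_of.get(word) != number
    ]


words = ("minimum fee sure ticket faculty banana "
         "gate purse caught valley globe shift").split()
numbers = [1128, 676, 1744, 1805, 653, 145, 770, 1396, 291, 1927, 794, 1582]

print(find_mismatches(words, numbers))    # [] : the encodings agree

# A transcription error in the 4th word is pinpointed by its position.
damaged = list(words)
damaged[3] = "tickat"
print(find_mismatches(damaged, numbers))  # [3]
```

A mismatch only says that the two encodings disagree at that position; the
checksum is still needed to decide which of the two copies is the wrong one.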

BTW, handprinting is very error prone. One study suggests an error rate of
about 0.9% per word. See
http://panko.shidler.hawaii.edu/HumanErr/Basic.htm

Hotopf [1980], W sample (written exam), per word: 0.9%

It is important to have an error recovery mechanism that is easy to
understand and implement.

Thanks,
Weiji

* Re: [bitcoin-dev] BIP- & SLIP-0039 -- better multi-language support
  2018-11-19 19:54 ` Steven Hatzakis
@ 2018-11-20  1:51   ` Natanael
  0 siblings, 0 replies; 9+ messages in thread
From: Natanael @ 2018-11-20  1:51 UTC (permalink / raw)
  To: Steven Hatzakis, Bitcoin Dev

On Mon, 19 Nov 2018 at 21:21, Steven Hatzakis via bitcoin-dev <
bitcoin-dev@lists.linuxfoundation.org> wrote:

> *Option 1:* Without changing anything in terms of the entropy
> generation/mapping process in the BIP39 spec, the wallet/client-side
> software would ideally recognize the language and show the corresponding
> index value per wordlist, and reverse-calculate the entropy and then re-map
> it to the language selected.
>
> *Option 2*: Perhaps a revision is needed to how the BIP39 seed is
> generated in the first place, such as by hashing the entropy instead of the
> words. Any thoughts on how viable that could be where the initial entropy
> is fed into the PBKDF2 function and not the words?

This probably wouldn't work as a drop-in replacement, but having the
identifier of the chosen wordlist be part of the mnemonic might work?
Perhaps the raw seed would then be [hash of chosen dictionary]+[sequence of
word indexes].

The user experience then involves always selecting a dictionary by name. I
also suggest maintaining an official list of named dictionaries.

The purpose of including the dictionary in the seed is so that if you use
the last word as a checksum, you also can verify that the dictionary
selection is correct as well as the word sequence.
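
One way to read this idea, as a rough sketch only: the encoding below (a
SHA-256 hash of the newline-joined wordlist, two bytes per index, and a final
word picked from a hash of the whole raw seed) is an assumption made for
illustration, and the names dictionary_id, raw_seed and checksum_word are
hypothetical. The toy dictionaries stand in for real 2048-word lists.

```python
import hashlib


def dictionary_id(wordlist):
    """Hash a wordlist so that the exact list, not just its name, is pinned."""
    return hashlib.sha256("\n".join(wordlist).encode("utf-8")).digest()


def raw_seed(wordlist, indexes):
    """[hash of chosen dictionary] + [sequence of word indexes]."""
    packed = b"".join(index.to_bytes(2, "big") for index in indexes)
    return dictionary_id(wordlist) + packed


def checksum_word(wordlist, indexes):
    """Pick a final word that commits to both the dictionary and the indexes."""
    digest = hashlib.sha256(raw_seed(wordlist, indexes)).digest()
    return wordlist[int.from_bytes(digest[:2], "big") % len(wordlist)]


# Toy stand-ins for two real 2048-word dictionaries.
TOY_ENGLISH = ["en{:04d}".format(i) for i in range(2048)]
TOY_SPANISH = ["es{:04d}".format(i) for i in range(2048)]
indexes = [1128, 676, 1744, 1805, 653, 145, 770, 1396, 291, 1927, 794]

seed_en = raw_seed(TOY_ENGLISH, indexes)
seed_es = raw_seed(TOY_SPANISH, indexes)
print(seed_en[32:] == seed_es[32:])   # the packed indexes are identical
print(seed_en == seed_es)             # but the dictionary hash differs
print(checksum_word(TOY_ENGLISH, indexes), checksum_word(TOY_SPANISH, indexes))
```

Because the checksum word is computed over the dictionary hash as well as the
indexes, entering the right words under the wrong dictionary would, with high
probability, fail the check.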

This allows substitution of words to other languages by manually specifying
a different input dictionary, but you would then have to remember both the
seed language and the translated language so you can specify both
correctly.

The user experience here matches your option 1, while the implementation
matches option 2.

If you remove the specification of the seed's original language, you would
need auto-detection during entry when the raw seed is just the indexes. I do
not recommend trying that, especially if any language ends up with multiple
competing dictionaries, and even more so if there are many related languages
that might collide (like all the Latin languages, or even US vs UK
English...).

* Re: [bitcoin-dev] BIP- & SLIP-0039 -- better multi-language support
       [not found] <CABsxsG27bJN0vGRJOP3=zriPvkL+G8n3t2nd6Y8L6KwW4ePdeg@mail.gmail.com>
@ 2018-11-19 19:54 ` Steven Hatzakis
  2018-11-20  1:51   ` Natanael
  0 siblings, 1 reply; 9+ messages in thread
From: Steven Hatzakis @ 2018-11-19 19:54 UTC (permalink / raw)
  To: bitcoin-dev

Hi Weiji, and Everyone,

I think this is an important topic, so I'm sharing my two cents in case it
helps: It makes sense for users to know that they can't merely translate a
word from one language into another and expect the same underlying entropy
to be mapped, as the wordlists are not the same (i.e. words differ at the
same index values across languages).

However, while the words of one language cannot be translated directly to
their equivalents in another, the underlying entropy (bits) is in fact the
same when comparing mnemonics generated across languages from the same
initial entropy (see the English/Spanish comparison below).

Importantly, the entropy is a pre-image of the resulting mnemonic and
doesn't change as the language changes; for a given entropy string, the only
thing that changes is the resulting words, which depend on the language
chosen. Ideally, the wallet/software should deal with these nuances. I don't
think the protocol needs any revision (except for how the BIP39 seed is
derived, perhaps). Even if someone made up their own wordlist, as long as
the wallet/software has a copy of it to map those words to the underlying
index values, *those underlying index values and the entropy they map to are
what really matter*.

I fully support the idea of users backing up this pre-image (initial
entropy), as it can also be used to check the validity of the mnemonic and
to check that it mapped correctly; see Ian Coleman's BIP39 tool, which shows
index values, a feature that I proposed last year and that has since been
implemented. Below is an example of how two mnemonics generated from the
same entropy produce different BIP39 seeds.

*Example initial entropy of 128 bits + 4-bit checksum derived from the hash
of the byte array:*

10001101000 01010100100 11011010000 11100001101 01010001101 00010010001
01100000010 10101110100 00100100011 11110000111 01100011010 1100010 (+1110
checksum)

*In English*: minimum fee sure ticket faculty banana gate purse caught
valley globe shift

The same initial entropy above (all 132 bits) produces this mnemonic:

*In Spanish*: mercado faja soledad tarea evadir aries gafas peine búho
tumor gerente reja

And the underlying index values below are the same for both the English and
Spanish mnemonics above:

Word Indexes: 1128, 676, 1744, 1805, 653, 145, 770, 1396, 291, 1927, 794,
1582
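
The mapping from bits to word indexes can be reproduced with a few lines of
Python. This is a sketch rather than a reference implementation: the helper
names are hypothetical, and the checksum helper assumes the byte-array
hashing described in BIP-0039 (the first ENT/32 bits of SHA-256 over the
entropy bytes).

```python
import hashlib

# The 132 bits from the example: 128 bits of entropy plus a 4-bit checksum.
ENTROPY_BITS = (
    "10001101000" "01010100100" "11011010000" "11100001101"
    "01010001101" "00010010001" "01100000010" "10101110100"
    "00100100011" "11110000111" "01100011010" "1100010"
)
CHECKSUM_BITS = "1110"


def bits_to_indexes(bits):
    """Split a bit string into 11-bit groups and read each as an integer."""
    if len(bits) % 11 != 0:
        raise ValueError("bit length must be a multiple of 11")
    return [int(bits[i:i + 11], 2) for i in range(0, len(bits), 11)]


def bip39_checksum(entropy_bits):
    """First ENT/32 bits of SHA-256 over the entropy bytes, per BIP-0039."""
    entropy = int(entropy_bits, 2).to_bytes(len(entropy_bits) // 8, "big")
    digest = hashlib.sha256(entropy).digest()
    digest_bits = format(int.from_bytes(digest, "big"), "0256b")
    return digest_bits[: len(entropy_bits) // 32]


indexes = bits_to_indexes(ENTROPY_BITS + CHECKSUM_BITS)
print(indexes)
print(bip39_checksum(ENTROPY_BITS))   # compare with the "1110" quoted above
```

The indexes are language independent; only the final lookup of each index in
a wordlist depends on the language chosen.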

*ISSUE AT HAND*:  While the initial entropy is the same, and word indexes
the same for a given entropy, (i.e. same pre-image), the resulting BIP39
seed is not the same when comparing the above English mnemonic with its
Spanish counterpart:

   - *English BIP39 seed:*
   ce7618075099c89e986f18dc495daa3be190450ed07bef77d4334a54dbc1cd7e205797ffed2615ac0999a5d691f65bf316e2cdbfd2c9d7d90b03e77ff1e6a6f5
   - *Spanish BIP39 seed*:
   9f164de0fb09af51b5831886e424d6d2479d49b5e5a1b28f5c09467ea36089b144cd94bb9b636b3c27ccff96a8958e5b7ce43cf1dea81423fc66fa7fef0aea2c
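
The reason the seeds differ is that BIP-0039 feeds the mnemonic text itself,
not the entropy, into PBKDF2. A minimal sketch using only the Python standard
library, assuming an empty passphrase (the function name bip39_seed is
hypothetical):

```python
import hashlib
import unicodedata


def bip39_seed(mnemonic, passphrase=""):
    """Derive the 64-byte BIP-0039 seed from the mnemonic *text*."""
    # BIP-0039 NFKD-normalizes both inputs, so accented words such as
    # "búho" are hashed in their decomposed form.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


english = ("minimum fee sure ticket faculty banana "
           "gate purse caught valley globe shift")
spanish = ("mercado faja soledad tarea evadir aries "
           "gafas peine búho tumor gerente reja")

seed_en = bip39_seed(english)
seed_es = bip39_seed(spanish)
print(seed_en.hex())
print(seed_es.hex())
print(seed_en == seed_es)   # False: same entropy, different words, different seed
```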

*Option 1:* Without changing anything in terms of the entropy
generation/mapping process in the BIP39 spec, the wallet/client-side
software would ideally recognize the language and show the corresponding
index value per wordlist, and reverse-calculate the entropy and then re-map
it to the language selected.

*Option 2*: Perhaps a revision is needed to how the BIP39 seed is generated
in the first place, such as by hashing the entropy instead of the words.
Any thoughts on how viable that could be where the initial entropy is fed
into the PBKDF2 function and not the words?
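
To make Option 2 concrete, here is a hypothetical sketch in which the entropy
bytes, recovered from the word indexes, are stretched instead of the words.
The salt and iteration count are simply borrowed from BIP-0039 for
illustration; an actual revision would have to specify them, and the function
names are made up for this example.

```python
import hashlib


def indexes_to_entropy(indexes, checksum_bits=4):
    """Rebuild the entropy bytes from 11-bit word indexes, dropping the checksum."""
    bits = "".join(format(index, "011b") for index in indexes)
    entropy_bits = bits[: len(bits) - checksum_bits]
    return int(entropy_bits, 2).to_bytes(len(entropy_bits) // 8, "big")


def seed_from_entropy(entropy, passphrase=""):
    """Hypothetical Option 2: stretch the entropy bytes, not the words."""
    # Salt and iteration count are borrowed from BIP-0039 for illustration only.
    salt = ("mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", entropy, salt, 2048, dklen=64)


# The English and Spanish mnemonics in the example share these indexes,
# so both would lead to the same entropy and therefore the same seed.
indexes = [1128, 676, 1744, 1805, 653, 145, 770, 1396, 291, 1927, 794, 1582]
entropy = indexes_to_entropy(indexes)
print(entropy.hex())
print(seed_from_entropy(entropy).hex())
```

Under this scheme a wallet still needs the wordlist to turn words back into
indexes, but the resulting seed no longer depends on which language was used.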

*Closing thoughts and a tiny checksum nitpick:*

   - The multiple BIP39 seeds per language bear some similarity to BIP44
   multi-account, so perhaps this can be an advantage, depending on how it
   is applied in UIs/UX (compared to having one BIP39 seed regardless of
   language, for a given initial entropy).
   - There is perhaps an opportunity to add greater detail to the BIP39
   spec in terms of standards/best practices for computing checksum values.
   For a given entropy, some software may be hashing bits, versus hashing
   bytes, or hashing the entropy as a hex string, etc., which results in
   different checksum values, so a mnemonic that is "valid" in one wallet
   might not be "valid" in another wallet that formats the data differently
   before hashing to compute the checksum.

Best regards,

Steven Hatzakis

_______________
[bitcoin-dev] BIP- & SLIP-0039 -- better multi-language support
Weiji Guo weiji.g at gmail.com
Tue Nov 6 16:16:41 UTC 2018

------------------------------

Hello everyone,

I just realized that BIP-0039 is language dependent. I had assumed
otherwise until I looked closer. The way the seed is derived from BIP-0039
entropy, as shown below, depends on which language is used to generate the
mnemonic sentence:

   Entropy <=> Mnemonic Sentence => PBKDF2 => BIP-0032 Seed

Therefore, when a user chooses a non-English mnemonic code, he or she is
stuck with that language. Meanwhile, only a few native languages are
supported.

By providing only an English wordlist, SLIP-0039 does not solve this issue
in a user-friendly way. That's understandable, as it aims to provide SSS
capability. However, users who do not speak English or recognize English
words will suffer.

What I am trying to bring to the attention of the community is that,
whether we make a new version of BIP-0039, create a new BIP (with SSS
support), or enhance SLIP-0039, we really need to address this language
issue.

Here is what I propose:

1. The mnemonic code should be only a representation of the underlying
entropy or (pre) master secret, seed, whatever. In this way, the same
seed/secret could be displayed in English, in Chinese, or in other
languages. There could then be 3rd party conversion tools to support
translation in case a wallet or device does not support all specified
languages. The flow would then look like:

   Mnemonic Sentence <=> Entropy => PBKDF2 => BIP-0032 Seed

2. Given that only 8 languages are supported in BIP-0039, we should allow
the seed/secret to be represented as decimal numbers, each ranging from 0
to 2047. Those who cannot find support for their native language, yet have
difficulty copying words in other languages, could then choose to just use
numbers.

So far I don't have a preference as to how this should be implemented. I'd
like to hear from the community first.

Thanks,

Weiji Guo

* Re: [bitcoin-dev] BIP- & SLIP-0039 -- better multi-language support
  2018-11-16 14:05     ` Jonathan Underwood
@ 2018-11-16 14:15       ` Neill Miller
  0 siblings, 0 replies; 9+ messages in thread
From: Neill Miller @ 2018-11-16 14:15 UTC (permalink / raw)
  To: Jonathan Underwood; +Cc: bitcoin-dev, somber.night

Ah, ok.  I've worked with the non-BIP39 Electrum mnemonics, which have
this behaviour, but haven't tried the BIP39 support within it.

Thanks,
-Neill.


* Re: [bitcoin-dev] BIP- & SLIP-0039 -- better multi-language support
  2018-11-16 14:03   ` Neill Miller
@ 2018-11-16 14:05     ` Jonathan Underwood
  2018-11-16 14:15       ` Neill Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Jonathan Underwood @ 2018-11-16 14:05 UTC (permalink / raw)
  To: Neill Miller; +Cc: bitcoin-dev, somber.night

Nope.

This is how Electrum treats BIP39 restoring as well, try it out.

-Jon

On Fri, 16 Nov 2018 at 23:04, Neill Miller (neillm@thecodefactory.org) wrote:

> On Fri, Nov 09, 2018 at 02:17:30PM +0900, Jonathan Underwood via
> bitcoin-dev wrote:
> > If more apps would implement to the word of the BIP39 spec, multiple
> > languages make sense, but since reality is no one follows the spec (/the
> > spec is way too open to interpretation) then expecting every app to load
> > every language is unreasonable.
> >
> > Electrum actually handles BIP39 recovery the way the BIP specifies. I can
> > restore random strings if I want, and it warns me, and I can ignore it
> > if I wish.
>
> Electrum mnemonics are not based on BIP39, which is why it can do
> this.
>
> -Neill.

* Re: [bitcoin-dev] BIP- & SLIP-0039 -- better multi-language support
  2018-11-09  5:17 ` Jonathan Underwood
@ 2018-11-16 14:03   ` Neill Miller
  2018-11-16 14:05     ` Jonathan Underwood
  0 siblings, 1 reply; 9+ messages in thread
From: Neill Miller @ 2018-11-16 14:03 UTC (permalink / raw)
  To: Jonathan Underwood, Bitcoin Protocol Discussion; +Cc: somber.night

On Fri, Nov 09, 2018 at 02:17:30PM +0900, Jonathan Underwood via bitcoin-dev wrote:
> If more apps would implement to the word of the BIP39 spec, multiple
> languages make sense, but since reality is no one follows the spec (/the
> spec is way too open to interpretation) then expecting every app to load
> every language is unreasonable.
> 
> Electrum actually handles BIP39 recovery the way the BIP specifies. I can
> restore random strings if I want, and it warns me, and I can ignore it if I
> wish.

Electrum mnemonics are not based on BIP39, which is why it can do
this.

-Neill.



* Re: [bitcoin-dev] BIP- & SLIP-0039 -- better multi-language support
  2018-11-08 10:37 SomberNight
@ 2018-11-09  5:17 ` Jonathan Underwood
  2018-11-16 14:03   ` Neill Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Jonathan Underwood @ 2018-11-09  5:17 UTC (permalink / raw)
  To: somber.night, bitcoin-dev

> as it seems bad design to have to fix and maintain a wordlist for every
> language as the checksum depends on it.

From BIP39:

> The conversion of the mnemonic sentence to a binary seed is completely
> independent from generating the sentence. This results in rather simple
> code; there are no constraints on sentence structure and clients are free
> to implement their own wordlists or even whole sentence generators,
> allowing for flexibility in wordlists for typo detection or other purposes.
>
> Although using a mnemonic not generated by the algorithm described in
> "Generating the mnemonic" section is possible, this is not advised and
> software must compute a checksum for the mnemonic sentence using a wordlist
> and issue a warning if it is invalid.

So BIP39 states "no constraints on sentence structure and clients are free
to implement their own wordlists or even whole sentence generators" and yet
at the same time one paragraph later "this is not advised and software must
compute a checksum for the mnemonic sentence using a wordlist and issue a
warning if it is invalid"...

My interpretation of this:

1. The ChecksumCheck function attempts to (a) find the wordlist and (b)
calculate the checksum.
2. If it fails to find the wordlist, return false.
3. If the checksum doesn't match, return false.
4. If ChecksumCheck returns false, "issue a warning" but do not block seed
generation: "We couldn't check if your phrase is correct... you're on your
own."
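
The lenient reading above can be sketched roughly as follows. This is an
illustration, not any wallet's actual code: the names checksum_check, restore
and make_mnemonic are hypothetical, the wordlist is a synthetic stand-in for
a real 2048-word list, and wordlist entries are assumed to be NFKD-normalized
already.

```python
import hashlib
import unicodedata
import warnings


def checksum_check(mnemonic, wordlists):
    """Return True only if a known wordlist validates the BIP-0039 checksum."""
    words = unicodedata.normalize("NFKD", mnemonic).split()
    for wordlist in wordlists:
        lookup = {word: i for i, word in enumerate(wordlist)}
        if not words or len(words) % 3 != 0 or any(w not in lookup for w in words):
            continue                      # this wordlist does not fit the phrase
        bits = "".join(format(lookup[w], "011b") for w in words)
        split = len(bits) * 32 // 33      # entropy bits; the rest is checksum
        entropy = int(bits[:split], 2).to_bytes(split // 8, "big")
        digest = hashlib.sha256(entropy).digest()
        expected = format(int.from_bytes(digest, "big"), "0256b")[: len(bits) - split]
        if bits[split:] == expected:
            return True
    return False


def restore(mnemonic, wordlists, passphrase=""):
    """Lenient reading: warn on a failed check, but still derive the seed."""
    if not checksum_check(mnemonic, wordlists):
        warnings.warn("Could not verify this phrase; you are on your own.")
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


def make_mnemonic(entropy, wordlist):
    """Build a checksummed phrase from entropy (used here only for the demo)."""
    bits = format(int.from_bytes(entropy, "big"), "0{}b".format(len(entropy) * 8))
    digest = hashlib.sha256(entropy).digest()
    bits += format(int.from_bytes(digest, "big"), "0256b")[: len(entropy) * 8 // 32]
    return " ".join(wordlist[int(bits[i:i + 11], 2)] for i in range(0, len(bits), 11))


TOY_WORDLIST = ["w{:04d}".format(i) for i in range(2048)]   # stand-in wordlist
valid = make_mnemonic(bytes(16), TOY_WORDLIST)
print(checksum_check(valid, [TOY_WORDLIST]))

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    seed = restore("any random string", [TOY_WORDLIST])
print(len(seed), len(caught))   # a seed is still produced, with one warning
```

The strict behaviour described next would replace the warning in restore with
an exception, which is what makes an unknown wordlist a hard failure.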

The interpretation of 99.99% of implementing apps (remember, error handling
for userspace is not done by the BIP39 library, but by the app that uses it):

1. Run ChecksumCheck.
2. If false, hard fail; do not allow seed generation.

If more apps implemented the BIP39 spec to the letter, multiple languages
would make sense, but since in reality no one follows the spec (or the spec
is way too open to interpretation), expecting every app to load every
language is unreasonable.

Electrum actually handles BIP39 recovery the way the BIP specifies. I can
restore random strings if I want, and it warns me, and I can ignore it if I
wish.


Anywho. The BIP39 multi-language feature is crucial for non-English
speakers especially from Asia. Maybe northern Europeans have no problem
with English word spelling, but watching a normal Japanese person write
down their English mnemonic is painful.

One letter at a time, worried they wrote it wrong... they still make
mistakes... and lose money because of it.

Whereas users of Copay etc., which support the Japanese wordlist, write down
their seed easily, and I have never heard of a Japanese newbie complaining
"but I'm writing it just how I have it written down" about their Japanese
seed... only English.

Not trying to give anyone a hard time, just telling the facts: the lack of
localized words for recovery phrases causes more money loss than supporting
them does. (When push comes to shove, at the very least Electrum will always
support their recovery because it lets you hash anything.)

This is all anecdotal of course. Just sharing my experience evangelizing in
Japan.

Thanks,
Jon


-- 
-----------------
Jonathan Underwood
bitbank, Inc., Chief Bitcoin Officer
-----------------

If you are sending an encrypted message, please use the public key below.

Fingerprint: 0xCE5EA9476DE7D3E45EBC3FDAD998682F3590FEA3

* Re: [bitcoin-dev] BIP- & SLIP-0039 -- better multi-language support
@ 2018-11-08 10:37 SomberNight
  2018-11-09  5:17 ` Jonathan Underwood
  0 siblings, 1 reply; 9+ messages in thread
From: SomberNight @ 2018-11-08 10:37 UTC (permalink / raw)
  To: bitcoin-dev

Do you specifically want to support changing the language of seed
words, while keeping the bip32 root seed they generate unchanged?
What is the use case for this?

You mention that BIP39 already supports a few different languages.
While this is true, many (I would guess most!) wallets only
support the English wordlist.
There are doubts even from the authors of the BIP whether it was
a good idea in the first place to support multiple languages [0].
I don't find this surprising as it seems bad design to have to fix and
maintain a wordlist for every language as the checksum depends on it.
The supported wordlists are effectively a part of the specification,
and every new list would just make that specification larger.

If changing the language of seeds is not a requirement, then look
into Electrum seeds. They are language/wordlist agnostic.

Mnemonic Sentence => PBKDF2 => BIP-0032 Seed

The bip32 seed is derived by hashing the normalized mnemonic, and the
checksum is derived the same way but by using a different cheaper
hash (single round of HMAC-SHA512; generation grinds until it matches
a pattern) [1]. For example, "9dk" is a valid segwit electrum seed.
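
A rough sketch of that version check and of the "grinding" step. The HMAC key
"Seed version" and the "100" hex prefix for segwit seeds are recalled from
the Electrum documentation linked as [1] and should be treated as
assumptions here; the normalization is simplified (Electrum's real routine
does more, such as stripping accents), and the function names are
hypothetical.

```python
import hashlib
import hmac
import unicodedata


def seed_version_hex(phrase):
    """Hex digest of a single HMAC-SHA512 round over the normalized phrase."""
    # Simplified normalization: NFKD, lowercase, collapse whitespace.
    normalized = " ".join(unicodedata.normalize("NFKD", phrase).lower().split())
    return hmac.new(b"Seed version", normalized.encode("utf-8"),
                    hashlib.sha512).hexdigest()


def looks_like_segwit_seed(phrase, prefix="100"):
    """True if the version hash starts with the (assumed) segwit prefix."""
    return seed_version_hex(phrase).startswith(prefix)


def grind(make_phrase, prefix="100"):
    """Try candidate phrases until one hashes to the wanted prefix."""
    nonce = 0
    while not seed_version_hex(make_phrase(nonce)).startswith(prefix):
        nonce += 1
    return make_phrase(nonce)


# A counter stands in for fresh randomness so that the demo is repeatable.
phrase = grind(lambda nonce: "toy phrase {}".format(nonce))
print(phrase, looks_like_segwit_seed(phrase))
print(looks_like_segwit_seed("9dk"))
```

Since the check depends only on the hash of the text, no wordlist is needed
to validate a phrase, which is why such seeds are wordlist agnostic.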

[0]: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2018-January/015507.html
[1]: http://docs.electrum.org/en/latest/seedphrase.html

> Date: Wed, 7 Nov 2018 00:16:41 +0800
> From: Weiji Guo weiji.g@gmail.com
> Subject: [bitcoin-dev] BIP- & SLIP-0039 -- better multi-language
> support
>
> Hello everyone,
>
> I just realized that BIP-0039 is language dependent. I had assumed
> otherwise until I looked closer. The way the seed is derived from BIP-0039
> entropy, as shown below, depends on which language is used to generate the
> mnemonic sentence:
>
> Entropy <=> Mnemonic Sentence => PBKDF2 => BIP-0032 Seed
>
> Therefore, when a user chooses a non-English mnemonic code, he or she is stuck
> with that language. Meanwhile, only a few native languages are supported.
>
> SLIP-0039 provides only an English wordlist, so it does not solve this issue
> in a user-friendly way. That's understandable, as it aims to provide SSS
> capability. However, users who do not speak English or recognize
> English words will suffer.
>
> What I am trying to bring to the attention of the community is that, whether
> we make a new version of BIP-0039, write a new BIP (with SSS support), or
> enhance SLIP-0039, we really need to address this language issue.
>
> Here is what I propose:
>
> 1.  The mnemonic code should be only a representation of the underlying entropy
>     (or (pre) master secret, seed, whatever). In this way, the same seed/secret
>     could be displayed in English, Chinese, or other languages. Third-party
>     conversion tools could then translate between languages in case a
>     wallet or device does not support all specified languages. The
>     derivation would then look like:
>
>     Mnemonic Sentence <=> Entropy => PBKDF2 => BIP-0032 Seed
>
>
> 2. Given that only 8 languages are supported in BIP-0039, we should allow
> the seed/secret to be represented as decimal numbers, each ranging from 0
> to 2047. Those who cannot find support for their native language, and who
> have difficulty copying words in other languages, could then just use numbers.
>
> So far I don't have a preference for how this should be implemented. I'd like
> to hear from the community first.

* [bitcoin-dev]  BIP- & SLIP-0039 -- better multi-language support
       [not found] <mailman.1087.1541036387.10264.bitcoin-dev@lists.linuxfoundation.org>
@ 2018-11-06 16:16 ` Weiji Guo
From: Weiji Guo @ 2018-11-06 16:16 UTC
  To: bitcoin-dev

Hello everyone,

I just realized that BIP-0039 is language dependent. I had assumed
otherwise until I looked closer. The way the seed is derived from BIP-0039
entropy, as shown below, depends on which language is used to generate the
mnemonic sentence:

   Entropy <=> Mnemonic Sentence => PBKDF2 => BIP-0032 Seed

Therefore, when a user chooses a non-English mnemonic code, he or she is stuck
with that language. Meanwhile, only a few native languages are supported.
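A minimal sketch of why the seed is tied to the language. BIP-0039 feeds the UTF-8 bytes of the NFKD-normalized sentence into PBKDF2-HMAC-SHA512 (salt "mnemonic" + passphrase, 2048 iterations), so the same word indices rendered in two languages produce different seeds. The four-word lists below are toy stand-ins for real 2048-word lists, and the helper names are only for illustration.

```python
import hashlib
import unicodedata

# Toy excerpts standing in for two full 2048-word lists (assumption).
TOY_ENGLISH = ["abandon", "ability", "able", "about"]
TOY_SPANISH = ["ábaco", "abdomen", "abeja", "abierto"]


def bip39_seed(mnemonic, passphrase=""):
    # The password is the sentence itself, not the entropy behind it.
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048)


def render(indices, wordlist):
    # Turn word indices into a space-separated sentence.
    return " ".join(wordlist[i] for i in indices)


indices = [0, 3, 1, 2]  # the same underlying bits in both languages
seed_en = bip39_seed(render(indices, TOY_ENGLISH))
seed_es = bip39_seed(render(indices, TOY_SPANISH))
print(seed_en != seed_es)  # the two renderings hash to different seeds
```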

SLIP-0039 provides only an English wordlist, so it does not solve this issue
in a user-friendly way. That's understandable, as it aims to provide SSS
capability. However, users who do not speak English or recognize
English words will suffer.

What I am trying to bring to the attention of the community is that, whether
we make a new version of BIP-0039, write a new BIP (with SSS support), or
enhance SLIP-0039, we really need to address this language issue.

Here is what I propose:

1. The mnemonic code should be only a representation of the underlying entropy
(or (pre) master secret, seed, whatever). In this way, the same seed/secret
could be displayed in English, Chinese, or other languages. Third-party
conversion tools could then translate between languages in case a
wallet or device does not support all specified languages. The
derivation would then look like:

   Mnemonic Sentence <=> Entropy => PBKDF2 => BIP-0032 Seed
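A rough sketch of the proposed ordering, in which the sentence is first decoded back to the underlying bits and only those bits are fed to PBKDF2. The toy wordlists, the helper names, and the reuse of BIP-0039's salt and iteration count are assumptions for illustration. For brevity, the packed 11-bit indices (checksum bits included) stand in for the entropy; a real design would verify and strip the checksum first.

```python
import hashlib

# Toy excerpts standing in for two full 2048-word lists (assumption).
TOY_ENGLISH = ["abandon", "ability", "able", "about"]
TOY_SPANISH = ["ábaco", "abdomen", "abeja", "abierto"]


def words_to_indices(sentence, wordlist):
    # Language-specific step: look each word up in its own list.
    return [wordlist.index(word) for word in sentence.split()]


def indices_to_bytes(indices):
    # Language-neutral step: pack the 11-bit indices big-endian.
    value = 0
    for index in indices:
        value = (value << 11) | index
    nbits = 11 * len(indices)
    return value.to_bytes((nbits + 7) // 8, "big")


def seed_from_bits(bits, passphrase=""):
    # PBKDF2 now sees only the decoded bits, never the words.
    salt = ("mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", bits, salt, 2048)


english = "abandon about ability able"
spanish = "ábaco abierto abdomen abeja"  # same indices, other language

seed_en = seed_from_bits(indices_to_bytes(words_to_indices(english, TOY_ENGLISH)))
seed_es = seed_from_bits(indices_to_bytes(words_to_indices(spanish, TOY_SPANISH)))
print(seed_en == seed_es)  # both sentences decode to the same bits
```

With this ordering, a conversion tool only needs the two wordlists: it decodes with one and re-encodes with the other, and the wallet's seed is unaffected.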

2. Given that only 8 languages are supported in BIP-0039, we should allow
the seed/secret to be represented as decimal numbers, each ranging from 0
to 2047. Those who cannot find support for their native language, and who
have difficulty copying words in other languages, could then just use numbers.
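One way the numeric form could work, sketched here by reusing BIP-0039's existing layout: append the first ENT/32 bits of SHA-256(entropy) as a checksum, then split the result into 11-bit groups, each a number from 0 to 2047. The function names are hypothetical, and nothing in this proposal fixes the encoding yet.

```python
import hashlib


def entropy_to_indices(entropy):
    # BIP-0039 layout: entropy bits followed by ENT/32 checksum bits.
    ent_bits = len(entropy) * 8
    if ent_bits not in (128, 160, 192, 224, 256):
        raise ValueError("entropy must be 128 to 256 bits, in steps of 32")
    cs_bits = ent_bits // 32
    checksum = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    value = (int.from_bytes(entropy, "big") << cs_bits) | checksum
    count = (ent_bits + cs_bits) // 11
    # Read the 11-bit groups from most significant to least significant.
    return [(value >> (11 * (count - 1 - i))) & 0x7FF for i in range(count)]


def indices_to_entropy(indices):
    # Inverse direction: rebuild the entropy and verify the checksum.
    if len(indices) not in (12, 15, 18, 21, 24):
        raise ValueError("expected 12, 15, 18, 21 or 24 numbers")
    value = 0
    for index in indices:
        if not 0 <= index <= 2047:
            raise ValueError("each number must be in the range 0..2047")
        value = (value << 11) | index
    total_bits = 11 * len(indices)
    cs_bits = total_bits // 33
    entropy = (value >> cs_bits).to_bytes((total_bits - cs_bits) // 8, "big")
    if entropy_to_indices(entropy) != list(indices):
        raise ValueError("checksum mismatch")
    return entropy


numbers = entropy_to_indices(bytes(16))  # 128 bits of zeros
print(numbers)  # twelve numbers, each between 0 and 2047
```

Because the numbers are just the word indices, a user could write down the decimal form and any wallet that knows a wordlist could still display the same secret as words.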

So far I don't have a preference for how this should be implemented. I'd like
to hear from the community first.

Thanks,

Weiji Guo

Thread overview: 9+ messages
     [not found] <mailman.2299.1542895684.19477.bitcoin-dev@lists.linuxfoundation.org>
2018-11-22 17:25 ` [bitcoin-dev] BIP- & SLIP-0039 -- better multi-language support Weiji Guo
     [not found] <CABsxsG27bJN0vGRJOP3=zriPvkL+G8n3t2nd6Y8L6KwW4ePdeg@mail.gmail.com>
2018-11-19 19:54 ` Steven Hatzakis
2018-11-20  1:51   ` Natanael
2018-11-08 10:37 SomberNight
2018-11-09  5:17 ` Jonathan Underwood
2018-11-16 14:03   ` Neill Miller
2018-11-16 14:05     ` Jonathan Underwood
2018-11-16 14:15       ` Neill Miller
     [not found] <mailman.1087.1541036387.10264.bitcoin-dev@lists.linuxfoundation.org>
2018-11-06 16:16 ` Weiji Guo