On Thu, Oct 24, 2013 at 7:29 PM, <thomasV1@gmx.de> wrote:

My initial problem was that BIP0039 is not backward compatible with Electrum. When trying to solve that, I realized that the seed encoding used in Electrum does not help, because it does not contain a version number information. However, BIP0039 suffers the same shortcoming: it does nothing to help a future replacement, it wants to be final. My first recommendation is to allocate a few bits of the mnemonic, in order to encode a "version number" along with the checksum bits.


Two years ago I proposed exactly this and you refused to add extra information to mnemonic, because "it isn't necessary" and "it makes it longer to mnemonization". What changed since then?
 
The second problem is the wallet structure. There are multiple ways to use a BIP32 tree, and each client will certainly handle this differently. For Electrum, it is important to be able to recover an entire wallet from its mnemonic, using no extra information. Thus, the client needs to know which branches of the BIP32 tree are populated by default. This means that the "version number" I mentioned will not only be about the seed encoding, but it should also give some information about the wallet structure, at least in the case of Electrum.


Hm, what exactly do you need to store about wallet structure? I lived in opinion that everything is able to recover using CKD function to generate new addresses and blockchain lookups for their balances.
 
The third problem is the dictionary. I do not like the dictionary proposed in BIP0039, because it contains too many short words, which are bad for memorization (I explained here how I designed the dictionary used by Electrum: https://bitcointalk.org/index.php?topic=153990.msg2167909#msg2167909). I had some discussions with slush about this, but I do not think it will ever be possible to find a consensus on that topic.


Yes, that's true. It isn't possible to make everybody 100% happy. At least I wanted to be constructive and asked you to replace the most problematic words. No pull request from you so far.
 
BIP0039 also suggests to use localized dictionaries, with non-colliding word lists, but it is not clear how that will be achieved; it seems to be difficult, because languages often have words in common. It looks like a first-come-first-served aproach will be used.


Yes, it was original idea. So far I don't think this is a problem. Of course some words may have some meaning across languages, but it should be easy to avoid them. There are tens of thousands words in every language and we need to pick "only" 2048 words to wordlist.
 
For these reasons, I believe that we need a dictionary-independent solution. This will allow developers to use the dictionary they like, and localization will be easy. 
I would like to suggest the following solution:


If I understand this well, it is basically one-way algorithm "mnemonic -> seed", right? Seed cannot be printed out as mnemonic, because there's hashing involved, but the bi-directionality has been the original requirement for such algorithm (at least in Electrum and bip39).

Then, how is this different to picking 12 random words from dictionary and hashing them together? I don't see any benefit in that "mining" part of the proposal (except that it is lowering the entropy for given length of mnemonic).


This solution makes it possible for developers to define new dictionaries, localized or adapted to a particular need.

Are your worries about overlapping words across languages a real issue?
 
slush