From mboxrd@z Thu Jan 1 00:00:00 1970
From: jan
To: bitcoin-development@lists.sourceforge.net
Date: Sat, 19 Oct 2013 10:52:58 +1100
Message-ID: <87iowuuof9.fsf@gmail.com>
Subject: [Bitcoin-development] BIP39 word list

The words 'public', 'private' and 'secret' could be confusing when
encoding public and private keys, e.g. a private key whose encoding
begins with the word 'public'.

I think avoiding words that could look similar when written down would
be a good idea as well. I searched for words that differ only by the
letters c & e, g & y, u & v and found the following:

    car ear
    cat eat
    gear year
    value valve

Other combinations could be problematic depending on the handwriting
style: ft, ao, ij, vy, possibly even lt and il?

I've included the search utility I used below.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define NWORDS  2048
#define MAXWORD 16

/* pairs of letters that can look alike when written down */
static const char *similar_char_pairs[] = {
    "ce",
    "gy",
    "uv",
    NULL
};

/* true if c1 and c2 form one of the pairs above, in either order */
static bool is_similar_char(char c1, char c2)
{
    const char **pairs;

    for (pairs = similar_char_pairs; *pairs; pairs++) {
        const char *p = *pairs;
        if ((c1 == p[0] && c2 == p[1]) ||
            (c1 == p[1] && c2 == p[0]))
            return true;
    }
    return false;
}

static bool print_words_if_similar(const char *word1, const char *word2)
{
    size_t i, len = strlen(word1), similarcount = 0;

    /* reject words of different lengths */
    if (len != strlen(word2))
        return false;

    for (i = 0; i < len; i++) {
        /* skip identical letters */
        if (word1[i] == word2[i])
            continue;

        /* reject words that don't match */
        if (!is_similar_char(word1[i], word2[i]))
            return false;

        similarcount++;
    }

    /* reject words with more than 1 different letter */
    //if (similarcount > 1)
    //    return false;

    printf("%s %s\n", word1, word2);
    return true;
}

int main(void)
{
    /* english.txt is assumed to exist in the working directory
       download from:
       https://github.com/trezor/python-mnemonic/blob/master/mnemonic/wordlist/english.txt */
    FILE *f = fopen("english.txt", "r");
    if (!f) {
        fprintf(stderr, "failed to open english.txt\n");
        return 1;
    }

    /* read in word list, assumes one word per line */
    static char wordlist[NWORDS][MAXWORD];
    char line[MAXWORD];
    int word = 0;
    while (fgets(line, sizeof line, f)) {
        /* strip trailing whitespace, assumes no leading whitespace */
        char *ch = strpbrk(line, " \r\n\t");
        if (ch)
            *ch = '\0';

        /* ignore blank lines, e.g. at the end of the file */
        if (line[0] == '\0')
            continue;

        /* too many words: bail out rather than overflow wordlist */
        if (word == NWORDS) {
            fprintf(stderr, "word list incorrect length\n");
            fclose(f);
            return 1;
        }
        strcpy(wordlist[word++], line);
    }
    fclose(f);

    if (word != NWORDS) {
        fprintf(stderr, "word list incorrect length\n");
        return 1;
    }

    /* check each word for similarity against every other word */
    int i, j, count = 0;
    for (i = 0; i < NWORDS; i++) {
        for (j = i + 1; j < NWORDS; j++) {
            if (print_words_if_similar(wordlist[i], wordlist[j]))
                count++;
        }
    }

    printf("%d matches\n", count);
    return 0;
}
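
The message also lists further letter pairs (ft, ao, ij, vy, lt, il) that were not part of the search. The following is a minimal sketch of how the comparison might be extended to cover them. The names confusable_pairs, is_confusable and confusable_distance are hypothetical and not part of the utility above, and the pair set is only the one suggested in the text, not an agreed list. The sketch contains the helper functions only and leaves out reading the word list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Letter pairs that might be confused in handwriting. The first three are
   the pairs searched for in the message; the remaining six are the "other
   combinations" it mentions. This set is an assumption for illustration. */
const char *const confusable_pairs[] = {
    "ce", "gy", "uv", "ft", "ao", "ij", "vy", "lt", "il", NULL
};

/* true if a and b form one of the pairs above, in either order */
bool is_confusable(char a, char b)
{
    for (size_t k = 0; confusable_pairs[k] != NULL; k++) {
        const char *p = confusable_pairs[k];
        if ((a == p[0] && b == p[1]) || (a == p[1] && b == p[0]))
            return true;
    }
    return false;
}

/* Returns the number of positions at which w1 and w2 differ by a
   confusable pair, or -1 if the words have different lengths or differ
   at some position by letters that are not a listed pair. */
int confusable_distance(const char *w1, const char *w2)
{
    size_t len = strlen(w1);
    int diffs = 0;

    if (len != strlen(w2))
        return -1;

    for (size_t i = 0; i < len; i++) {
        if (w1[i] == w2[i])
            continue;
        if (!is_confusable(w1[i], w2[i]))
            return -1;
        diffs++;
    }
    return diffs;
}
```

A caller could report a pair of words only when confusable_distance returns exactly 1, which limits the output to words that differ by a single look-alike letter. Returning the count rather than a flag leaves that threshold up to the caller.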