From: slush <slush@centrum.cz>
To: Gregory Maxwell <gmaxwell@gmail.com>
Cc: Bitcoin Development <bitcoin-development@lists.sourceforge.net>
Subject: Re: [Bitcoin-development] BIP39 word list
Date: Thu, 24 Oct 2013 15:26:32 +0200
Message-ID: <CAJna-Hjap-GPc-rTQsYxVqjvigZ3y8YDQmL++b2Gw0SAC7QLQA@mail.gmail.com>
In-Reply-To: <CAAS2fgQ1uYvNxZu6DOKZ2k9qj2kYhGpHxxzsZdsqb-Oi2uSKmw@mail.gmail.com>

I've just pushed an updated wordlist, filtered for similar characters
taken from this matrix.

BIP39 now considers the following character pairs similar:

        similar = (
            ('a', 'c'), ('a', 'e'), ('a', 'o'),
            ('b', 'd'), ('b', 'h'), ('b', 'p'), ('b', 'q'), ('b', 'r'),
            ('c', 'e'), ('c', 'g'), ('c', 'n'), ('c', 'o'), ('c', 'q'), ('c', 'u'),
            ('d', 'g'), ('d', 'h'), ('d', 'o'), ('d', 'p'), ('d', 'q'),
            ('e', 'f'), ('e', 'o'),
            ('f', 'i'), ('f', 'j'), ('f', 'l'), ('f', 'p'), ('f', 't'),
            ('g', 'j'), ('g', 'o'), ('g', 'p'), ('g', 'q'), ('g', 'y'),
            ('h', 'k'), ('h', 'l'), ('h', 'm'), ('h', 'n'), ('h', 'r'),
            ('i', 'j'), ('i', 'l'), ('i', 't'), ('i', 'y'),
            ('j', 'l'), ('j', 'p'), ('j', 'q'), ('j', 'y'),
            ('k', 'x'),
            ('l', 't'),
            ('m', 'n'), ('m', 'w'),
            ('n', 'u'), ('n', 'z'),
            ('o', 'p'), ('o', 'q'), ('o', 'u'), ('o', 'v'),
            ('p', 'q'), ('p', 'r'),
            ('q', 'y'),
            ('s', 'z'),
            ('u', 'v'), ('u', 'w'), ('u', 'y'),
            ('v', 'w'), ('v', 'y')
        )
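
As a rough illustration of how a pair table like this might be used, here
is a minimal sketch that flags words of equal length whose differing
positions are all "similar" pairs. The function names, the three-pair
subset and the sample words are assumptions made for the example; this is
not the filtering code from the python-mnemonic repository.

```python
from itertools import combinations

# Illustrative subset of the pairs listed above (assumption: a real check
# would pass in the full table instead).
SIMILAR_SUBSET = (('c', 'e'), ('g', 'y'), ('u', 'v'))


def build_lookup(pairs):
    """Return a set of frozensets so lookups ignore the order within a pair."""
    return {frozenset(p) for p in pairs}


def confusable(word1, word2, lookup):
    """True if the words have equal length, are not identical, and every
    differing position is a similar character pair."""
    if len(word1) != len(word2) or word1 == word2:
        return False
    return all(a == b or frozenset((a, b)) in lookup
               for a, b in zip(word1, word2))


def find_confusable(words, pairs):
    """Compare every word against every later word and collect the matches."""
    lookup = build_lookup(pairs)
    return [(w1, w2) for w1, w2 in combinations(words, 2)
            if confusable(w1, w2, lookup)]


if __name__ == "__main__":
    sample = ["car", "ear", "cat", "eat", "gear", "year",
              "value", "valve", "dog"]
    for w1, w2 in find_confusable(sample, SIMILAR_SUBSET):
        print(w1, w2)
```

Storing each pair as a frozenset means ('c', 'e') and ('e', 'c') match the
same entry, so the table only needs to list each pair once.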

Feel free to review and comment on the current wordlist, but I think we're
slowly moving toward a final list.

slush


On Sat, Oct 19, 2013 at 1:58 AM, Gregory Maxwell <gmaxwell@gmail.com> wrote:

> some fairly old wordlist solver code of mine:
>
> https://people.xiph.org/~greg/wordlist.visual.py
>
> it has a 52x52 letter visual similarity matrix in it (along with a
> citation)
>
> On Fri, Oct 18, 2013 at 4:52 PM, jan <jan.marecek@gmail.com> wrote:
> >
> > The words 'public', 'private' and 'secret' could be confusing when
> > encoding public and private keys. eg. a private key that begins with
> > the word 'public'.
> >
> > I think avoiding words that could look similar when written down would
> > be a good idea as well. I searched for words that differ only by the
> > letters c & e, g & y, u & v and found the following:
> >
> > car ear
> > cat eat
> > gear year
> > value valve
> >
> > Other combinations could potentially be problematic depending on the
> > handwriting style: ft, ao, ij, vy, possibly even lt and il?
> >
> > I've included the search utility I used below.
> >
> >
> > #include <stdbool.h>
> > #include <string.h>
> > #include <stdio.h>
> >
> > char *similar_char_pairs[] = { "ce", "gy", "uv", NULL };
> >
> > bool is_similar_char(char c1, char c2)
> > {
> >   char **pairs = similar_char_pairs;
> >   do {
> >     char *p = *pairs;
> >     if ((c1 == p[0] && c2 == p[1]) ||
> >         (c1 == p[1] && c2 == p[0]))
> >       return true;
> >   } while (*++pairs);
> >
> >   return false;
> > }
> >
> > bool print_words_if_similar(char *word1, char *word2)
> > {
> >   /* reject words of different lengths */
> >   if (strlen(word1) != strlen(word2))
> >     return false;
> >
> >   size_t i, similarcount = 0;
> >
> >   for (i = 0; i < strlen(word1); i++) {
> >     /* skip identical letters */
> >     if (word1[i] == word2[i])
> >       continue;
> >
> >     /* reject words that don't match */
> >     if (is_similar_char(word1[i], word2[i]) == false)
> >       return false;
> >
> >     similarcount++;
> >   }
> >
> >   /* reject words with more than 1 different letter */
> >   //if (similarcount > 1)
> >   //  return false;
> >
> >   printf("%s %s\n", word1, word2);
> >
> >   return true;
> > }
> >
> > int main(void)
> > {
> >   /* english.txt is assumed to exist in the working directory
> >      download from:
> >      https://github.com/trezor/python-mnemonic/blob/master/mnemonic/wordlist/english.txt */
> >   FILE* f = fopen("english.txt", "r");
> >   if (!f) {
> >     fprintf(stderr, "failed to open english.txt\n");
> >     return 1;
> >   }
> >
> >   /* read in word list, assumes one word per line */
> >   #define MAXWORD 16
> >   char wordlist[2048][MAXWORD];
> >   int word = 0;
> >   /* stop at 2048 entries so a longer file cannot overflow wordlist */
> >   while (word < 2048 && fgets(wordlist[word], MAXWORD, f)) {
> >     /* strip trailing whitespace, assumes no leading whitespace */
> >     char *ch = strpbrk(wordlist[word], " \n\t");
> >     if (ch)
> >       *ch = '\0';
> >     word++;
> >   }
> >
> >   if (word != 2048) {
> >     fprintf(stderr, "word list incorrect length\n");
> >     return 1;
> >   }
> >
> >   /* check each word for similarity against every other word */
> >   int i, j, count = 0;
> >   for (i = 0; i < 2048; i++) {
> >     for (j = i+1; j < 2048; j++) {
> >       if (print_words_if_similar(wordlist[i], wordlist[j]))
> >         count++;
> >     }
> >   }
> >
> >   printf("%d matches\n", count);
> >
> >   return 0;
> > }
> >
> >
> > _______________________________________________
> > Bitcoin-development mailing list
> > Bitcoin-development@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/bitcoin-development
>
