* [bitcoin-dev] Codex32
@ 2023-02-16 2:16 Russell O'Connor
2023-02-16 11:50 ` Pavol Rusnak
2023-02-20 18:44 ` Andrew Poelstra
0 siblings, 2 replies; 15+ messages in thread
From: Russell O'Connor @ 2023-02-16 2:16 UTC (permalink / raw)
To: Bitcoin Protocol Discussion
[-- Attachment #1: Type: text/plain, Size: 29073 bytes --]
I've been asked by Dr. Curr and Professor Snead to forward this message to
this mailing list, as it may be of general interest to Bitcoin users.
Dear Colleague:
In 1967, during excavation for the construction of a new shopping center in
Monroeville, Pennsylvania, workers uncovered a vault containing a cache of
ancient scrolls[1]. Most were severely damaged, but those that could be
recovered confirmed the existence of a secret society long suspected to
have
been active in the region around the year 200 BC.
Based on a translation of these documents, we now know that the society,
the
Cult of the Bound Variable, was devoted to the careful study of
computation,
over two millennia before the invention of the digital computer.
While the Monroeville scrolls make reference to computing machines made of
sandstone, most researchers believed this to be a poetic metaphor and that
the
"computers" were in fact the initiates themselves, carrying out the
unimaginably tedious steps of their computations with reed pens on
parchment.
Within the vault, a collection of sandstone wheels marked in a language
consisting of 32 glyphs was found. After 15 years of study, we have
successfully
completed the translation of what is known as "Codex32," a document that
describes the functions of the wheels. It was discovered that the wheels
operate
a system of cryptographic computations that was used by cult members to
safeguard their most valuable secrets.
The Codex32 system allows secrets to be carved into multiple tablets and
scattered to the far corners of the earth. When a sufficient number of
tablets are
brought together the stone wheels are manipulated in a manner to recover the
secrets. This finding may be of particular interest to the Bitcoin
community.
Below we provide a summary of the cult's secret sharing system, which is
graciously hosted at
<
https://github.com/apoelstra/bips/blob/2023-02--volvelles/bip-0000.mediawiki
>.
We are requesting a record assignment in the Bibliography of Immemorial
Philosophy (BIP) repository.
Thank you for your consideration.
Dr. Leon O. Curr and Professor Pearlwort Snead
Department of Archaeocryptography
Harry Q. Bovik Institute for the Advancement
[1] http://www.boundvariable.org/task.shtml
-----BEGIN BIP-----
<pre>
BIP: ????
Layer: Applications
Title: codex32
Author: Leon Olsson Curr and Pearlwort Sneed <pearlwort@wpsoftware.net>
Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-????
Status: Draft
Type: ????
Created: 2023-02-13
License: BSD-3-Clause
Post-History: FIXME
</pre>
==Introduction==
===Abstract===
This document describes a standard for backing up and restoring the master
seed of a
[https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki BIP-0032]
hierarchical deterministic wallet, using Shamir's secret sharing.
It includes an encoding format, a BCH error-correcting checksum, and
algorithms for share generation and secret recovery.
Secret data can be split into up to 31 shares.
A minimum threshold of shares, which can be between 1 and 9, is needed to
recover the secret, whereas without sufficient shares, no information about
the secret is recoverable.
===Copyright===
This document is licensed under the 3-clause BSD license.
===Motivation===
BIP-0032 master seed data is the source entropy used to derive all private
keys in an HD wallet.
Safely storing this secret data is the hardest and most important part of
self-custody.
However, there is a tension between security, which demands limiting the
number of backups, and resilience, which demands widely replicated backups.
Encrypting the seed does not change this fundamental tradeoff, since it
leaves essentially the same problem of how to back up the encryption key(s).
To allow users freedom to make this tradeoff, we use Shamir's secret
sharing, which guarantees that any number of shares less than the threshold
leaks no information about the secret.
This approach allows increasing safety by widely distributing the generated
shares, while also providing security against the compromise of one or more
shares (as long as fewer than the threshold have been compromised).
[https://github.com/satoshilabs/slips/blob/master/slip-0039.md SLIP-0039]
has essentially the same motivations as this standard.
However, unlike SLIP-0039, this standard also aims to be simple enough for
hand computation.
Users who demand a higher level of security for particular secrets, or have
a general distrust in digital electronic devices, have the option of using
hand computation to backup and restore secret data in an interoperable
manner.
Note that hand computation is optional, the particular details of hand
computation are outside the scope of this standard, and implementers do not
need to be concerned with this possibility.
[https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki BIP-0039]
serves the same purpose as this standard: encoding master seeds for storage
by users.
However, BIP-0039 has no error-correcting ability, cannot sensibly be
extended to support secret sharing, has no support for versioning or other
metadata, and has many technical design decisions that make implementation
and interoperability difficult (for example, the use of SHA-512 to derive
seeds, or the use of 11-bit words).
==Specification==
===codex32===
A codex32 string is similar to a Bech32 string defined in [
https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki BIP-0173].
It reuses the base32 character set from BIP-0173, and consists of:
* A human-readable part, which is the string "ms" (or "MS").
* A separator, which is always "1".
* A data part which is in turn subdivided into:
** A threshold parameter, which MUST be a single digit between "2" and "9",
or the digit "0".
*** If the threshold parameter is "0" then the share index, defined below,
MUST have a value of "s" (or "S").
** An identifier consisting of 4 Bech32 characters.
** A share index, which is any Bech32 character. Note that a share index
value of "s" (or "S") is special and denotes the unshared secret (see
section "Unshared Secret").
** A payload which is a sequence of up to 74 Bech32 characters. (However,
see '''Long codex32 Strings''' below for an exception to this limit.)
** A checksum which consists of 13 Bech32 characters as described below.
As with Bech32 strings, a codex32 string MUST be entirely uppercase or
entirely lowercase.
The lowercase form is used when determining a character's value for
checksum purposes.
For presentation, lowercase is usually preferable, but uppercase SHOULD be
used for handwritten codex32 strings.
===Checksum===
The last thirteen characters of the data part form a checksum and contain
no information.
Valid strings MUST pass the criteria for validity specified by the Python3
code snippet below.
The function <code>ms32_verify_checksum</code> must return true when its
argument is the data part as a list of integers representing the characters
converted using the bech32 character table from BIP-0173.
To construct a valid checksum given the data-part characters (excluding the
checksum), the <code>ms32_create_checksum</code> function can be used.
<source lang="python">
MS32_CONST = 0x10ce0795c2fd1e62a
def ms32_polymod(values):
GEN = [
0x19dc500ce73fde210,
0x1bfae00def77fe529,
0x1fbd920fffe7bee52,
0x1739640bdeee3fdad,
0x07729a039cfc75f5a,
]
residue = 0x23181b3
for v in values:
b = (residue >> 60)
residue = (residue & 0x0fffffffffffffff) << 5 ^ v
for i in range(5):
residue ^= GEN[i] if ((b >> i) & 1) else 0
return residue
def ms32_verify_checksum(data):
if len(data) >= 96: # See Long codex32 Strings
return ms32_verify_long_checksum(data)
if len(data) <= 93:
return ms32_polymod(data) == MS32_CONST
return False
def ms32_create_checksum(data):
if len(data) > 80: # See Long codex32 Strings
return ms32_create_long_checksum(data)
values = data
polymod = ms32_polymod(values + [0] * 13) ^ MS32_CONST
return [(polymod >> 5 * (12 - i)) & 31 for i in range(13)]
</source>
===Error Correction===
A codex32 string without a valid checksum MUST NOT be used.
The checksum is designed to be an error correcting code that can correct up
to 4 character substitutions, up to 8 unreadable characters (called
erasures), or up to 13 consecutive erasures.
Implementations SHOULD provide the user with a corrected valid codex32
string if possible.
However, implementations SHOULD NOT automatically proceed with a corrected
codex32 string without user confirmation of the corrected string, either by
prompting the user, or returning a corrected string in an error message and
allowing the user to repeat their action.
We do not specify how an implementation should implement error correction.
However, we recommend that:
* Implementations make suggestions to substitute non-bech32 characters with
bech32 characters in some situations, such as replacing "B" with "8", "O"
with "0", "I" with "l", etc.
* Implementations interpret "?" as an erasure.
* Implementations optionally interpret other non-bech32 characters, or
characters with incorrect case, as erasures.
* If a string with 8 or fewer erasures can have those erasures filled in to
make a valid codex32 string, then the implementation suggests such a string
as a correction.
* If a string consisting of valid Bech32 characters in the proper case can
be made valid by substituting 4 or fewer characters, then the
implementation suggests such a string as a correction.
===Unshared Secret===
When the share index of a valid codex32 string (converted to lowercase) is
the letter "s", we call the string a codex32 secret.
The subsequent data characters in a codex32 secret, excluding the final
checksum of 13 characters, is a direct encoding of a BIP-0032 HD master
seed.
The master seed is decoded by converting the data to bytes:
* Translate the characters to 5 bits values using the bech32 character
table from BIP-0173, most significant bit first.
* Re-arrange those bits into groups of 8 bits. Any incomplete group at the
end MUST be 4 bits or less, and is discarded.
Note that unlike the decoding process in BIP-0173, we do NOT require that
the incomplete group be all zeros.
For an unshared secret, the threshold parameter (the first character of the
data part) is ignored (beyond the fact it must be a digit for the codex32
string to be valid).
We recommend using the digit "0" for the threshold parameter in this case.
The 4 character identifier also has no effect beyond aiding users in
distinguishing between multiple different master seeds in cases where they
have more than one.
===Recovering Master Seed===
When the share index of a valid codex32 string (converted to lowercase) is
not the letter "s", we call the string an codex32 share.
The first character of the data part indicates the threshold of the share,
and it is required to be a non-"0" digit.
In order to recover a master seed, one needs a set of valid codex32 shares
such that:
* All shares have the same threshold value, the same identifier, and the
same length.
* All of the share index values are distinct.
* The number of codex32 shares is exactly equal to the (common) threshold
value.
If all the above conditions are satisfied, the <code>ms32_recover</code>
function will return a codex32 secret when its argument is the list of
codex32 shares with each share represented as a list of integers
representing the characters converted using the bech32 character table from
BIP-0173.
<source lang="python">
bech32_inv = [
0, 1, 20, 24, 10, 8, 12, 29, 5, 11, 4, 9, 6, 28, 26, 31,
22, 18, 17, 23, 2, 25, 16, 19, 3, 21, 14, 30, 13, 7, 27, 15,
]
def bech32_mul(a, b):
res = 0
for i in range(5):
res ^= a if ((b >> i) & 1) else 0
a *= 2
a ^= 41 if (32 <= a) else 0
return res
def bech32_lagrange(l, x):
n = 1
c = []
for i in l:
n = bech32_mul(n, i ^ x)
m = 1
for j in l:
m = bech32_mul(m, (x if i == j else i) ^ j)
c.append(m)
return [bech32_mul(n, bech32_inv[i]) for i in c]
def ms32_interpolate(l, x):
w = bech32_lagrange([s[5] for s in l], x)
res = []
for i in range(len(l[0])):
n = 0
for j in range(len(l)):
n ^= bech32_mul(w[j], l[j][i])
res.append(n)
return res
def ms32_recover(l):
return ms32_interpolate(l, 16)
</source>
===Generating Shares===
If we already have ''t'' valid codex32 strings such that:
* All strings have the same threshold value ''t'', the same identifier, and
the same length
* All of the share index values are distinct
Then we can derive additional shares with the <code>ms32_interpolate</code>
function by passing it a list of exactly ''t'' of these codex32 strings,
together with a fresh share index distinct from all of the existing share
indexes.
The newly derived share will have the provided share index.
Once a user has generated ''n'' codex32 shares, they may discard the
codex32 secret (if it exists).
The ''n'' shares form a ''t'' of ''n'' Shamir's secret sharing scheme of a
codex32 secret.
There are two ways to create an initial set of ''t'' valid codex32 strings,
depending on whether the user already has an existing master seed to split.
====For an existing master seed====
Before generating shares for an existing master seed, it first must be
converted into a codex32 secret, as described above.
The conversion process consists of:
* Choosing a threshold value ''t'' between 2 and 9, inclusive
* Choosing a 4 bech32 character identifier
** We do not define how to choose the identifier, beyond noting that it
SHOULD be distinct for every master seed the user may need to disambiguate.
* Setting the share index to "s"
* Setting the payload to a Bech32 encoding of the master seed, padded with
arbitrary bits
* Generating a valid checksum in accordance with the Checksum section
Along with the codex32 secret, the user must generate ''t''-1 other codex32
shares, each with the same threshold value, the same identifier, and a
distinct share index.
The set of share indexes may be chosen arbitrarily.
The payload of each of these codex32 shares is chosen uniformly at random
such that it has the same length as the payload of the codex32 secret.
For each share, a valid checksum must be generated in accordance with the
Checksum section.
The codex32 secret and the ''t''-1 codex32 shares form a set of ''t'' valid
codex32 strings from which additional shares can be derived as described
above.
====For a fresh master seed====
In the case that the user wishes to generate a fresh master seed, the user
chooses a threshold value ''t'' and an identifier, then generates ''t''
random codex32 shares, using the generation procedure from the previous
section.
As before, each share must have the same threshold value ''t'', the same
identifier, and a distinct share index.
With this set of ''t'' codex32 shares, new shares can be derived as
discussed above. This process generates a fresh master seed, whose value
can be retrieved by running the recovery process on any ''t'' of these
shares.
===Long codex32 Strings===
The 13 character checksum design only supports up to 80 data characters.
Excluding the threshold, identifier and index characters, this limits the
payload to 74 characters or 46 bytes.
While this is enough to support the 32-byte advised size of BIP-0032 master
seeds, BIP-0032 allows seeds to be up to 64 bytes in size.
We define a long codex32 string format to support these longer seeds by
defining an alternative checksum.
<source lang="python">
MS32_LONG_CONST = 0x43381e570bf4798ab26
def ms32_long_polymod(values):
GEN = [
0x3d59d273535ea62d897,
0x7a9becb6361c6c51507,
0x543f9b7e6c38d8a2a0e,
0x0c577eaeccf1990d13c,
0x1887f74f8dc71b10651,
]
residue = 0x23181b3
for v in values:
b = (residue >> 70)
residue = (residue & 0x3fffffffffffffffff) << 5 ^ v
for i in range(5):
residue ^= GEN[i] if ((b >> i) & 1) else 0
return residue
def ms32_verify_long_checksum(data):
return ms32_long_polymod(data) == MS32_LONG_CONST
def ms32_create_long_checksum(data):
values = data
polymod = ms32_long_polymod(values + [0] * 15) ^ MS32_LONG_CONST
return [(polymod >> 5 * (14 - i)) & 31 for i in range(15)]
</source>
A long codex32 string follows the same specification as a regular codex32
string with the following changes.
* The payload is a sequence of between 75 and 103 Bech32 characters.
* The checksum consists of 15 Bech32 characters as defined above.
A codex32 string with a data part of 94 or 95 characters is never legal as
a regular codex32 string is limited to 93 data characters and a long
codex32 string is at least 96 characters.
Generation of long shares and recovery of the master seed from long shares
proceeds in exactly the same way as for regular shares with the
<code>ms32_interpolate</code> function.
The long checksum is designed to be an error correcting code that can
correct up to 4 character substitutions, up to 8 unreadable characters
(called erasures), or up to 15 consecutive erasures.
As with regular checksums we do not specify how an implementation should
implement error correction, and all our recommendations for error
correction of regular codex32 strings also apply to long codex32 strings.
==Rationale==
This scheme is based on the observation that the Lagrange interpolation of
valid codewords in a BCH code will always be a valid codeword.
This means that derived shares will always have valid checksum, and a
sufficient threshold of shares with valid checksums will derive a secret
with a valid checksum.
The header system is also compatible with Lagrange interpolation, meaning
all derived shares will have the same identifier and will have the
appropriate share index.
This fact allows the header data to be covered by the checksum.
The checksum size and identifier size have been chosen so that the encoding
of 128-bit seeds and shares fit within 48 characters.
This is a standard size for many common seed storage formats, which has
been popularized by the 12 four-letter word format of the BIP-0039 mnemonic.
The 13 character checksum is adequate to correct 4 errors in up to 93
characters (80 characters of data and 13 characters of the checksum). This
is somewhat better quality than the checksum used in SLIP-0039.
For 256-bit seeds and shares our strings are 74 characters, which fits into
the 96 character format of the 24 four-letter word format of the BIP-0039
mnemonic, with plenty of room to spare.
A longer checksum is needed to support up to 512-bit seeds, the longest
seed length specified in BIP-0032, as the 13 character checksum isn't
adequate for more than 80 data characters.
While we could use the 15 character checksum for both cases, we prefer to
keep the strings as short as possible for the more common cases of 128-bit
and 256-bit master seeds.
We only guarantee to correct 4 characters no matter how long the string is.
Longer strings mean more chances for transcription errors, so shorter
strings are better.
The longest data part using the regular 13 character checksum is 93
characters and corresponds to a 400-bit secret.
At this length, the prefix <code>MS1</code> is not covered by the checksum.
This is acceptable because the checksum scheme itself requires you to know
that the <code>MS1</code> prefix is being used in the first place.
If the prefix is damaged and a user is guessing that the data might be
using this scheme, then the user can enter the available data explicitly
using the suspected <code>MS1</code> prefix.
==Backwards Compatibility==
codex32 is an alternative to BIP-0039 and SLIP-0039.
It is technically possible to derive the BIP32 master seed from seed words
encoded in one of these schemes, and then to encode this seed in codex32.
For BIP-0039 this process is irreversible, since it involves hashing the
original words.
Furthermore, the resulting seed will be 512 bits long, which may be too
large to be safely and conveniently handled.
SLIP-0039 seed words can be reversibly converted to master seeds, so it is
possible to interconvert between SLIP-0039 and codex32.
However, SLIP-0039 '''shares''' cannot be converted to codex32 shares
because the two schemes use a different underlying field.
The authors of this BIP do not recommend interconversion.
Instead, users who wish to switch to codex32 should generate a fresh seed
and sweep their coins.
==Reference Implementation==
* [https://secretcodex32.com/docs/2023-02-14--bw.ps Reference PostScript
Implementation]
* FIXME add Python implementation
* FIXME add Rust implementation
==Test Vectors==
===Test vector 1===
This example shows the codex32 format, when used without splitting the
secret into any shares.
The data part contains 26 Bech32 characters, which corresponds to 130 bits.
We truncate the last two bits in order to obtain a 128-bit master seed.
codex32 secret (Bech32):
<code>ms10testsxxxxxxxxxxxxxxxxxxxxxxxxxx4nzvca9cmczlw</code>
Master secret (hex): <code>318c6318c6318c6318c6318c6318c631</code>
* human-readable part: <code>ms</code>
* separator: <code>1</code>
* k value: <code>0</code> (no secret splitting)
* identifier: <code>test</code>
* share index: <code>s</code> (the secret)
* data: <code>xxxxxxxxxxxxxxxxxxxxxxxxxx</code>
* checksum: <code>4nzvca9cmczlw</code>
===Test vector 2===
This example shows generating a new master seed using "random" codex32
shares, as well as deriving an additional codex32 share, using ''k''=2 and
an identifier of <code>NAME</code>.
Although codex32 strings are canonically all lowercase, it's also valid to
use all uppercase.
Share with index <code>A</code>:
<code>MS12NAMEA320ZYXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM</code>
Share with index <code>C</code>:
<code>MS12NAMECACDEFGHJKLMNPQRSTUVWXYZ023FTR2GDZMPY6PN</code>
* Derived share with index <code>D</code>:
<code>MS12NAMEDLL4F8JLH4E5VDVULDLFXU2JHDNLSM97XVENRXEG</code>
* Secret share with index <code>S</code>:
<code>MS12NAMES6XQGUZTTXKEQNJSJZV4JV3NZ5K3KWGSPHUH6EVW</code>
* Master secret (hex): <code>d1808e096b35b209ca12132b264662a5</code>
Note that per BIP-0173, the lowercase form is used when determining a
character's value for checksum purposes.
In particular, given an all uppercase codex32 string, we still use
lowercase <code>ms</code> as the human-readable part during checksum
construction.
===Test vector 3===
This example shows splitting an existing 128-bit master seed into "random"
codex32 shares, using ''k''=3 and an identifier of <code>cash</code>.
We appended two zero bits in order to obtain 26 Bech32 characters (130 bits
of data) from the 128-bit master seed.
Master secret (hex): <code>ffeeddccbbaa99887766554433221100</code>
Secret share with index <code>s</code>:
<code>ms13cashsllhdmn9m42vcsamx24zrxgs3qqjzqud4m0d6nln</code>
Share with index <code>a</code>:
<code>ms13casha320zyxwvutsrqpnmlkjhgfedca2a8d0zehn8a0t</code>
Share with index <code>c</code>:
<code>ms13cashcacdefghjklmnpqrstuvwxyz023949xq35my48dr</code>
* Derived share with index <code>d</code>:
<code>ms13cashd0wsedstcdcts64cd7wvy4m90lm28w4ffupqs7rm</code>
* Derived share with index <code>e</code>:
<code>ms13casheekgpemxzshcrmqhaydlp6yhms3ws7320xyxsar9</code>
* Derived share with index <code>f</code>:
<code>ms13cashf8jh6sdrkpyrsp5ut94pj8ktehhw2hfvyrj48704</code>
Any three of the five shares among <code>acdef</code> can be used to
recover the secret.
Note that the choice to append two zero bits was arbitrary, and any of the
following four secret shares would have been valid choices.
However, each choice would have resulted in a different set of derived
shares.
* <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qqjzqud4m0d6nln</code>
* <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qpte35dvzkjpt0r</code>
* <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qzfatvdwq5692k6</code>
* <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qrsx6ydhed97jx2</code>
===Test vector 4===
This example shows converting a 256-bit secret into a codex32 secret,
without splitting the secret into any shares.
We appended four zero bits in order to obtain 52 Bech32 characters (260
bits of data) from the 256-bit secret.
256-bit secret (hex):
<code>ffeeddccbbaa99887766554433221100ffeeddccbbaa99887766554433221100</code>
* codex32 secret:
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqqtum9pgv99ycma</code>
Note that the choice to append four zero bits was arbitrary, and any of the
following sixteen codex32 secrets would have been valid:
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqqtum9pgv99ycma</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqpj82dp34u6lqtd</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqzsrs4pnh7jmpj5</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqrfcpap2w8dqezy</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqy5tdvphn6znrf0</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq9dsuypw2ragmel</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqx05xupvgp4v6qx</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq8k0h5p43c2hzsk</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqgum7hplmjtr8ks</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqf9q0lpxzt5clxq</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq28y48pyqfuu7le</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqt7ly0paesr8x0f</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqvrvg7pqydv5uyz</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqd6hekpea5n0y5j</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqwcnrwpmlkmt9dt</code>
*
<code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq0pgjxpzx0ysaam</code>
===Test vector 5===
This example shows generating a new 512-bit master seed using "random"
codex32 characters and appending a checksum.
The payload contains 103 Bech32 characters, which corresponds to 515 bits.
The last three bits are discarded when converting to a 512-bit master seed.
This is an example of a '''Long codex32 String'''.
* Secret share with index <code>S</code>:
<code>MS100C8VSM32ZXFGUHPCHTLUPZRY9X8GF2TVDW0S3JN54KHCE6MUA7LQPZYGSFJD6AN074RXVCEMLH8WU3TK925ACDEFGHJKLMNPQRSTUVWXY06FHPV80UNDVARHRAK</code>
* Master secret (hex):
<code>dc5423251cb87175ff8110c8531d0952d8d73e1194e95b5f19d6f9df7c01111104c9baecdfea8cccc677fb9ddc8aec5553b86e528bcadfdcc201c17c638c47e9</code>
==Appendix==
===Mathematical Companion===
Below we use the Bech32 character set to denote values in GF[32].
In Bech32, the letter <code>Q</code> denotes zero and the letter
<code>P</code> denotes one.
The digits <code>0</code> and <code>2</code> through <code>9</code> do
''not'' denote their numeric values.
They are simply elements of GF[32].
The generating polynomial for our BCH code is as follows.
We extend GF[32] to GF[1024] by adjoining a primitive cube root of unity,
<code>ζ</code>, satisfying <code>ζ^2 = ζ + P</code>.
We select <code>β := G ζ</code> which has order 93, and construct the
product <code>(x - β^i)</code> for <code>i</code> in <code>{17, 20, 46, 49,
52, 77, 78, 79, 80, 81, 82, 83, 84}</code>.
The resulting polynomial is our generating polynomial for our 13 character
checksum:
x^13 + E x^12 + M x^11 + 3 x^10 + G x^9 + Q x^8 + E x^7 + E x^6 + E x^5
+ L x^4 + M x^3 + C x^2 + S x + S
For our long checksum, we select <code>γ := E + X ζ</code>, which has order
1023, and construct the product <code>(x - γ^i)</code> for <code>i</code>
in <code>{32, 64, 96, 895, 927, 959, 991, 1019, 1020, 1021, 1022, 1023,
1024, 1025, 1026}</code>.
The resulting polynomial is our generating polynomial for our 15 character
checksum for long strings:
x^15 + 0 x^14 + 2 x^13 + E x^12 + 6 x^11 + F x^10 + E x^9 + 4 x^8 + X
x^7 + H x^6 + 4 x^5 + X x^4 + 9 x^3 + K x^2 + Y x^1 + H
(Reminder: the character <code>0</code> does ''not'' denote the zero of the
field.)
-----END BIP-----
[-- Attachment #2: Type: text/html, Size: 91685 bytes --]
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-16 2:16 [bitcoin-dev] Codex32 Russell O'Connor @ 2023-02-16 11:50 ` Pavol Rusnak 2023-02-16 13:49 ` Andrew Poelstra 2023-02-20 18:44 ` Andrew Poelstra 1 sibling, 1 reply; 15+ messages in thread From: Pavol Rusnak @ 2023-02-16 11:50 UTC (permalink / raw) To: Russell O'Connor, Bitcoin Protocol Discussion [-- Attachment #1: Type: text/plain, Size: 31670 bytes --] Hi! The BIP states that its only advantage over SLIP-0039, which has been used in production for nearly three years (in at at least 3 SW/HW wallet implementations), is that it aims to be simple enough for hand computation. However, the BIP also indicates that "details of hand computation are outside the scope of this standard, and implementers do not need to be concerned with this possibility." Therefore, I am curious about how significant this advantage over SLIP-0039 really is. If hand computation is not straightforward and there are no other substantial advantages over SLIP-0039, I cannot help but feel that this BIP is simply a result of not-invented-here syndrome, but please correct me if I am wrong. Keep in mind that the encoded shares in SLIP-0039 consist of exactly 200 or 330 bits, both of which are divisible by 5. This makes it straightforward to encode them as Bech32 strings. On Thu, 16 Feb 2023 at 09:30, Russell O'Connor via bitcoin-dev < bitcoin-dev@lists.linuxfoundation.org> wrote: > I've been asked by Dr. Curr and Professor Snead to forward this message to > this mailing list, as it may be of general interest to Bitcoin users. > > Dear Colleague: > > In 1967, during excavation for the construction of a new shopping center > in > Monroeville, Pennsylvania, workers uncovered a vault containing a cache of > ancient scrolls[1]. Most were severely damaged, but those that could be > recovered confirmed the existence of a secret society long suspected to > have > been active in the region around the year 200 BC. > > Based on a translation of these documents, we now know that the society, > the > Cult of the Bound Variable, was devoted to the careful study of > computation, > over two millennia before the invention of the digital computer. > > While the Monroeville scrolls make reference to computing machines made of > sandstone, most researchers believed this to be a poetic metaphor and that > the > "computers" were in fact the initiates themselves, carrying out the > unimaginably tedious steps of their computations with reed pens on > parchment. > > Within the vault, a collection of sandstone wheels marked in a language > consisting of 32 glyphs was found. After 15 years of study, we have > successfully > completed the translation of what is known as "Codex32," a document that > describes the functions of the wheels. It was discovered that the wheels > operate > a system of cryptographic computations that was used by cult members to > safeguard their most valuable secrets. > > The Codex32 system allows secrets to be carved into multiple tablets and > scattered to the far corners of the earth. When a sufficient number of > tablets are > brought together the stone wheels are manipulated in a manner to recover > the > secrets. This finding may be of particular interest to the Bitcoin > community. > > Below we provide a summary of the cult's secret sharing system, which is > graciously hosted at > < > https://github.com/apoelstra/bips/blob/2023-02--volvelles/bip-0000.mediawiki > >. > We are requesting a record assignment in the Bibliography of Immemorial > Philosophy (BIP) repository. > > Thank you for your consideration. > > Dr. Leon O. Curr and Professor Pearlwort Snead > Department of Archaeocryptography > Harry Q. Bovik Institute for the Advancement > > [1] http://www.boundvariable.org/task.shtml > > -----BEGIN BIP----- > > <pre> > BIP: ???? > Layer: Applications > Title: codex32 > Author: Leon Olsson Curr and Pearlwort Sneed <pearlwort@wpsoftware.net> > Comments-URI: https://github.com/bitcoin/bips/wiki/Comments:BIP-???? > Status: Draft > Type: ???? > Created: 2023-02-13 > License: BSD-3-Clause > Post-History: FIXME > </pre> > > ==Introduction== > > ===Abstract=== > > This document describes a standard for backing up and restoring the master > seed of a > [https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki BIP-0032] > hierarchical deterministic wallet, using Shamir's secret sharing. > It includes an encoding format, a BCH error-correcting checksum, and > algorithms for share generation and secret recovery. > Secret data can be split into up to 31 shares. > A minimum threshold of shares, which can be between 1 and 9, is needed to > recover the secret, whereas without sufficient shares, no information about > the secret is recoverable. > > ===Copyright=== > > This document is licensed under the 3-clause BSD license. > > ===Motivation=== > > BIP-0032 master seed data is the source entropy used to derive all private > keys in an HD wallet. > Safely storing this secret data is the hardest and most important part of > self-custody. > However, there is a tension between security, which demands limiting the > number of backups, and resilience, which demands widely replicated backups. > Encrypting the seed does not change this fundamental tradeoff, since it > leaves essentially the same problem of how to back up the encryption key(s). > > To allow users freedom to make this tradeoff, we use Shamir's secret > sharing, which guarantees that any number of shares less than the threshold > leaks no information about the secret. > This approach allows increasing safety by widely distributing the > generated shares, while also providing security against the compromise of > one or more shares (as long as fewer than the threshold have been > compromised). > > [https://github.com/satoshilabs/slips/blob/master/slip-0039.md SLIP-0039] > has essentially the same motivations as this standard. > However, unlike SLIP-0039, this standard also aims to be simple enough for > hand computation. > Users who demand a higher level of security for particular secrets, or > have a general distrust in digital electronic devices, have the option of > using hand computation to backup and restore secret data in an > interoperable manner. > Note that hand computation is optional, the particular details of hand > computation are outside the scope of this standard, and implementers do not > need to be concerned with this possibility. > > [https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki BIP-0039] > serves the same purpose as this standard: encoding master seeds for storage > by users. > However, BIP-0039 has no error-correcting ability, cannot sensibly be > extended to support secret sharing, has no support for versioning or other > metadata, and has many technical design decisions that make implementation > and interoperability difficult (for example, the use of SHA-512 to derive > seeds, or the use of 11-bit words). > > ==Specification== > > ===codex32=== > > A codex32 string is similar to a Bech32 string defined in [ > https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki BIP-0173]. > It reuses the base32 character set from BIP-0173, and consists of: > > * A human-readable part, which is the string "ms" (or "MS"). > * A separator, which is always "1". > * A data part which is in turn subdivided into: > ** A threshold parameter, which MUST be a single digit between "2" and > "9", or the digit "0". > *** If the threshold parameter is "0" then the share index, defined below, > MUST have a value of "s" (or "S"). > ** An identifier consisting of 4 Bech32 characters. > ** A share index, which is any Bech32 character. Note that a share index > value of "s" (or "S") is special and denotes the unshared secret (see > section "Unshared Secret"). > ** A payload which is a sequence of up to 74 Bech32 characters. (However, > see '''Long codex32 Strings''' below for an exception to this limit.) > ** A checksum which consists of 13 Bech32 characters as described below. > > As with Bech32 strings, a codex32 string MUST be entirely uppercase or > entirely lowercase. > The lowercase form is used when determining a character's value for > checksum purposes. > For presentation, lowercase is usually preferable, but uppercase SHOULD be > used for handwritten codex32 strings. > > ===Checksum=== > > The last thirteen characters of the data part form a checksum and contain > no information. > Valid strings MUST pass the criteria for validity specified by the Python3 > code snippet below. > The function <code>ms32_verify_checksum</code> must return true when its > argument is the data part as a list of integers representing the characters > converted using the bech32 character table from BIP-0173. > > To construct a valid checksum given the data-part characters (excluding > the checksum), the <code>ms32_create_checksum</code> function can be used. > > <source lang="python"> > MS32_CONST = 0x10ce0795c2fd1e62a > > def ms32_polymod(values): > GEN = [ > 0x19dc500ce73fde210, > 0x1bfae00def77fe529, > 0x1fbd920fffe7bee52, > 0x1739640bdeee3fdad, > 0x07729a039cfc75f5a, > ] > residue = 0x23181b3 > for v in values: > b = (residue >> 60) > residue = (residue & 0x0fffffffffffffff) << 5 ^ v > for i in range(5): > residue ^= GEN[i] if ((b >> i) & 1) else 0 > return residue > > def ms32_verify_checksum(data): > if len(data) >= 96: # See Long codex32 Strings > return ms32_verify_long_checksum(data) > if len(data) <= 93: > return ms32_polymod(data) == MS32_CONST > return False > > def ms32_create_checksum(data): > if len(data) > 80: # See Long codex32 Strings > return ms32_create_long_checksum(data) > values = data > polymod = ms32_polymod(values + [0] * 13) ^ MS32_CONST > return [(polymod >> 5 * (12 - i)) & 31 for i in range(13)] > </source> > > ===Error Correction=== > > A codex32 string without a valid checksum MUST NOT be used. > The checksum is designed to be an error correcting code that can correct > up to 4 character substitutions, up to 8 unreadable characters (called > erasures), or up to 13 consecutive erasures. > Implementations SHOULD provide the user with a corrected valid codex32 > string if possible. > However, implementations SHOULD NOT automatically proceed with a corrected > codex32 string without user confirmation of the corrected string, either by > prompting the user, or returning a corrected string in an error message and > allowing the user to repeat their action. > We do not specify how an implementation should implement error correction. > However, we recommend that: > > * Implementations make suggestions to substitute non-bech32 characters > with bech32 characters in some situations, such as replacing "B" with "8", > "O" with "0", "I" with "l", etc. > * Implementations interpret "?" as an erasure. > * Implementations optionally interpret other non-bech32 characters, or > characters with incorrect case, as erasures. > * If a string with 8 or fewer erasures can have those erasures filled in > to make a valid codex32 string, then the implementation suggests such a > string as a correction. > * If a string consisting of valid Bech32 characters in the proper case can > be made valid by substituting 4 or fewer characters, then the > implementation suggests such a string as a correction. > > ===Unshared Secret=== > > When the share index of a valid codex32 string (converted to lowercase) is > the letter "s", we call the string a codex32 secret. > The subsequent data characters in a codex32 secret, excluding the final > checksum of 13 characters, is a direct encoding of a BIP-0032 HD master > seed. > > The master seed is decoded by converting the data to bytes: > > * Translate the characters to 5 bits values using the bech32 character > table from BIP-0173, most significant bit first. > * Re-arrange those bits into groups of 8 bits. Any incomplete group at the > end MUST be 4 bits or less, and is discarded. > > Note that unlike the decoding process in BIP-0173, we do NOT require that > the incomplete group be all zeros. > > For an unshared secret, the threshold parameter (the first character of > the data part) is ignored (beyond the fact it must be a digit for the > codex32 string to be valid). > We recommend using the digit "0" for the threshold parameter in this case. > The 4 character identifier also has no effect beyond aiding users in > distinguishing between multiple different master seeds in cases where they > have more than one. > > ===Recovering Master Seed=== > > When the share index of a valid codex32 string (converted to lowercase) is > not the letter "s", we call the string an codex32 share. > The first character of the data part indicates the threshold of the share, > and it is required to be a non-"0" digit. > > In order to recover a master seed, one needs a set of valid codex32 shares > such that: > > * All shares have the same threshold value, the same identifier, and the > same length. > * All of the share index values are distinct. > * The number of codex32 shares is exactly equal to the (common) threshold > value. > > If all the above conditions are satisfied, the <code>ms32_recover</code> > function will return a codex32 secret when its argument is the list of > codex32 shares with each share represented as a list of integers > representing the characters converted using the bech32 character table from > BIP-0173. > > <source lang="python"> > bech32_inv = [ > 0, 1, 20, 24, 10, 8, 12, 29, 5, 11, 4, 9, 6, 28, 26, 31, > 22, 18, 17, 23, 2, 25, 16, 19, 3, 21, 14, 30, 13, 7, 27, 15, > ] > > def bech32_mul(a, b): > res = 0 > for i in range(5): > res ^= a if ((b >> i) & 1) else 0 > a *= 2 > a ^= 41 if (32 <= a) else 0 > return res > > def bech32_lagrange(l, x): > n = 1 > c = [] > for i in l: > n = bech32_mul(n, i ^ x) > m = 1 > for j in l: > m = bech32_mul(m, (x if i == j else i) ^ j) > c.append(m) > return [bech32_mul(n, bech32_inv[i]) for i in c] > > def ms32_interpolate(l, x): > w = bech32_lagrange([s[5] for s in l], x) > res = [] > for i in range(len(l[0])): > n = 0 > for j in range(len(l)): > n ^= bech32_mul(w[j], l[j][i]) > res.append(n) > return res > > def ms32_recover(l): > return ms32_interpolate(l, 16) > </source> > > ===Generating Shares=== > > If we already have ''t'' valid codex32 strings such that: > > * All strings have the same threshold value ''t'', the same identifier, > and the same length > * All of the share index values are distinct > > Then we can derive additional shares with the > <code>ms32_interpolate</code> function by passing it a list of exactly > ''t'' of these codex32 strings, together with a fresh share index distinct > from all of the existing share indexes. > The newly derived share will have the provided share index. > > Once a user has generated ''n'' codex32 shares, they may discard the > codex32 secret (if it exists). > The ''n'' shares form a ''t'' of ''n'' Shamir's secret sharing scheme of a > codex32 secret. > > There are two ways to create an initial set of ''t'' valid codex32 > strings, depending on whether the user already has an existing master seed > to split. > > ====For an existing master seed==== > > Before generating shares for an existing master seed, it first must be > converted into a codex32 secret, as described above. > The conversion process consists of: > > * Choosing a threshold value ''t'' between 2 and 9, inclusive > * Choosing a 4 bech32 character identifier > ** We do not define how to choose the identifier, beyond noting that it > SHOULD be distinct for every master seed the user may need to disambiguate. > * Setting the share index to "s" > * Setting the payload to a Bech32 encoding of the master seed, padded with > arbitrary bits > * Generating a valid checksum in accordance with the Checksum section > > Along with the codex32 secret, the user must generate ''t''-1 other > codex32 shares, each with the same threshold value, the same identifier, > and a distinct share index. > The set of share indexes may be chosen arbitrarily. > The payload of each of these codex32 shares is chosen uniformly at random > such that it has the same length as the payload of the codex32 secret. > For each share, a valid checksum must be generated in accordance with the > Checksum section. > > The codex32 secret and the ''t''-1 codex32 shares form a set of ''t'' > valid codex32 strings from which additional shares can be derived as > described above. > > ====For a fresh master seed==== > > In the case that the user wishes to generate a fresh master seed, the user > chooses a threshold value ''t'' and an identifier, then generates ''t'' > random codex32 shares, using the generation procedure from the previous > section. > As before, each share must have the same threshold value ''t'', the same > identifier, and a distinct share index. > > With this set of ''t'' codex32 shares, new shares can be derived as > discussed above. This process generates a fresh master seed, whose value > can be retrieved by running the recovery process on any ''t'' of these > shares. > > ===Long codex32 Strings=== > > The 13 character checksum design only supports up to 80 data characters. > Excluding the threshold, identifier and index characters, this limits the > payload to 74 characters or 46 bytes. > While this is enough to support the 32-byte advised size of BIP-0032 > master seeds, BIP-0032 allows seeds to be up to 64 bytes in size. > We define a long codex32 string format to support these longer seeds by > defining an alternative checksum. > > <source lang="python"> > MS32_LONG_CONST = 0x43381e570bf4798ab26 > > def ms32_long_polymod(values): > GEN = [ > 0x3d59d273535ea62d897, > 0x7a9becb6361c6c51507, > 0x543f9b7e6c38d8a2a0e, > 0x0c577eaeccf1990d13c, > 0x1887f74f8dc71b10651, > ] > residue = 0x23181b3 > for v in values: > b = (residue >> 70) > residue = (residue & 0x3fffffffffffffffff) << 5 ^ v > for i in range(5): > residue ^= GEN[i] if ((b >> i) & 1) else 0 > return residue > > def ms32_verify_long_checksum(data): > return ms32_long_polymod(data) == MS32_LONG_CONST > > def ms32_create_long_checksum(data): > values = data > polymod = ms32_long_polymod(values + [0] * 15) ^ MS32_LONG_CONST > return [(polymod >> 5 * (14 - i)) & 31 for i in range(15)] > </source> > > A long codex32 string follows the same specification as a regular codex32 > string with the following changes. > > * The payload is a sequence of between 75 and 103 Bech32 characters. > * The checksum consists of 15 Bech32 characters as defined above. > > A codex32 string with a data part of 94 or 95 characters is never legal as > a regular codex32 string is limited to 93 data characters and a long > codex32 string is at least 96 characters. > > Generation of long shares and recovery of the master seed from long shares > proceeds in exactly the same way as for regular shares with the > <code>ms32_interpolate</code> function. > > The long checksum is designed to be an error correcting code that can > correct up to 4 character substitutions, up to 8 unreadable characters > (called erasures), or up to 15 consecutive erasures. > As with regular checksums we do not specify how an implementation should > implement error correction, and all our recommendations for error > correction of regular codex32 strings also apply to long codex32 strings. > > ==Rationale== > > This scheme is based on the observation that the Lagrange interpolation of > valid codewords in a BCH code will always be a valid codeword. > This means that derived shares will always have valid checksum, and a > sufficient threshold of shares with valid checksums will derive a secret > with a valid checksum. > > The header system is also compatible with Lagrange interpolation, meaning > all derived shares will have the same identifier and will have the > appropriate share index. > This fact allows the header data to be covered by the checksum. > > The checksum size and identifier size have been chosen so that the > encoding of 128-bit seeds and shares fit within 48 characters. > This is a standard size for many common seed storage formats, which has > been popularized by the 12 four-letter word format of the BIP-0039 mnemonic. > > The 13 character checksum is adequate to correct 4 errors in up to 93 > characters (80 characters of data and 13 characters of the checksum). This > is somewhat better quality than the checksum used in SLIP-0039. > > For 256-bit seeds and shares our strings are 74 characters, which fits > into the 96 character format of the 24 four-letter word format of the > BIP-0039 mnemonic, with plenty of room to spare. > > A longer checksum is needed to support up to 512-bit seeds, the longest > seed length specified in BIP-0032, as the 13 character checksum isn't > adequate for more than 80 data characters. > While we could use the 15 character checksum for both cases, we prefer to > keep the strings as short as possible for the more common cases of 128-bit > and 256-bit master seeds. > We only guarantee to correct 4 characters no matter how long the string is. > Longer strings mean more chances for transcription errors, so shorter > strings are better. > > The longest data part using the regular 13 character checksum is 93 > characters and corresponds to a 400-bit secret. > At this length, the prefix <code>MS1</code> is not covered by the checksum. > This is acceptable because the checksum scheme itself requires you to know > that the <code>MS1</code> prefix is being used in the first place. > If the prefix is damaged and a user is guessing that the data might be > using this scheme, then the user can enter the available data explicitly > using the suspected <code>MS1</code> prefix. > > ==Backwards Compatibility== > > codex32 is an alternative to BIP-0039 and SLIP-0039. > It is technically possible to derive the BIP32 master seed from seed words > encoded in one of these schemes, and then to encode this seed in codex32. > For BIP-0039 this process is irreversible, since it involves hashing the > original words. > Furthermore, the resulting seed will be 512 bits long, which may be too > large to be safely and conveniently handled. > > SLIP-0039 seed words can be reversibly converted to master seeds, so it is > possible to interconvert between SLIP-0039 and codex32. > However, SLIP-0039 '''shares''' cannot be converted to codex32 shares > because the two schemes use a different underlying field. > > The authors of this BIP do not recommend interconversion. > Instead, users who wish to switch to codex32 should generate a fresh seed > and sweep their coins. > > ==Reference Implementation== > > * [https://secretcodex32.com/docs/2023-02-14--bw.ps Reference PostScript > Implementation] > * FIXME add Python implementation > * FIXME add Rust implementation > > ==Test Vectors== > > ===Test vector 1=== > > This example shows the codex32 format, when used without splitting the > secret into any shares. > The data part contains 26 Bech32 characters, which corresponds to 130 > bits. We truncate the last two bits in order to obtain a 128-bit master > seed. > > codex32 secret (Bech32): > <code>ms10testsxxxxxxxxxxxxxxxxxxxxxxxxxx4nzvca9cmczlw</code> > > Master secret (hex): <code>318c6318c6318c6318c6318c6318c631</code> > > * human-readable part: <code>ms</code> > * separator: <code>1</code> > * k value: <code>0</code> (no secret splitting) > * identifier: <code>test</code> > * share index: <code>s</code> (the secret) > * data: <code>xxxxxxxxxxxxxxxxxxxxxxxxxx</code> > * checksum: <code>4nzvca9cmczlw</code> > > ===Test vector 2=== > > This example shows generating a new master seed using "random" codex32 > shares, as well as deriving an additional codex32 share, using ''k''=2 and > an identifier of <code>NAME</code>. > Although codex32 strings are canonically all lowercase, it's also valid to > use all uppercase. > > Share with index <code>A</code>: > <code>MS12NAMEA320ZYXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM</code> > > Share with index <code>C</code>: > <code>MS12NAMECACDEFGHJKLMNPQRSTUVWXYZ023FTR2GDZMPY6PN</code> > > * Derived share with index <code>D</code>: > <code>MS12NAMEDLL4F8JLH4E5VDVULDLFXU2JHDNLSM97XVENRXEG</code> > * Secret share with index <code>S</code>: > <code>MS12NAMES6XQGUZTTXKEQNJSJZV4JV3NZ5K3KWGSPHUH6EVW</code> > * Master secret (hex): <code>d1808e096b35b209ca12132b264662a5</code> > > Note that per BIP-0173, the lowercase form is used when determining a > character's value for checksum purposes. > In particular, given an all uppercase codex32 string, we still use > lowercase <code>ms</code> as the human-readable part during checksum > construction. > > ===Test vector 3=== > > This example shows splitting an existing 128-bit master seed into "random" > codex32 shares, using ''k''=3 and an identifier of <code>cash</code>. > We appended two zero bits in order to obtain 26 Bech32 characters (130 > bits of data) from the 128-bit master seed. > > Master secret (hex): <code>ffeeddccbbaa99887766554433221100</code> > > Secret share with index <code>s</code>: > <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qqjzqud4m0d6nln</code> > > Share with index <code>a</code>: > <code>ms13casha320zyxwvutsrqpnmlkjhgfedca2a8d0zehn8a0t</code> > > Share with index <code>c</code>: > <code>ms13cashcacdefghjklmnpqrstuvwxyz023949xq35my48dr</code> > > * Derived share with index <code>d</code>: > <code>ms13cashd0wsedstcdcts64cd7wvy4m90lm28w4ffupqs7rm</code> > * Derived share with index <code>e</code>: > <code>ms13casheekgpemxzshcrmqhaydlp6yhms3ws7320xyxsar9</code> > * Derived share with index <code>f</code>: > <code>ms13cashf8jh6sdrkpyrsp5ut94pj8ktehhw2hfvyrj48704</code> > > Any three of the five shares among <code>acdef</code> can be used to > recover the secret. > > Note that the choice to append two zero bits was arbitrary, and any of the > following four secret shares would have been valid choices. > However, each choice would have resulted in a different set of derived > shares. > > * <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qqjzqud4m0d6nln</code> > * <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qpte35dvzkjpt0r</code> > * <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qzfatvdwq5692k6</code> > * <code>ms13cashsllhdmn9m42vcsamx24zrxgs3qrsx6ydhed97jx2</code> > > ===Test vector 4=== > > This example shows converting a 256-bit secret into a codex32 secret, > without splitting the secret into any shares. > We appended four zero bits in order to obtain 52 Bech32 characters (260 > bits of data) from the 256-bit secret. > > 256-bit secret (hex): > <code>ffeeddccbbaa99887766554433221100ffeeddccbbaa99887766554433221100</code> > > * codex32 secret: > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqqtum9pgv99ycma</code> > > Note that the choice to append four zero bits was arbitrary, and any of > the following sixteen codex32 secrets would have been valid: > > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqqtum9pgv99ycma</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqpj82dp34u6lqtd</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqzsrs4pnh7jmpj5</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqrfcpap2w8dqezy</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqy5tdvphn6znrf0</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq9dsuypw2ragmel</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqx05xupvgp4v6qx</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq8k0h5p43c2hzsk</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqgum7hplmjtr8ks</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqf9q0lpxzt5clxq</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq28y48pyqfuu7le</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqt7ly0paesr8x0f</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqvrvg7pqydv5uyz</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqd6hekpea5n0y5j</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyqwcnrwpmlkmt9dt</code> > * > <code>ms10leetsllhdmn9m42vcsamx24zrxgs3qrl7ahwvhw4fnzrhve25gvezzyq0pgjxpzx0ysaam</code> > > ===Test vector 5=== > > This example shows generating a new 512-bit master seed using "random" > codex32 characters and appending a checksum. > The payload contains 103 Bech32 characters, which corresponds to 515 bits. > The last three bits are discarded when converting to a 512-bit master seed. > > This is an example of a '''Long codex32 String'''. > > * Secret share with index <code>S</code>: > <code>MS100C8VSM32ZXFGUHPCHTLUPZRY9X8GF2TVDW0S3JN54KHCE6MUA7LQPZYGSFJD6AN074RXVCEMLH8WU3TK925ACDEFGHJKLMNPQRSTUVWXY06FHPV80UNDVARHRAK</code> > * Master secret (hex): > <code>dc5423251cb87175ff8110c8531d0952d8d73e1194e95b5f19d6f9df7c01111104c9baecdfea8cccc677fb9ddc8aec5553b86e528bcadfdcc201c17c638c47e9</code> > > ==Appendix== > > ===Mathematical Companion=== > > Below we use the Bech32 character set to denote values in GF[32]. > In Bech32, the letter <code>Q</code> denotes zero and the letter > <code>P</code> denotes one. > The digits <code>0</code> and <code>2</code> through <code>9</code> do > ''not'' denote their numeric values. > They are simply elements of GF[32]. > > The generating polynomial for our BCH code is as follows. > > We extend GF[32] to GF[1024] by adjoining a primitive cube root of unity, > <code>ζ</code>, satisfying <code>ζ^2 = ζ + P</code>. > > We select <code>β := G ζ</code> which has order 93, and construct the > product <code>(x - β^i)</code> for <code>i</code> in <code>{17, 20, 46, 49, > 52, 77, 78, 79, 80, 81, 82, 83, 84}</code>. > The resulting polynomial is our generating polynomial for our 13 character > checksum: > > x^13 + E x^12 + M x^11 + 3 x^10 + G x^9 + Q x^8 + E x^7 + E x^6 + E > x^5 + L x^4 + M x^3 + C x^2 + S x + S > > For our long checksum, we select <code>γ := E + X ζ</code>, which has > order 1023, and construct the product <code>(x - γ^i)</code> for > <code>i</code> in <code>{32, 64, 96, 895, 927, 959, 991, 1019, 1020, 1021, > 1022, 1023, 1024, 1025, 1026}</code>. > The resulting polynomial is our generating polynomial for our 15 character > checksum for long strings: > > x^15 + 0 x^14 + 2 x^13 + E x^12 + 6 x^11 + F x^10 + E x^9 + 4 x^8 + X > x^7 + H x^6 + 4 x^5 + X x^4 + 9 x^3 + K x^2 + Y x^1 + H > > (Reminder: the character <code>0</code> does ''not'' denote the zero of > the field.) > > -----END BIP----- > _______________________________________________ > bitcoin-dev mailing list > bitcoin-dev@lists.linuxfoundation.org > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev > -- Best Regards / S pozdravom, Pavol "Stick" Rusnak Co-Founder, SatoshiLabs [-- Attachment #2: Type: text/html, Size: 66959 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-16 11:50 ` Pavol Rusnak @ 2023-02-16 13:49 ` Andrew Poelstra 2023-02-19 20:13 ` David A. Harding 0 siblings, 1 reply; 15+ messages in thread From: Andrew Poelstra @ 2023-02-16 13:49 UTC (permalink / raw) To: Pavol Rusnak, Bitcoin Protocol Discussion [-- Attachment #1: Type: text/plain, Size: 3397 bytes --] On Thu, Feb 16, 2023 at 12:50:12PM +0100, Pavol Rusnak via bitcoin-dev wrote: > Hi! > > The BIP states that its only advantage over SLIP-0039, which has been used > in production for nearly three years (in at at least 3 SW/HW wallet > implementations), is that it aims to be simple enough for hand computation. > However, the BIP also indicates that "details of hand computation are > outside the scope of this standard, and implementers do not need to be > concerned with this possibility." Therefore, I am curious about how > significant this advantage over SLIP-0039 really is. If hand computation is > not straightforward and there are no other substantial advantages over > SLIP-0039, I cannot help but feel that this BIP is simply a result of > not-invented-here syndrome, but please correct me if I am wrong. > In my view, the hand computation is actually the main benefit of this scheme. The process *is* straightforward, but tedious enough (and the security benefits obscure enough, though they really shouldn't be... "computers are opaque and untrustworthy" should be a common sentiment) that it's hard to expect more than a small absolute number of users to actually do it. But for the purpose of the *standard*, what is important is that it is possible to implement and use this within a normal hww workflow. This is important for hand-computing users who know that their coins will not die with them (since the 'standard' has fallen into obscurity), and important for "normal" users who have the option to seamlessly switch over to hand computation as the BTC price goes up or the world becomes scarier. For what it's worth, the draft lists several benefits over SLIP-0039. I agree that none of them are particularly strong [1], and even together they probably wouldn't meet the threshold to take the time to write a standard, but I assure you the motivation was not NIH :). > Keep in mind that the encoded shares in SLIP-0039 consist of exactly 200 or > 330 bits, both of which are divisible by 5. This makes it straightforward > to encode them as Bech32 strings. > This is true! And very convenient for people who may want to simply "layer on" the codex32 checksum/splitting logic onto their SLIP39 words. They can use a lookup table to do the conversion, spend years or whataever doing hand-computation on them, and then use a lookup table to go back. [1] One listed reason is that "a SLIP is not a BIP". I have heard people speculate that this is one reason SLIP-0039 is not nearly as widespread as BIP-0039, even though it is objectively a far better standard. I'm unsure whether I believe this, but "there is no other BIP" does seem like a good reason for BIP-0039's continued dominance. At the very least, it means that on BIP-0039 itself we have nothing that we could say "supercedes" or "is recommended instead of" the BIP. See https://github.com/bitcoin/bips/pull/1413 So it's something of an aside, but I think it would probably be good for the ecosystem (though maybe bad for this BIP's prospects :)) if you would request a BIP number for SLIP-0039. -- Andrew Poelstra Director of Research, Blockstream Email: apoelstra at wpsoftware.net Web: https://www.wpsoftware.net/andrew The sun is always shining in space -Justin Lewis-Webster [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-16 13:49 ` Andrew Poelstra @ 2023-02-19 20:13 ` David A. Harding 2023-02-19 22:12 ` Andrew Poelstra 2023-02-22 17:24 ` Russell O'Connor 0 siblings, 2 replies; 15+ messages in thread From: David A. Harding @ 2023-02-19 20:13 UTC (permalink / raw) To: Andrew Poelstra, Bitcoin Protocol Discussion On 2023-02-16 03:49, Andrew Poelstra via bitcoin-dev wrote: > the draft lists several benefits over SLIP-0039. The only benefit over SLIP39 that I see explicitly mentioned in the draft BIP is "simple enough for hand computation". In the FAQ[1] on the project's website, I see some additional reasons: | This scheme is essentially the same as SLIP39, with the following differences: | | - The checksum is longer, slightly stronger, and designed to be | computable by hand. | | - Our encoding is more compact, giving us room for a bit of more | metadata, which is also designed to be readable by hand. | | - Unlike SLIP39, we do not support passphrases or hardening of any | form. | | - Unlike SLIP39, we have no hardware wallet support. But we hope that | will change! From having perused the extended documentation myself, I think I would personally note the following differences. - Alphabet: Codex32 uses the bech32 alphabet rather than SLIP39's alphabet consisting of English words. The benefit to human-language words is easier memorization for those proficient in the particular language (in this case, SLIP39 only allows the use of English). A disadvantage, IMO, is that it encourages the practice of memorization (which does have a few advantages but also a lot of drawbacks). Interestingly, Codex32 addresses what I think is the main problems of memorization: difficult-to-prove successful recollection. Someone who wants to reliably keep seed-related material only in their head needs to practice recalling it on a regular basis, but for BIP39, SLIP39, Aezeed, etc... there's no way for them to confirm they successfully recalled it short of going through the entire recovery process; they probably just judge how confident they feel about the recollection and assume that feeling like they recalled it correctly is the same thing as recalling it correctly. Codex32 allows the individual to periodically perform their recollection on paper in a private room without electronics and use nothing but a pen and some loookup tables (or a paper device) to verify that they recalled the string correctly (and its checksum can help with correcting up to several errors, although you might need a computer for error location and correction assistance). - Hierarchy: Codex32 does not natively provide support for nested SSSS whereas SLIP39 does. E.g., in SLIP39, you can require 2-of-3 for {me, family, friends} where me is 2-of-3 {fire_safe, bank_safe, buried_in_woods}, family is 1-of-3 {alice, bob, carol}, and friends are 2-of-5 {d, e, f, g, h}. I assume you can do the same with Codex32 by using the share for one level as the secret for the next level, although this is not described in the protocol. - Versioning: Codex32's metadata can store version information for wallets that use implicit BIP32 paths (e.g. BIP44/49/84/86), although this would cut into the space available for users to set their own metadata and it is not specified in the draft BIP. SLIP39 also doesn't specify anything about implicit path versioning and, AFAICT, doesn't have any room to store such metadata without reducing seed entropy. - Plausible deniability dummy wallets: Codex32 doesn't support this; SLIP39 does. Much has been written by other people about whether dummy wallets are a good idea or not, with strong opinions on both sides, so maybe we can just leave it at that. --- When I first saw the post about this, it was unclear to me that it was a serious project, but I've become increasingly interested as I researched it. I'm not personally that interested in generating entropy from dice or encoding shares by hand---it's already imperative that I acquire a trustworthy computer and load it with trustworthy software in order to use my seed securely, so I might as well have it generate my seeds and my recovery codes for me. What really did catch my attention, but which was kind of buried in the project documentation, is the ability to verify the integrity of each share independently without using a computer. For example, if I store a share with some relative who lives thousands of kilometers away, I'll be able to take that share out of its tamper-evident bag on my annual holiday visit, verify that I can still read it accurately by validating its checksum, and put it into a new bag for another year. For this procedure, I don't need to bring copies of any of my other shares, allowing them (and my seed) to stay safe. --- I do have one question after watching an excellent video[2] about the motivation for this system. In the video, one of the threat models described is a disarrangement of the words in a metal backup system. The implication seems to be that this would be an accidental disarrangement, which obviously the Codex32 checksum would catch during periodic offline verification. But what about deliberate modification of a recovery code? For example, Bob doesn't keep his seed loaded on any computer; it only exists in Codex32 shares which Bob plans to combine together in 20 years when he retires, although he makes regular deposits to the pubkeys derived from the seed's master xpub. Mallory is able to obtain access to Bob's shares, allowing her to immediately steal his current funds---but also allowing her to replace them with similar-looking shares with the same metadata and valid checksums so that Bob continues making deposits to the wallet. I'm curious about whether there's a way to prevent this attack without otherwise compromising the properties of the code? For example, some extra data that Bob can carry around (or memorize) for verifying the shares haven't changed, but which is not otherwise needed for recovery (so there's no problem if it's lost). Thanks, -Dave [1] https://secretcodex32.com/faq/index.html [2] https://www.youtube.com/watch?v=kf48oPoiHX0&list=PLyOGyBytgcuQLi9DC5g88DOEGnqBDPmq1&index=2 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-19 20:13 ` David A. Harding @ 2023-02-19 22:12 ` Andrew Poelstra 2023-02-19 23:05 ` Christopher Allen ` (2 more replies) 2023-02-22 17:24 ` Russell O'Connor 1 sibling, 3 replies; 15+ messages in thread From: Andrew Poelstra @ 2023-02-19 22:12 UTC (permalink / raw) To: David A. Harding; +Cc: Bitcoin Protocol Discussion [-- Attachment #1: Type: text/plain, Size: 5425 bytes --] On Sun, Feb 19, 2023 at 10:13:33AM -1000, David A. Harding wrote: > On 2023-02-16 03:49, Andrew Poelstra via bitcoin-dev wrote: > > the draft lists several benefits over SLIP-0039. > > The only benefit over SLIP39 that I see explicitly mentioned in the > draft BIP is "simple enough for hand computation". In the FAQ[1] on the > project's website, I see some additional reasons: > Oh, you're right! I think we removed this text in one of our revisions. I'll see if it makes sense to put it back. > | This scheme is essentially the same as SLIP39, with the following > differences: > | > | - The checksum is longer, slightly stronger, and designed to be > | computable by hand. > | > | - Our encoding is more compact, giving us room for a bit of more > | metadata, which is also designed to be readable by hand. > | > | - Unlike SLIP39, we do not support passphrases or hardening of any > | form. > | > | - Unlike SLIP39, we have no hardware wallet support. But we hope that > | will change! > These are roughly the benefits -- a more compact encoding which is always a fixed width. We also list "not supporting features such as passphrases" as a benefit, for users who don't need/want this. > <snip> > > When I first saw the post about this, it was unclear to me that it was a > serious project, but I've become increasingly interested as I researched > it. I'm not personally that interested in generating entropy from dice > or encoding shares by hand---it's already imperative that I acquire a > trustworthy computer and load it with trustworthy software in order to > use my seed securely, so I might as well have it generate my seeds and my > recovery codes for me. > Yes, we've been a bit coy about how serious this project is, because on its face it's such a silly thing. But for my part, I've been using it for real coins for over a year and I consider it to be serious. > What really did catch my attention, but which was kind of buried in the > project documentation, is the ability to verify the integrity of each > share independently without using a computer. For example, if I store a > share with some relative who lives thousands of kilometers away, I'll be > able to take that share out of its tamper-evident bag on my annual > holiday visit, verify that I can still read it accurately by validating > its checksum, and put it into a new bag for another year. For this > procedure, I don't need to bring copies of any of my other shares, > allowing them (and my seed) to stay safe. > This is good feedback. I strongly agree that this is the big selling point for this -- that you can vet shares/seeds which *aren't* being actively used, without exposing them to the sorts of threats associated with active use. We should make this more prominent in the BIP motivation. > > I do have one question after watching an excellent video[2] about the > motivation for this system. In the video, one of the threat models > described is a disarrangement of the words in a metal backup system. > The implication seems to be that this would be an accidental > disarrangement, which obviously the Codex32 checksum would catch during > periodic offline verification. But what about deliberate modification > of a recovery code? For example, Bob doesn't keep his seed loaded on > any computer; it only exists in Codex32 shares which Bob plans to > combine together in 20 years when he retires, although he makes regular > deposits to the pubkeys derived from the seed's master xpub. Mallory is > able to obtain access to Bob's shares, allowing her to immediately steal > his current funds---but also allowing her to replace them with > similar-looking > shares with the same metadata and valid checksums so that Bob > continues making deposits to the wallet. > > I'm curious about whether there's a way to prevent this attack without > otherwise compromising the properties of the code? For example, some > extra data that Bob can carry around (or memorize) for verifying the > shares haven't changed, but which is not otherwise needed for recovery > (so there's no problem if it's lost). > Unfortunately not, as near as I can tell ... one way to think of this is that Alice can flip a lot of random tiles then "error correct" it to get a new valid, but incorrect, seed. So as long as we support error correction it'll be possible to wreck seeds in this way. It's actually even worse than this ... as long as there's a clearly defined "checksum" at the end of a share, Alice will be able to mangele tiles and then just re-compute the checksum at the end. So what we really need to prevent this is something like a MAC: where Bob has a secret value which gets input into the checksum somehow, which Alice can't create valid checksums without knowing. Unfortunately I don't see any way to do this with linear codes. With a hash-based "checksum" a la BIP39 it would definitely be possible, but of course, not hand-computable. BTW, to execute this attack Alice doesn't need to compromise *all* the shares. Just enough that Bob no longer has threshold-many un-tampered ones left. -- Andrew Poelstra Director of Research, Blockstream Email: apoelstra at wpsoftware.net Web: https://www.wpsoftware.net/andrew The sun is always shining in space -Justin Lewis-Webster [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-19 22:12 ` Andrew Poelstra @ 2023-02-19 23:05 ` Christopher Allen 2023-02-20 0:52 ` Russell O'Connor 2023-02-22 16:29 ` Peter Todd 2 siblings, 0 replies; 15+ messages in thread From: Christopher Allen @ 2023-02-19 23:05 UTC (permalink / raw) To: Andrew Poelstra, Bitcoin Protocol Discussion [-- Attachment #1: Type: text/plain, Size: 1061 bytes --] An easy but possibly very important tip: If you use only UPPER CASE alpha and numbers in codex32, and avoid most punctuation, it makes QR rendering of it significantly smaller. This is because the QR code to the ISO SPEC, when seeing lowercase, assumes the value is binary, then converts it to a two byte value. If instead, the codex32 is all upper, and it uses only the 45 allowed characters (see https://www.thonky.com/qr-code-tutorial/alphanumeric-table) , it will leave it single byte and try to compress it. Of course it doesn’t compress well, but that is OK because it at least didn’t double the size first. See our research on this topic at https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2020-003-uri-binary-compatibility.md A superior QR codec can do better (see our https://github.com/BlockchainCommons/QRCodeGenerator or https://www.nayuki.io/page/qr-code-generator-library) but many platforms and more basic QR codecs will double the size of the QR if you have any lower case. — Christopher Allen [-- Attachment #2: Type: text/html, Size: 1662 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-19 22:12 ` Andrew Poelstra 2023-02-19 23:05 ` Christopher Allen @ 2023-02-20 0:52 ` Russell O'Connor 2023-02-22 16:29 ` Peter Todd 2 siblings, 0 replies; 15+ messages in thread From: Russell O'Connor @ 2023-02-20 0:52 UTC (permalink / raw) To: Andrew Poelstra, David A. Harding, Bitcoin Protocol Discussion [-- Attachment #1: Type: text/plain, Size: 2268 bytes --] On Sun, Feb 19, 2023 at 5:13 PM Andrew Poelstra via bitcoin-dev < bitcoin-dev@lists.linuxfoundation.org> wrote: > On Sun, Feb 19, 2023 at 10:13:33AM -1000, David A. Harding wrote: > > I'm curious about whether there's a way to prevent this attack without > > otherwise compromising the properties of the code? For example, some > > extra data that Bob can carry around (or memorize) for verifying the > > shares haven't changed, but which is not otherwise needed for recovery > > (so there's no problem if it's lost). > > > > Unfortunately not, as near as I can tell ... one way to think of this is > that Alice can flip a lot of random tiles then "error correct" it to get > a new valid, but incorrect, seed. So as long as we support error > correction it'll be possible to wreck seeds in this way. > > It's actually even worse than this ... as long as there's a clearly > defined "checksum" at the end of a share, Alice will be able to mangele > tiles and then just re-compute the checksum at the end. > > So what we really need to prevent this is something like a MAC: where > Bob has a secret value which gets input into the checksum somehow, which > Alice can't create valid checksums without knowing. Unfortunately I > don't see any way to do this with linear codes. With a hash-based > "checksum" a la BIP39 it would definitely be possible, but of course, > not hand-computable. > Speaking off the cuff and as a non-cryptographer (i.e do NOT rush off and do this without any vetting) but Christopher Allen once referred me to an cypher tile set called LS47 <https://gitea.blesmrt.net/exa/ls47>. If we set aside the cypertext, I suspect we can form a MAC by recording some random initial tile configuration, running the LS47 algorithm, and recording the final tile configuration. These records are not sensitive as (presumably!) the share data is not recoverable from just knowing these two configurations. So one can keep these records with you, digitally sign them or whatever, and then take them to your share on a regular basis to rerun the LS47 algorithm to see if you still get the same final state from the initial state. Perhaps something more specific to Bech32 could be designed, but otherwise this (alleged) MAC process isn't Codex32 specific. [-- Attachment #2: Type: text/html, Size: 2972 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-19 22:12 ` Andrew Poelstra 2023-02-19 23:05 ` Christopher Allen 2023-02-20 0:52 ` Russell O'Connor @ 2023-02-22 16:29 ` Peter Todd 2023-02-22 19:01 ` Russell O'Connor 2 siblings, 1 reply; 15+ messages in thread From: Peter Todd @ 2023-02-22 16:29 UTC (permalink / raw) To: Andrew Poelstra, Bitcoin Protocol Discussion [-- Attachment #1: Type: text/plain, Size: 2820 bytes --] On Sun, Feb 19, 2023 at 10:12:51PM +0000, Andrew Poelstra via bitcoin-dev wrote: > > What really did catch my attention, but which was kind of buried in the > > project documentation, is the ability to verify the integrity of each > > share independently without using a computer. For example, if I store a > > share with some relative who lives thousands of kilometers away, I'll be > > able to take that share out of its tamper-evident bag on my annual > > holiday visit, verify that I can still read it accurately by validating > > its checksum, and put it into a new bag for another year. For this > > procedure, I don't need to bring copies of any of my other shares, > > allowing them (and my seed) to stay safe. > > > > This is good feedback. I strongly agree that this is the big selling > point for this -- that you can vet shares/seeds which *aren't* being > actively used, without exposing them to the sorts of threats associated > with active use. > > We should make this more prominent in the BIP motivation. I don't think that use-case is a good selling point. The checksum that Codex32 uses is much more complex than necessary if you are simply verifying a share by itself. A *much* simpler approach would be to use a simple mod N = 0 checksum, either by creating the seed such that each share passes, or by just storing an additional word/symbol with the seed in such a way that sum(words) mod N = 0 passes. This approach is not only possible to compute by hand with a word/symbol->number lookup table, and pen and paper or a calculator. It's so simple they could probably *remember* how to do it themselves. Secondly, if all shares have mod N checksums, it may be sufficient for everyone to write down the checksums of the *other* shares, to verify they are the correct ones and a different (otherwise correct) share hasn't accidentally been substituted. Indeed, with some brute forcing and small checksums, I'd expect it to be mathematically possible to generate Shamir's secret sharing shards such that every shard can share the *same* checksum. In which case the share verification procedure would be to simply ask every share holder to verify the checksum manually using the mod N procedure, and then verify that each share holder has the same checksum. This would be less error prone in terms of leaking information accidentally if the checksum was obviously *not* part of the share: eg by encoding the share with words, and the checksum with a number. Obviously, small checksums aren't fool proof. But we're probably better off creating a relatively easy procedure with a 1-in-1000 chance of an error going undetected than a complex procedure that people don't actually do at all. -- https://petertodd.org 'peter'[:-1]@petertodd.org [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 833 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-22 16:29 ` Peter Todd @ 2023-02-22 19:01 ` Russell O'Connor 2023-02-23 3:30 ` Russell O'Connor 0 siblings, 1 reply; 15+ messages in thread From: Russell O'Connor @ 2023-02-22 19:01 UTC (permalink / raw) To: Peter Todd, Bitcoin Protocol Discussion [-- Attachment #1: Type: text/plain, Size: 4140 bytes --] After some poking around at the math, I do see that the 13 character generator (for regular sized shares) is reasonably "smooth", having roots at T{11}, S{16}, and C{24}. This means we could build a "quick check" worksheet to evaluate the string modulo (x - T) to verify a 5 bit checksum, whose operation would be similar to the existing checksum worksheet in structure but significantly less work. Perhaps 5 bits is too short, and it is more reasonable working modulo (x - T)*(x - S) to get a 10 bit checksum. A worksheet for a 15 bit checksum is also an option, and possibly others well depending on the size of the other factors. I think this process is would be about as simple as any other comparable hand operated checksum over the bech32 alphabet would be. I haven't looked into the smoothness of the long generator for seeds that are greater than 400 bits. If it isn't smooth and smoother options are available, perhaps it should be changed. On Wed, Feb 22, 2023 at 11:29 AM Peter Todd via bitcoin-dev < bitcoin-dev@lists.linuxfoundation.org> wrote: > On Sun, Feb 19, 2023 at 10:12:51PM +0000, Andrew Poelstra via bitcoin-dev > wrote: > > > What really did catch my attention, but which was kind of buried in the > > > project documentation, is the ability to verify the integrity of each > > > share independently without using a computer. For example, if I store > a > > > share with some relative who lives thousands of kilometers away, I'll > be > > > able to take that share out of its tamper-evident bag on my annual > > > holiday visit, verify that I can still read it accurately by validating > > > its checksum, and put it into a new bag for another year. For this > > > procedure, I don't need to bring copies of any of my other shares, > > > allowing them (and my seed) to stay safe. > > > > > > > This is good feedback. I strongly agree that this is the big selling > > point for this -- that you can vet shares/seeds which *aren't* being > > actively used, without exposing them to the sorts of threats associated > > with active use. > > > > We should make this more prominent in the BIP motivation. > > I don't think that use-case is a good selling point. The checksum that > Codex32 > uses is much more complex than necessary if you are simply verifying a > share by > itself. > > A *much* simpler approach would be to use a simple mod N = 0 checksum, > either > by creating the seed such that each share passes, or by just storing an > additional word/symbol with the seed in such a way that sum(words) mod N = > 0 > passes. This approach is not only possible to compute by hand with a > word/symbol->number lookup table, and pen and paper or a calculator. It's > so > simple they could probably *remember* how to do it themselves. > > > Secondly, if all shares have mod N checksums, it may be sufficient for > everyone > to write down the checksums of the *other* shares, to verify they are the > correct ones and a different (otherwise correct) share hasn't accidentally > been > substituted. > > Indeed, with some brute forcing and small checksums, I'd expect it to be > mathematically possible to generate Shamir's secret sharing shards such > that > every shard can share the *same* checksum. In which case the share > verification > procedure would be to simply ask every share holder to verify the checksum > manually using the mod N procedure, and then verify that each share holder > has > the same checksum. This would be less error prone in terms of leaking > information accidentally if the checksum was obviously *not* part of the > share: > eg by encoding the share with words, and the checksum with a number. > > Obviously, small checksums aren't fool proof. But we're probably better off > creating a relatively easy procedure with a 1-in-1000 chance of an error > going > undetected than a complex procedure that people don't actually do at all. > > -- > https://petertodd.org 'peter'[:-1]@petertodd.org > _______________________________________________ > bitcoin-dev mailing list > bitcoin-dev@lists.linuxfoundation.org > https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev > [-- Attachment #2: Type: text/html, Size: 5183 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-22 19:01 ` Russell O'Connor @ 2023-02-23 3:30 ` Russell O'Connor 2023-02-23 16:36 ` Russell O'Connor 2023-02-23 18:26 ` Russell O'Connor 0 siblings, 2 replies; 15+ messages in thread From: Russell O'Connor @ 2023-02-23 3:30 UTC (permalink / raw) To: Bitcoin Protocol Discussion [-- Attachment #1: Type: text/plain, Size: 7282 bytes --] After some consultation, I now see that generators for all degree 2 BCH codes, such as ours, are smooth and factor into quadratic and linear components. Anyhow the upshot of all this is that you can perform a "quickcheck" verification of the codex32 strings for whatever size of verification you want to do, 1 character, 2 characters, 3 characters, upto the full 13 characters. Each of these partial verifications will have BCH properties. A 1 character quickchecksum will guarantee to detect any 1 character error. A 3 character quickchecksum will guarantee to detect any 2 character error, etc. There remains a 1 in 32^n chance of failing to detect larger numbers of errors where n is the size of your quickcheck. To illustrate, let's consider a quickcheck of size 2. This can detect any 1 character error and will only have a 1/1024 chance of failing to detect other random errors. Let's take the second test vector as our example: " MS12NAMEA320ZYXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM" You start in a specified initial state with a pair of bech32 characters. For MS1 strings and a size 2 quickcheck it would be the pair of Bech32 characters 'AS'. Next we "add" the first character after the prefix, which is '2' by using the addition volvelle or lookup table. "Adding" '2' to 'S' yields '6' and our state becomes 'A6'. Next we have to look up 'A6' in a lookup table. This table is too big to fit in the margin of this email, so I will have to omit it. But it would have an entry mapping 'A6' -> 'QM'. Our state becomes 'QM' From this point we have an even number of remaining characters in the input string and we can work 2 characters at a time. We "add" the next two data characters from our string, which are 'NA'. Again, using the volvelle or lookup table we get that adding 'N' to 'Q' yields 'N', and adding 'A' to 'M' yields 'X'. So our state is now 'NX' Next we look up 'NX' in this table I haven't given you and we will find an entry mapping 'NX' -> 'DX', making 'DX' our new state. We keep repeating this process alternating between adding pairs of characters and using this unstated lookup table all the way until the end where we will reach a final state which will be 'H9'. If you follow this procedure with any string (upto 400 bit master seeds) you will always end up in the state 'H9'. A specialized worksheet would help guide the process making the process easier to follow. This process is somewhat close to Peter Todd's suggestion of using a lookup table on "words", which in this case would be pairs of bech32 characters, and adding values together. The catch is that the addition is done with Bech32 addition rather than calculator addition, which I accept is a moderately large catch. Anyhow, the point is that you can do this sort of partial verification by hand to whatever degree you like, if you are in a rush and are willing to accept larger chances of failing to catch random errors. On Wed, Feb 22, 2023 at 2:01 PM Russell O'Connor <roconnor@blockstream.com> wrote: > After some poking around at the math, I do see that the 13 character > generator (for regular sized shares) is reasonably "smooth", having roots > at T{11}, S{16}, and C{24}. > > This means we could build a "quick check" worksheet to evaluate the string > modulo (x - T) to verify a 5 bit checksum, whose operation would be similar > to the existing checksum worksheet in structure but significantly less work. > > Perhaps 5 bits is too short, and it is more reasonable working modulo (x - > T)*(x - S) to get a 10 bit checksum. A worksheet for a 15 bit checksum is > also an option, and possibly others well depending on the size of the other > factors. I think this process is would be about as simple as any other > comparable hand operated checksum over the bech32 alphabet would be. > > I haven't looked into the smoothness of the long generator for seeds that > are greater than 400 bits. If it isn't smooth and smoother options are > available, perhaps it should be changed. > > On Wed, Feb 22, 2023 at 11:29 AM Peter Todd via bitcoin-dev < > bitcoin-dev@lists.linuxfoundation.org> wrote: > >> On Sun, Feb 19, 2023 at 10:12:51PM +0000, Andrew Poelstra via bitcoin-dev >> wrote: >> > > What really did catch my attention, but which was kind of buried in >> the >> > > project documentation, is the ability to verify the integrity of each >> > > share independently without using a computer. For example, if I >> store a >> > > share with some relative who lives thousands of kilometers away, I'll >> be >> > > able to take that share out of its tamper-evident bag on my annual >> > > holiday visit, verify that I can still read it accurately by >> validating >> > > its checksum, and put it into a new bag for another year. For this >> > > procedure, I don't need to bring copies of any of my other shares, >> > > allowing them (and my seed) to stay safe. >> > > >> > >> > This is good feedback. I strongly agree that this is the big selling >> > point for this -- that you can vet shares/seeds which *aren't* being >> > actively used, without exposing them to the sorts of threats associated >> > with active use. >> > >> > We should make this more prominent in the BIP motivation. >> >> I don't think that use-case is a good selling point. The checksum that >> Codex32 >> uses is much more complex than necessary if you are simply verifying a >> share by >> itself. >> >> A *much* simpler approach would be to use a simple mod N = 0 checksum, >> either >> by creating the seed such that each share passes, or by just storing an >> additional word/symbol with the seed in such a way that sum(words) mod N >> = 0 >> passes. This approach is not only possible to compute by hand with a >> word/symbol->number lookup table, and pen and paper or a calculator. It's >> so >> simple they could probably *remember* how to do it themselves. >> >> >> Secondly, if all shares have mod N checksums, it may be sufficient for >> everyone >> to write down the checksums of the *other* shares, to verify they are the >> correct ones and a different (otherwise correct) share hasn't >> accidentally been >> substituted. >> >> Indeed, with some brute forcing and small checksums, I'd expect it to be >> mathematically possible to generate Shamir's secret sharing shards such >> that >> every shard can share the *same* checksum. In which case the share >> verification >> procedure would be to simply ask every share holder to verify the checksum >> manually using the mod N procedure, and then verify that each share >> holder has >> the same checksum. This would be less error prone in terms of leaking >> information accidentally if the checksum was obviously *not* part of the >> share: >> eg by encoding the share with words, and the checksum with a number. >> >> Obviously, small checksums aren't fool proof. But we're probably better >> off >> creating a relatively easy procedure with a 1-in-1000 chance of an error >> going >> undetected than a complex procedure that people don't actually do at all. >> >> -- >> https://petertodd.org 'peter'[:-1]@petertodd.org >> _______________________________________________ >> bitcoin-dev mailing list >> bitcoin-dev@lists.linuxfoundation.org >> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev >> > [-- Attachment #2: Type: text/html, Size: 9401 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-23 3:30 ` Russell O'Connor @ 2023-02-23 16:36 ` Russell O'Connor 2023-02-23 18:26 ` Russell O'Connor 1 sibling, 0 replies; 15+ messages in thread From: Russell O'Connor @ 2023-02-23 16:36 UTC (permalink / raw) To: Bitcoin Protocol Discussion [-- Attachment #1: Type: text/plain, Size: 10192 bytes --] Sorry for the repeated replies, but I would like to make one more remark regarding the 1 character "quick check". Because the 1 character "quick check" state is so small, the procedure becomes simplified to just using a single table. You start with the specified initial state, which would be the bech32 character '9', and then you have a lookup mapping (<current state>, <next input character>) -> <next state>. You go through the table for each character after the prefix, updating the state as you go along. ('9','2') -> '0', then ('0','N') -> '4', and so on until you reach the final state which should be '5'. If you like volvelles, one could be designed to implement this lookup table. However, I do want to note that an adjustment could be made to the codex32 generator so that this 1 character "quick check" table would become identical to the Bech32 addition table. In other words the 1 character quick check would become the same as adding up all the characters and checking that you get the required final constant. If this change were free to make, I would probably make it. However such an adjustment would come at a cost. The current generator was chosen to have three identical coefficients in a row (you can find the generator in the appendix of the draft BIP). This special property slightly eases the computation needed when creating checksums by hand (no compromise to the quality of the checksum itself). If we made the above adjustment to the codex32 generator, we would lose this property of having three identical coefficients in a row. Therefore, I am pretty hesitant to make this adjustment. Firstly the 1 character quick check is simply too small to be safely used. While it does guarantee to detect single character errors, it has a 1 in 32 chance of failing to detect more errors. I think a 3% failure rate is pretty bad, and would definitely recommend people performing quick checks use a minimum size of 2 (which has a 0.1% failure rate). Secondly the difference between using the addition table/volvelle versus a specific table/volvelle for the purpose of performing 1 character quick checks (which you ought not to be doing anyways) is pretty minimal. The addition table is possibly slightly less error prone to use because it is symmetric, but other than that the amount of work to do is pretty much the same either way. My conclusion is that it isn't worth compromising the ease of hand computation for the sake of possibly making a too-low-quality-checksum-that-no-one-should-be-using slightly more convenient, but I thought I should mention it at least. On Wed, Feb 22, 2023 at 10:30 PM Russell O'Connor <roconnor@blockstream.com> wrote: > After some consultation, I now see that generators for all degree 2 BCH > codes, such as ours, are smooth and factor into quadratic and linear > components. > > Anyhow the upshot of all this is that you can perform a "quickcheck" > verification of the codex32 strings for whatever size of verification you > want to do, 1 character, 2 characters, 3 characters, upto the full 13 > characters. Each of these partial verifications will have BCH properties. > A 1 character quickchecksum will guarantee to detect any 1 character > error. A 3 character quickchecksum will guarantee to detect any 2 > character error, etc. There remains a 1 in 32^n chance of failing to > detect larger numbers of errors where n is the size of your quickcheck. > > To illustrate, let's consider a quickcheck of size 2. This can detect any > 1 character error and will only have a 1/1024 chance of failing to detect > other random errors. Let's take the second test vector as our example: " > MS12NAMEA320ZYXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM" > > You start in a specified initial state with a pair of bech32 characters. > For MS1 strings and a size 2 quickcheck it would be the pair of Bech32 > characters 'AS'. > > Next we "add" the first character after the prefix, which is '2' by using > the addition volvelle or lookup table. "Adding" '2' to 'S' yields '6' and > our state becomes 'A6'. > > Next we have to look up 'A6' in a lookup table. This table is too big to > fit in the margin of this email, so I will have to omit it. But it would > have an entry mapping 'A6' -> 'QM'. Our state becomes 'QM' > > From this point we have an even number of remaining characters in the > input string and we can work 2 characters at a time. We "add" the next two > data characters from our string, which are 'NA'. Again, using the volvelle > or lookup table we get that adding 'N' to 'Q' yields 'N', and adding 'A' to > 'M' yields 'X'. So our state is now 'NX' > > Next we look up 'NX' in this table I haven't given you and we will find an > entry mapping 'NX' -> 'DX', making 'DX' our new state. > > We keep repeating this process alternating between adding pairs of > characters and using this unstated lookup table all the way until the end > where we will reach a final state which will be 'H9'. > > If you follow this procedure with any string (upto 400 bit master seeds) > you will always end up in the state 'H9'. > > A specialized worksheet would help guide the process making the process > easier to follow. > > > This process is somewhat close to Peter Todd's suggestion of using a > lookup table on "words", which in this case would be pairs of bech32 > characters, and adding values together. The catch is that the addition is > done with Bech32 addition rather than calculator addition, which I accept > is a moderately large catch. > > Anyhow, the point is that you can do this sort of partial verification by > hand to whatever degree you like, if you are in a rush and are willing to > accept larger chances of failing to catch random errors. > > > On Wed, Feb 22, 2023 at 2:01 PM Russell O'Connor <roconnor@blockstream.com> > wrote: > >> After some poking around at the math, I do see that the 13 character >> generator (for regular sized shares) is reasonably "smooth", having roots >> at T{11}, S{16}, and C{24}. >> >> This means we could build a "quick check" worksheet to evaluate the >> string modulo (x - T) to verify a 5 bit checksum, whose operation would be >> similar to the existing checksum worksheet in structure but significantly >> less work. >> >> Perhaps 5 bits is too short, and it is more reasonable working modulo (x >> - T)*(x - S) to get a 10 bit checksum. A worksheet for a 15 bit checksum >> is also an option, and possibly others well depending on the size of the >> other factors. I think this process is would be about as simple as any >> other comparable hand operated checksum over the bech32 alphabet would be. >> >> I haven't looked into the smoothness of the long generator for seeds that >> are greater than 400 bits. If it isn't smooth and smoother options are >> available, perhaps it should be changed. >> >> On Wed, Feb 22, 2023 at 11:29 AM Peter Todd via bitcoin-dev < >> bitcoin-dev@lists.linuxfoundation.org> wrote: >> >>> On Sun, Feb 19, 2023 at 10:12:51PM +0000, Andrew Poelstra via >>> bitcoin-dev wrote: >>> > > What really did catch my attention, but which was kind of buried in >>> the >>> > > project documentation, is the ability to verify the integrity of each >>> > > share independently without using a computer. For example, if I >>> store a >>> > > share with some relative who lives thousands of kilometers away, >>> I'll be >>> > > able to take that share out of its tamper-evident bag on my annual >>> > > holiday visit, verify that I can still read it accurately by >>> validating >>> > > its checksum, and put it into a new bag for another year. For this >>> > > procedure, I don't need to bring copies of any of my other shares, >>> > > allowing them (and my seed) to stay safe. >>> > > >>> > >>> > This is good feedback. I strongly agree that this is the big selling >>> > point for this -- that you can vet shares/seeds which *aren't* being >>> > actively used, without exposing them to the sorts of threats associated >>> > with active use. >>> > >>> > We should make this more prominent in the BIP motivation. >>> >>> I don't think that use-case is a good selling point. The checksum that >>> Codex32 >>> uses is much more complex than necessary if you are simply verifying a >>> share by >>> itself. >>> >>> A *much* simpler approach would be to use a simple mod N = 0 checksum, >>> either >>> by creating the seed such that each share passes, or by just storing an >>> additional word/symbol with the seed in such a way that sum(words) mod N >>> = 0 >>> passes. This approach is not only possible to compute by hand with a >>> word/symbol->number lookup table, and pen and paper or a calculator. >>> It's so >>> simple they could probably *remember* how to do it themselves. >>> >>> >>> Secondly, if all shares have mod N checksums, it may be sufficient for >>> everyone >>> to write down the checksums of the *other* shares, to verify they are the >>> correct ones and a different (otherwise correct) share hasn't >>> accidentally been >>> substituted. >>> >>> Indeed, with some brute forcing and small checksums, I'd expect it to be >>> mathematically possible to generate Shamir's secret sharing shards such >>> that >>> every shard can share the *same* checksum. In which case the share >>> verification >>> procedure would be to simply ask every share holder to verify the >>> checksum >>> manually using the mod N procedure, and then verify that each share >>> holder has >>> the same checksum. This would be less error prone in terms of leaking >>> information accidentally if the checksum was obviously *not* part of the >>> share: >>> eg by encoding the share with words, and the checksum with a number. >>> >>> Obviously, small checksums aren't fool proof. But we're probably better >>> off >>> creating a relatively easy procedure with a 1-in-1000 chance of an error >>> going >>> undetected than a complex procedure that people don't actually do at all. >>> >>> -- >>> https://petertodd.org 'peter'[:-1]@petertodd.org >>> _______________________________________________ >>> bitcoin-dev mailing list >>> bitcoin-dev@lists.linuxfoundation.org >>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev >>> >> [-- Attachment #2: Type: text/html, Size: 12511 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-23 3:30 ` Russell O'Connor 2023-02-23 16:36 ` Russell O'Connor @ 2023-02-23 18:26 ` Russell O'Connor 1 sibling, 0 replies; 15+ messages in thread From: Russell O'Connor @ 2023-02-23 18:26 UTC (permalink / raw) To: Bitcoin Protocol Discussion [-- Attachment #1: Type: text/plain, Size: 9867 bytes --] One more thing (Again apologies. This idea of doing partial verification is novel to me, and I see now that I should have just waited to give a consolidated reply). Focusing in on the example of performing 2 character quick checks. There are 7 different ways of building the table used in this quick check verification process (actually there are 8, but we only need 7 of them for our purposes here). Fun fact: If you perform the 2 character quick check in all 7 different ways, this is equivalent to doing a full checksum verification. This suggests a strategy of visiting your shares on a regular basis and performing a different 2 character quick check each time, rotating through the 7 different ways of performing it. Moreover, these 7 different 2 character quick checks come with a prescribed order that will accumulate BCH guarantees as you go along. Assuming the string isn't changing between visits then * After the 1st table you are guaranteed to detect any 1 character error. * After the 2nd table you are guaranteed to detect any 2 character error. * After the 3rd table you are guaranteed to detect any 4 character error. * After the 4th table you are guaranteed to detect any 5 character error. * After the 5th table you are guaranteed to detect any 6 character error. * After the 6th table you are guaranteed to detect any 7 character error. * After the 7th table you are guaranteed to detect any 8 character error, which is the guarantee of the full 13 character checksum. You are also guaranteed that the full 13 character checksum is now correct. You could perform the checks out of order, and that is okay. You will eventually reach these BCH levels of guarantees, just not as quickly as if you follow the prescribed order. Of course, doing a series of 7 different 2 character quick checks is overall more work than doing the full 13 character checksum validation. But there is certainly an advantage in spreading the work out over time. Each time you visit you still have the guarantee of catching any new 1 character error introduced since the last time you visited and a 99.9% chance of catching any other random errors introduced since your last visit. Personally I am likely to follow such a validation strategy myself, now that I am aware of it. On Wed, Feb 22, 2023 at 10:30 PM Russell O'Connor <roconnor@blockstream.com> wrote: > After some consultation, I now see that generators for all degree 2 BCH > codes, such as ours, are smooth and factor into quadratic and linear > components. > > Anyhow the upshot of all this is that you can perform a "quickcheck" > verification of the codex32 strings for whatever size of verification you > want to do, 1 character, 2 characters, 3 characters, upto the full 13 > characters. Each of these partial verifications will have BCH properties. > A 1 character quickchecksum will guarantee to detect any 1 character > error. A 3 character quickchecksum will guarantee to detect any 2 > character error, etc. There remains a 1 in 32^n chance of failing to > detect larger numbers of errors where n is the size of your quickcheck. > > To illustrate, let's consider a quickcheck of size 2. This can detect any > 1 character error and will only have a 1/1024 chance of failing to detect > other random errors. Let's take the second test vector as our example: " > MS12NAMEA320ZYXWVUTSRQPNMLKJHGFEDCAXRPP870HKKQRM" > > You start in a specified initial state with a pair of bech32 characters. > For MS1 strings and a size 2 quickcheck it would be the pair of Bech32 > characters 'AS'. > > Next we "add" the first character after the prefix, which is '2' by using > the addition volvelle or lookup table. "Adding" '2' to 'S' yields '6' and > our state becomes 'A6'. > > Next we have to look up 'A6' in a lookup table. This table is too big to > fit in the margin of this email, so I will have to omit it. But it would > have an entry mapping 'A6' -> 'QM'. Our state becomes 'QM' > > From this point we have an even number of remaining characters in the > input string and we can work 2 characters at a time. We "add" the next two > data characters from our string, which are 'NA'. Again, using the volvelle > or lookup table we get that adding 'N' to 'Q' yields 'N', and adding 'A' to > 'M' yields 'X'. So our state is now 'NX' > > Next we look up 'NX' in this table I haven't given you and we will find an > entry mapping 'NX' -> 'DX', making 'DX' our new state. > > We keep repeating this process alternating between adding pairs of > characters and using this unstated lookup table all the way until the end > where we will reach a final state which will be 'H9'. > > If you follow this procedure with any string (upto 400 bit master seeds) > you will always end up in the state 'H9'. > > A specialized worksheet would help guide the process making the process > easier to follow. > > > This process is somewhat close to Peter Todd's suggestion of using a > lookup table on "words", which in this case would be pairs of bech32 > characters, and adding values together. The catch is that the addition is > done with Bech32 addition rather than calculator addition, which I accept > is a moderately large catch. > > Anyhow, the point is that you can do this sort of partial verification by > hand to whatever degree you like, if you are in a rush and are willing to > accept larger chances of failing to catch random errors. > > > On Wed, Feb 22, 2023 at 2:01 PM Russell O'Connor <roconnor@blockstream.com> > wrote: > >> After some poking around at the math, I do see that the 13 character >> generator (for regular sized shares) is reasonably "smooth", having roots >> at T{11}, S{16}, and C{24}. >> >> This means we could build a "quick check" worksheet to evaluate the >> string modulo (x - T) to verify a 5 bit checksum, whose operation would be >> similar to the existing checksum worksheet in structure but significantly >> less work. >> >> Perhaps 5 bits is too short, and it is more reasonable working modulo (x >> - T)*(x - S) to get a 10 bit checksum. A worksheet for a 15 bit checksum >> is also an option, and possibly others well depending on the size of the >> other factors. I think this process is would be about as simple as any >> other comparable hand operated checksum over the bech32 alphabet would be. >> >> I haven't looked into the smoothness of the long generator for seeds that >> are greater than 400 bits. If it isn't smooth and smoother options are >> available, perhaps it should be changed. >> >> On Wed, Feb 22, 2023 at 11:29 AM Peter Todd via bitcoin-dev < >> bitcoin-dev@lists.linuxfoundation.org> wrote: >> >>> On Sun, Feb 19, 2023 at 10:12:51PM +0000, Andrew Poelstra via >>> bitcoin-dev wrote: >>> > > What really did catch my attention, but which was kind of buried in >>> the >>> > > project documentation, is the ability to verify the integrity of each >>> > > share independently without using a computer. For example, if I >>> store a >>> > > share with some relative who lives thousands of kilometers away, >>> I'll be >>> > > able to take that share out of its tamper-evident bag on my annual >>> > > holiday visit, verify that I can still read it accurately by >>> validating >>> > > its checksum, and put it into a new bag for another year. For this >>> > > procedure, I don't need to bring copies of any of my other shares, >>> > > allowing them (and my seed) to stay safe. >>> > > >>> > >>> > This is good feedback. I strongly agree that this is the big selling >>> > point for this -- that you can vet shares/seeds which *aren't* being >>> > actively used, without exposing them to the sorts of threats associated >>> > with active use. >>> > >>> > We should make this more prominent in the BIP motivation. >>> >>> I don't think that use-case is a good selling point. The checksum that >>> Codex32 >>> uses is much more complex than necessary if you are simply verifying a >>> share by >>> itself. >>> >>> A *much* simpler approach would be to use a simple mod N = 0 checksum, >>> either >>> by creating the seed such that each share passes, or by just storing an >>> additional word/symbol with the seed in such a way that sum(words) mod N >>> = 0 >>> passes. This approach is not only possible to compute by hand with a >>> word/symbol->number lookup table, and pen and paper or a calculator. >>> It's so >>> simple they could probably *remember* how to do it themselves. >>> >>> >>> Secondly, if all shares have mod N checksums, it may be sufficient for >>> everyone >>> to write down the checksums of the *other* shares, to verify they are the >>> correct ones and a different (otherwise correct) share hasn't >>> accidentally been >>> substituted. >>> >>> Indeed, with some brute forcing and small checksums, I'd expect it to be >>> mathematically possible to generate Shamir's secret sharing shards such >>> that >>> every shard can share the *same* checksum. In which case the share >>> verification >>> procedure would be to simply ask every share holder to verify the >>> checksum >>> manually using the mod N procedure, and then verify that each share >>> holder has >>> the same checksum. This would be less error prone in terms of leaking >>> information accidentally if the checksum was obviously *not* part of the >>> share: >>> eg by encoding the share with words, and the checksum with a number. >>> >>> Obviously, small checksums aren't fool proof. But we're probably better >>> off >>> creating a relatively easy procedure with a 1-in-1000 chance of an error >>> going >>> undetected than a complex procedure that people don't actually do at all. >>> >>> -- >>> https://petertodd.org 'peter'[:-1]@petertodd.org >>> _______________________________________________ >>> bitcoin-dev mailing list >>> bitcoin-dev@lists.linuxfoundation.org >>> https://lists.linuxfoundation.org/mailman/listinfo/bitcoin-dev >>> >> [-- Attachment #2: Type: text/html, Size: 12197 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-19 20:13 ` David A. Harding 2023-02-19 22:12 ` Andrew Poelstra @ 2023-02-22 17:24 ` Russell O'Connor 1 sibling, 0 replies; 15+ messages in thread From: Russell O'Connor @ 2023-02-22 17:24 UTC (permalink / raw) To: David A. Harding, Bitcoin Protocol Discussion [-- Attachment #1: Type: text/plain, Size: 6926 bytes --] On Sun, Feb 19, 2023 at 3:13 PM David A. Harding via bitcoin-dev < bitcoin-dev@lists.linuxfoundation.org> wrote: > Codex32 allows the individual to periodically perform their > recollection on paper in a private room without electronics and use > nothing but a pen and some loookup tables (or a paper device) to > verify that they recalled the string correctly (and its checksum can > help with correcting up to several errors, although you might need a > computer for error location and correction assistance). > While perhaps not entirely impossible, doing error correction by hand is far more difficult than just the error detection process. Fortunately, error correction can be aided by a computer needing very little trust. When doing hand computation of the checksum, there is an error if the final "residue" is different from the value given in the codex32 spec. After double checking your hand computation by redoing it, if you still get the same incorrect residue, there is an error in your share data somewhere. What you can do is plug in the residue that you did compute into a computer, and have it produce an error correction string. You can then, by hand, add this error correction string to your share to get a corrected share. If it were the case that all types of character errors were equally likely (as in during an error any character is equally likely to be transformed into any other character), then the computer would gain zero information about your actual share data. Of course, it is not in fact the case that all transcription errors are equally likely, and so the computer can learn a little bit about your share. The fewer errors that are made, the less data it can recover. If you only have one character in error, then 5 bits is the most data it can recover, and that is assuming that it can somehow infer your error perfectly from the delta of the error correction, which isn't actually going to be the case. Of course, a single share from a split secret has no information about your master seed (the master seed is hidden in the correlation between different shares). So learning partial information about one share isn't going to be enough by itself to even begin compromising your master key. This all still needs to be written up in more detail, but I figured I would give a preview here. - Hierarchy: Codex32 does not natively provide support for nested SSSS > whereas SLIP39 does. E.g., in SLIP39, you can require 2-of-3 for > {me, family, friends} where me is 2-of-3 {fire_safe, bank_safe, > buried_in_woods}, family is 1-of-3 {alice, bob, carol}, and friends > are 2-of-5 {d, e, f, g, h}. I assume you can do the same with Codex32 > by using the share for one level as the secret for the next level, > although this is not described in the protocol. > There was a design for a second level share scheme floating around somewhere. I'll have to dig it up. As I recall this is made slightly more complicated by needing to incorporate share metadata (i.e. the share index) when doing a second split, but it seemed doable at the time. > - Versioning: Codex32's metadata can store version information for > wallets that use implicit BIP32 paths (e.g. BIP44/49/84/86), although > this would cut into the space available for users to set their own > metadata and it is not specified in the draft BIP. SLIP39 also > doesn't specify anything about implicit path versioning and, AFAICT, > doesn't have any room to store such metadata without reducing seed > entropy. > Personally, I don't consider the derivation path / descriptor data as that sensitive, and I would recommend making wide backups of that data. It certainly would make sense to store descriptor data alongside wherever you keep your shares, and more places beyond that. On the other hand, if you are trying to keep your shares innocuous somehow, perhaps you won't be able to keep the descriptor data alongside your shares. When I first saw the post about this, it was unclear to me that it was a > serious project, but I've become increasingly interested as I researched > it. I'm not personally that interested in generating entropy from dice > or encoding shares by hand---it's already imperative that I acquire a > trustworthy computer and load it with trustworthy software in order to > use my seed securely, so I might as well have it generate my seeds and > my > recovery codes for me. > I do think hardware wallets are great, and overall provide a lot of practical protection. What is notable is that once the secrets are loaded onto a hardware wallet, as long as that wallet remains isolated, it cannot leak any secrets. Of course, a wallet needs to interact with the Bitcoin protocol and P2P network, at least indirectly, in order to function, so we must break that isolation. However, if we can limit the communication capabilities of a hardware wallet, even a malicious wallet shouldn't be able to leak the secret data. There are three main vectors a hardware wallet can try to communicate that I am aware of: 1. Compromise at seed/master secret (or in this case shares of the master secret) during generation time. 2. Lie about public keys at public key generation time. 3. Exfiltrate data though signature data or otherwise during signature generation time. #3 is largely mitigated through using anti-exfil signing and air gapping the hardware wallet (e.g. using QR codes, or using RS-232 <https://en.wikipedia.org/wiki/RS-232> if you consider that air gaping). Using multiple wallets from different vendors doing deterministic signing is another possibility, but I consider the method deprecated in favour of anti-exfil. Addressing #1 is where codex32 lies, by taking master secret handling functions out of the hardware wallet. My understanding is that it is difficult for digital hardware, which tries very hard to be deterministic, to generate randomness, especially for isolated hardware devices like hardware wallets. Also it is hard for hardware to tell if their hardware random number generator is broken. Generally I don't trust small pieces of isolated digital hardware to generate master seeds, even when they are not malicious. This may just be due to my ignorance of how they operate. #2 seems to be the hardest issue to address. My current thinking is to generate addresses on a diverse set of hardware and/or using public derivations. Perhaps an alternative to BIP-32 could be designed to make it easy to generate zero-knowledge proofs of pubkey derivations. Of course there are other ways for a hardware wallets to exfiltrate data: The wallet can blink lights, make noise, or it can try to use radio broadcasts, etc. These other attack vectors seem to require physically local support, which is a fairly big hurdle to overcome. I suppose even these vectors could be mitigated through various levels of tinfoil. [-- Attachment #2: Type: text/html, Size: 8412 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-16 2:16 [bitcoin-dev] Codex32 Russell O'Connor 2023-02-16 11:50 ` Pavol Rusnak @ 2023-02-20 18:44 ` Andrew Poelstra 2023-02-20 18:48 ` Pavol Rusnak 1 sibling, 1 reply; 15+ messages in thread From: Andrew Poelstra @ 2023-02-20 18:44 UTC (permalink / raw) To: Russell O'Connor, Bitcoin Protocol Discussion [-- Attachment #1: Type: text/plain, Size: 1340 bytes --] On Wed, Feb 15, 2023 at 09:16:02PM -0500, Russell O'Connor via bitcoin-dev wrote: > I've been asked by Dr. Curr and Professor Snead to forward this message to > this mailing list, as it may be of general interest to Bitcoin users. > > <snip> > I have opened a PR to the BIPs repo for this scheme: https://github.com/bitcoin/bips/pull/1425 Thanks very much to Pavol Rusnak, David Harding, and Christopher Allen for their comments on this list. We've updated the draft text to try to address your concerns. In particular: * Added more text to "Motivation" distinguishing the scheme from SLIP-39 * Added more details to "Rationale" about error correction capabilities of the code, and to explain that the code does not defend against malicious errors * Added a note to use uppercase for QR codes * Rearranged and clarified the "creating shares" instructions * Added text about hand-computation, in particular hand-computation of share verification, to "Motivation". If any of you would like to be listed under Acknowledgements, please let me know here or on the PR. Cheers Andrew -- Andrew Poelstra Director of Research, Blockstream Email: apoelstra at wpsoftware.net Web: https://www.wpsoftware.net/andrew The sun is always shining in space -Justin Lewis-Webster [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [bitcoin-dev] Codex32 2023-02-20 18:44 ` Andrew Poelstra @ 2023-02-20 18:48 ` Pavol Rusnak 0 siblings, 0 replies; 15+ messages in thread From: Pavol Rusnak @ 2023-02-20 18:48 UTC (permalink / raw) To: Andrew Poelstra, Bitcoin Protocol Discussion [-- Attachment #1: Type: text/plain, Size: 194 bytes --] Thanks Andrew! New draft makes it much more obvious what are the differences between Codex32 and SLIP-0039 scheme. -- Best Regards / S pozdravom, Pavol "Stick" Rusnak Co-Founder, SatoshiLabs [-- Attachment #2: Type: text/html, Size: 462 bytes --] ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2023-02-23 18:26 UTC | newest] Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2023-02-16 2:16 [bitcoin-dev] Codex32 Russell O'Connor 2023-02-16 11:50 ` Pavol Rusnak 2023-02-16 13:49 ` Andrew Poelstra 2023-02-19 20:13 ` David A. Harding 2023-02-19 22:12 ` Andrew Poelstra 2023-02-19 23:05 ` Christopher Allen 2023-02-20 0:52 ` Russell O'Connor 2023-02-22 16:29 ` Peter Todd 2023-02-22 19:01 ` Russell O'Connor 2023-02-23 3:30 ` Russell O'Connor 2023-02-23 16:36 ` Russell O'Connor 2023-02-23 18:26 ` Russell O'Connor 2023-02-22 17:24 ` Russell O'Connor 2023-02-20 18:44 ` Andrew Poelstra 2023-02-20 18:48 ` Pavol Rusnak
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox