Use 33-byte compressed keys instead of 32-byte x-only keys, except when creating the outputs
Negate the private key if necessary when spending taproot outputs
If the receiving wallet uses labels, add a step to also check the negated output
Add a versioning scheme for silent payment addresses

For those interested, the implementation for Bitcoin Core has been updated to reflect these changes here: https://github.com/bitcoin/bitcoin/pull/27827 For convenience, and in the hopes of soliciting another round of review, the full specification is posted below. The full text of the BIP, including the overview, test vectors and appendix on light client support, can be found here: https://github.com/bitcoin/bips/pull/1458

Cheers!

-- Ruben, Josie

== Specification ==

We use the following functions and conventions:

outpoint (36 bytes): the COutPoint of an input (32-byte txid, least significant byte first || 4-byte vout, least significant byte first)[6]
sortoutpoints(v): sorts a vector v of outpoints in ascending order by doing a byte by byte comparison lexicographically.
ser32(i): serializes a 32-bit unsigned integer i as a 4-byte sequence, most significant byte first.
ser256(p): serializes the integer p as a 32-byte sequence, most significant byte first.
serP(P): serializes the coordinate pair P = (x,y) as a byte sequence using SEC1's compressed form: (0x02 or 0x03) || ser256(x), where the header byte depends on the parity of the omitted y coordinate.

For everything not defined above, we use the notation from BIP340.
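
As an illustration of the conventions above, here is a minimal Python sketch of the helper functions. The function names mirror the ones in the list; the `outpoint` helper taking a txid in display (hex) order is an assumption made for convenience and is not part of the specification.

```python
from typing import List, Tuple

# secp256k1 base point, used here only as an example input for ser_p
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def ser32(i: int) -> bytes:
    """4-byte encoding of a 32-bit unsigned integer, most significant byte first."""
    return i.to_bytes(4, "big")

def ser256(p: int) -> bytes:
    """32-byte encoding of an integer, most significant byte first."""
    return p.to_bytes(32, "big")

def ser_p(point: Tuple[int, int]) -> bytes:
    """SEC1 compressed form: 0x02 for even y, 0x03 for odd y, then ser256(x)."""
    x, y = point
    header = 0x03 if y & 1 else 0x02
    return bytes([header]) + ser256(x)

def outpoint(txid_hex: str, vout: int) -> bytes:
    """36-byte outpoint. Assumes txid_hex is in display order, so it is reversed."""
    return bytes.fromhex(txid_hex)[::-1] + vout.to_bytes(4, "little")

def sort_outpoints(v: List[bytes]) -> List[bytes]:
    """Ascending lexicographic order; Python compares bytes byte by byte."""
    return sorted(v)
```

Note that `ser32` and `ser256` are big-endian while the outpoint fields are little-endian, matching the transaction serialization.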

=== Versions ===

This document defines Silent Payments v0. Version is communicated through the address in the same way as Segwit addresses. Future upgrades to silent payments will require a new version. As much as possible, future upgrades should support receiving from older wallets (e.g. a silent payments v0 wallet can send to both v0 and v1 addresses). Any changes that break compatibility with older silent payment versions should be a new BIP.

Future Silent Payments versions will use the following versioning scheme:

	0	1	2	3	4	5	6	7	Compatibility
+0	q	p	z	r	y	9	x	8	backwards compatible
+8	g	f	2	t	v	d	w	0
+16	s	3	j	n	5	4	k	h
+24	c	e	6	m	u	a	7	-

v31 (l) is reserved for a backwards incompatible change, if needed. For Silent Payments v0:

If the receiver's silent payment address version is:
- v0: check that the data part is exactly 66 bytes. Otherwise, fail
- v1 through v30: read the first 66 bytes of the data part and discard the remaining bytes (if any)
- v31: fail
Receiver addresses are always BIP341 taproot outputs[7]
The sender should sign with one of the sighash flags DEFAULT, ALL, SINGLE, NONE (ANYONECANPAY is unsafe). It is strongly recommended implementations use SIGHASH_DEFAULT when applicable, or SIGHASH_ALL[8]
Inputs used to derive the shared secret are from the Inputs For Shared Secret Derivation list
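
The version rules above could be sketched as follows. This is only an illustration: the function name `keys_from_data_part` is hypothetical, and failing on a v1 through v30 data part shorter than 66 bytes is an assumption, since the text does not cover that case.

```python
# Bech32 character set; a character's index is the version it encodes
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"

def keys_from_data_part(version_char: str, data: bytes) -> bytes:
    """Return the 66 bytes of public keys a v0 sender should use, or raise."""
    version = CHARSET.index(version_char)  # raises ValueError if not in CHARSET
    if version == 0:
        if len(data) != 66:
            raise ValueError("v0 data part must be exactly 66 bytes")
        return data
    if version <= 30:
        # Assumption: fewer than 66 bytes is treated as a failure
        if len(data) < 66:
            raise ValueError("data part shorter than 66 bytes")
        return data[:66]  # discard any bytes added by a later version
    raise ValueError("v31 is reserved for a backwards incompatible change")
```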

=== Scanning transactions ===

A transaction is a Silent Payments v0 transaction and MUST be scanned if and only if all of the following are true:

The transaction contains at least one BIP341 taproot output
The transaction has at least one input from the Inputs For Shared Secret Derivation list
The transaction does not spend a new, undefined output type (e.g. SegWit versions > 1)[9]

Otherwise, skip the transaction. This is to ensure forward compatibility with future versions of silent payments without requiring future versions to scan a transaction multiple times with different rule sets.
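
A rough sketch of this check is shown below. The `SpentOutput` and `Tx` types are hypothetical stand-ins for whatever transaction representation a wallet already has, and the script type strings are assumed labels, not identifiers from any library.

```python
from dataclasses import dataclass
from typing import List, Optional

SHARED_SECRET_INPUT_TYPES = {"P2TR", "P2WPKH", "P2SH-P2WPKH", "P2PKH"}

@dataclass
class SpentOutput:
    script_type: str                 # e.g. "P2TR", "P2WSH", "UNKNOWN"
    witness_version: Optional[int]   # None for non-SegWit outputs

@dataclass
class Tx:
    spent_outputs: List[SpentOutput]  # the outputs consumed by the inputs
    output_types: List[str]           # script type of each created output

def must_scan(tx: Tx) -> bool:
    """True if and only if all three conditions above hold."""
    has_taproot_output = "P2TR" in tx.output_types
    has_eligible_input = any(
        s.script_type in SHARED_SECRET_INPUT_TYPES for s in tx.spent_outputs)
    spends_unknown_type = any(
        s.witness_version is not None and s.witness_version > 1
        for s in tx.spent_outputs)
    return has_taproot_output and has_eligible_input and not spends_unknown_type
```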

=== Address encoding ===

A silent payment address is constructed in the following manner:

Let Bscan, bscan = Receiver's scan public key and corresponding private key
Let Bspend, bspend = Receiver's spend public key and corresponding private key
Let Bm = Bspend + m·G, where m is an optional integer tweak for labeling
- In the case of m = 0, no label is applied and Bm = Bspend
The final address is a Bech32m encoding of:
- The human-readable part "sp" for mainnet, "tsp" for testnets (e.g. signet, testnet)
- The data-part values:
  - The character "q", to represent a silent payment address of version 0
  - The 66 byte concatenation of the receiver's public keys, serP(Bscan) || serP(Bm)

Note: BIP173 imposes a 90 character limit for Bech32 strings, whereas a silent payment address requires at least 117 characters[10]. Additionally, since higher versions may add to the data field, it is recommended implementations use a limit of 1023 characters (see BIP173: Checksum design for more details).
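
For illustration, the encoding could be sketched in Python as below. The checksum routines follow the BIP350 reference algorithm; `encode_silent_payment_address` is a hypothetical name, and the keys in the usage note are placeholder bytes rather than real public keys.

```python
from typing import List

CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
BECH32M_CONST = 0x2BC830A3

def bech32_polymod(values: List[int]) -> int:
    generator = [0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3]
    chk = 1
    for value in values:
        top = chk >> 25
        chk = (chk & 0x1FFFFFF) << 5 ^ value
        for i in range(5):
            chk ^= generator[i] if ((top >> i) & 1) else 0
    return chk

def bech32_hrp_expand(hrp: str) -> List[int]:
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]

def to_5bit(data: bytes) -> List[int]:
    """Regroup 8-bit bytes into 5-bit values, zero-padding the last group."""
    acc, bits, out = 0, 0, []
    for byte in data:
        acc = (acc << 8) | byte
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append((acc >> bits) & 31)
    if bits:
        out.append((acc << (5 - bits)) & 31)
    return out

def encode_silent_payment_address(hrp: str, version: int,
                                  scan_pub: bytes, spend_pub: bytes) -> str:
    """scan_pub and spend_pub are serP(Bscan) and serP(Bm), 33 bytes each."""
    if len(scan_pub) != 33 or len(spend_pub) != 33:
        raise ValueError("expected two 33-byte compressed public keys")
    data = [version] + to_5bit(scan_pub + spend_pub)
    polymod = bech32_polymod(bech32_hrp_expand(hrp) + data + [0] * 6) ^ BECH32M_CONST
    checksum = [(polymod >> 5 * (5 - i)) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)
```

Calling `encode_silent_payment_address("tsp", 0, b"\x02" + bytes(32), b"\x03" + bytes(32))` produces a string beginning with `tsp1q`; the placeholder keys only exercise the encoding and are not claimed to be valid curve points.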

=== Outpoints hash ===

The sender and receiver MUST calculate an outpoints hash for the transaction in the following manner:

Collect each outpoint used as an input to the transaction
Sort the outpoints with sortoutpoints(outpoints)[11]
Let outpoints = outpoint0 || … || outpointn
Let outpoints_hash = sha256(outpoints)
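
A minimal sketch of these steps, assuming each input is given as a (txid, vout) pair with the txid as a hex string in display order (hence the byte reversal); the helper names are illustrative only.

```python
import hashlib
from typing import Iterable, Tuple

def serialize_outpoint(txid_hex: str, vout: int) -> bytes:
    """36 bytes: txid and vout, both least significant byte first."""
    return bytes.fromhex(txid_hex)[::-1] + vout.to_bytes(4, "little")

def outpoints_hash(inputs: Iterable[Tuple[str, int]]) -> bytes:
    """sha256 over the lexicographically sorted, concatenated outpoints."""
    outpoints = sorted(serialize_outpoint(txid, vout) for txid, vout in inputs)
    return hashlib.sha256(b"".join(outpoints)).digest()
```

Because the outpoints are sorted first, the result does not depend on the order in which the inputs appear in the transaction.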

=== Inputs For Shared Secret Derivation ===

While any UTXO with known output scripts can be used to fund the transaction, the sender and receiver MUST use inputs from the following list when deriving the shared secret:

P2TR
P2WPKH
P2SH-P2WPKH
P2PKH

Inputs with conditional branches or multiple public keys (e.g. CHECKMULTISIG) are not included as this introduces malleability and would allow a sender to re-sign with a different set of public keys after the silent payment output has been derived. This is not a concern when the sender controls all of the inputs, but is an issue for CoinJoins and other collaborative protocols, where a malicious participant can participate in deriving the silent payment address with one set of keys and then re-broadcast the transaction with signatures for a different set of public keys. P2TR can have hidden conditional branches (script path), but we work around this as described below.

==== P2TR ====

The sender MUST use the private key corresponding to the taproot output key (i.e. the tweaked private key for a key path spend). This can be a single private key or an aggregate key (e.g. taproot outputs using MuSig2 or FROST)[12]. If this key is not available, the output cannot be included as an input to the transaction. The receiver always uses the taproot output key when scanning, regardless of whether the taproot output is using a key path spend or a script path spend[13].

The one exception is script path spends that use NUMS point H as their internal key (where H = lift_x(0x50929b74c1a04954b78b4b6035e97a5e078a5a0f28ec96d547bfee9ace803ac0) which is constructed by taking the hash of the standard uncompressed encoding of the secp256k1 base point G as X coordinate, see BIP341: Constructing and spending Taproot outputs for more details), in which case the output will be skipped for the purposes of shared secret derivation[14].
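
As a sketch of how a receiver might detect this exception, the function below inspects the witness of a taproot input. It relies on the BIP341 layout (an optional annex starting with 0x50, and a control block whose bytes 1 to 32 hold the internal key); the function name is hypothetical.

```python
from typing import List

# x-coordinate of the NUMS point H quoted above
H_X = bytes.fromhex(
    "50929b74c1a04954b78b4b6035e97a5e078a5a0f28ec96d547bfee9ace803ac0")

def uses_nums_internal_key(witness: List[bytes]) -> bool:
    """True if this is a script path spend whose internal key is H."""
    stack = list(witness)
    # BIP341: with at least two elements, a last element starting with 0x50 is the annex
    if len(stack) >= 2 and stack[-1][:1] == b"\x50":
        stack.pop()
    if len(stack) < 2:
        return False  # key path spend: a single signature remains
    control_block = stack[-1]
    return control_block[1:33] == H_X
```

An input for which this returns True would be skipped for shared secret derivation; every other taproot input contributes its output key.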

==== P2WPKH ====

The sender performs the tweak using the private key for the output and the receiver obtains the public key from the witness.

==== P2SH-P2WPKH ====

The sender performs the tweak using the private key for the nested P2WPKH output and the receiver obtains the public key from the witness.

==== P2PKH ====

The sender performs the tweak using the private key for the output and SHOULD sign using the standard script template:

    scriptSig: <Signature> <Public Key>

The receiver obtains the public key from the scriptSig. The receiver MUST parse the scriptSig for the public key, even if the scriptSig is non-standard (e.g. <dummy> OP_DROP <Signature> <Public Key>). This is to address the third-party malleability of P2PKH scriptSigs.

=== Sender ===

==== Selecting inputs ====

The sending wallet performs coin selection as usual with the following restrictions:

At least one input MUST be from the Inputs For Shared Secret Derivation list
Exclude inputs with witness version > 1 (see Scanning transactions)
For each taproot output spent the sending wallet MUST have access to the private key corresponding to the taproot output key, unless H is used as the internal public key

==== Creating outputs ====

After the inputs have been selected, the sender can create one or more outputs for one or more silent payment addresses in the following manner:

Collect the private keys for each input from the Inputs For Shared Secret Derivation list
For each private key ai corresponding to a BIP341 taproot output, check that the private key produces a point with an even y-value and negate the private key if not[15]
Let a = a0 + a1 + … + an, where each ai has been negated if necessary
Generate the outpoints_hash, using the method described above
Group receiver silent payment addresses by Bscan (e.g. each group consists of one Bscan and one or more Bm)
For each group:
- Let ecdh_shared_secret = outpoints_hash·a·Bscan
- Let n = 0
- For each Bm in the group:
  - Let tn = sha256(serP(ecdh_shared_secret) || ser32(n))
  - Let Pmn = Bm + tn·G
  - Encode Pmn as a BIP341 taproot output
  - Optionally, repeat with n++ to create additional outputs for the current Bm
  - If no additional outputs are required, continue to the next Bm with n++[16]
- Optionally, if the sending wallet implements receiving silent payments, it can create change outputs in the following manner:
  - Let Achange = Aspend + sha256(ser256(ascan))·G
  - Let change_shared_secret = outpoints_hash·a·Ascan
  - Let n = 0
  - For each change output desired:
    - Let cn = sha256(serP(change_shared_secret) || ser32(n))
    - Let Cn = Achange + cn·G
    - Encode Cn as a BIP341 taproot output
    - Repeat with n++ for additional change outputs
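
To make the steps concrete, here is an unoptimized Python sketch of the payment outputs (change outputs follow the same pattern and are omitted). It uses textbook affine secp256k1 arithmetic that is not constant time and must not be used with real keys. Interpreting outpoints_hash and tn as big-endian integers is an assumption, as is the `create_outputs` signature.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

Point = Optional[Tuple[int, int]]  # None is the point at infinity
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a: Point, b: Point) -> Point:
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], P - 2, P) % P
    else:
        lam = (b[1] - a[1]) * pow(b[0] - a[0], P - 2, P) % P
    x = (lam * lam - a[0] - b[0]) % P
    return (x, (lam * (a[0] - x) - a[1]) % P)

def mul(k: int, pt: Point) -> Point:
    result = None
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

def ser_p(pt: Tuple[int, int]) -> bytes:
    return bytes([2 + (pt[1] & 1)]) + pt[0].to_bytes(32, "big")

def create_outputs(input_keys: List[Tuple[int, bool]], outpoints_hash: bytes,
                   groups: Dict[Tuple[int, int], List[Tuple[int, int]]]) -> List[bytes]:
    """input_keys: (private key, is_taproot) pairs; groups: Bscan -> list of Bm.
    Returns the 32-byte x-only taproot output keys."""
    a = 0
    for key, is_taproot in input_keys:
        # Taproot keys are x-only, so use the key whose point has even y
        if is_taproot and mul(key, G)[1] & 1:
            key = N - key
        a = (a + key) % N
    h = int.from_bytes(outpoints_hash, "big")  # assumption: big-endian scalar
    outputs = []
    for b_scan, b_ms in groups.items():
        shared = mul(h * a % N, b_scan)
        n = 0
        for b_m in b_ms:
            t_n = int.from_bytes(
                hashlib.sha256(ser_p(shared) + n.to_bytes(4, "big")).digest(), "big")
            p_mn = add(b_m, mul(t_n, G))
            outputs.append(p_mn[0].to_bytes(32, "big"))
            n += 1
    return outputs
```

The counter n increases across every Bm in a group, so two labeled addresses of the same receiver never share the same tn.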

=== Receiver ===

==== Key Derivation ====

Two keys are needed to create a silent payments address: the spend key and the scan key. While these keys can be generated independently, wallet software SHOULD use BIP32 derivation[17] to ensure compatibility across wallets.

A scan and spend key pair using BIP32 derivation are defined (taking inspiration from BIP44) in the following manner:

     Scan private key: m / purpose' / coin_type' / account' / 1' / 0
    Spend private key: m / purpose' / coin_type' / account' / 0' / 0

Wallet software MUST use hardened derivation to ensure the master key is not exposed in the event the scan private key is compromised. Purpose is a constant set to 352 following the BIP43 recommendation. Refer to BIP43 and BIP44 for more details.
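
The two paths can be written out with a small helper such as the one below. The function name is hypothetical, and the default coin_type of 0 assumes Bitcoin mainnet as registered for BIP44.

```python
PURPOSE = 352  # constant defined above, following BIP43

def silent_payment_paths(coin_type: int = 0, account: int = 0) -> dict:
    """Return the BIP32 paths of the scan and spend private keys."""
    base = f"m/{PURPOSE}'/{coin_type}'/{account}'"
    return {
        "scan": f"{base}/1'/0",   # hardened change-level index 1
        "spend": f"{base}/0'/0",  # hardened change-level index 0
    }
```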

==== Scanning ====

If each of the checks in Scanning transactions passes, the receiving wallet must:

Generate the outpoints_hash, using the method described above
Let A = A0 + A1 + … + An, where each Ai is the public key of an input from the Inputs For Shared Secret Derivation list
Let ecdh_shared_secret = outpoints_hash·bscan·A
Check for outputs:
- Let outputs_to_check = the taproot output key from each unspent taproot output in the transaction
- Starting with n = 0:
  - Let tn = sha256(serP(ecdh_shared_secret) || ser32(n))
  - Compute Pn = Bspend + tn·G
  - For each output in outputs_to_check:
    - If Pn equals output:
      - Add Pn to the wallet
      - Remove output from outputs_to_check and rescan outputs_to_check with n++
    - Else, if the wallet has precomputed labels (including the change label, if used)[18]:
      - Compute m·G = output - Pn
      - Check if m·G exists in the list of labels used by the wallet
      - If a match is found:
        - Add Pn + m·G to the wallet
        - Remove output from outputs_to_check and rescan outputs_to_check with n++
      - If the label is not found, negate output and check again
  - If no matches are found, stop
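
The scanning loop could be sketched as follows. As with any illustrative code, the affine secp256k1 arithmetic is unoptimized and not constant time. The `scan` signature, the representation of labels as a dictionary from serP(m·G) to m, and the big-endian interpretation of the hashes are assumptions.

```python
import hashlib
from typing import Dict, List, Optional, Tuple

Point = Optional[Tuple[int, int]]  # None is the point at infinity
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a: Point, b: Point) -> Point:
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], P - 2, P) % P
    else:
        lam = (b[1] - a[1]) * pow(b[0] - a[0], P - 2, P) % P
    x = (lam * lam - a[0] - b[0]) % P
    return (x, (lam * (a[0] - x) - a[1]) % P)

def mul(k: int, pt: Point) -> Point:
    result = None
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

def neg(pt: Tuple[int, int]) -> Tuple[int, int]:
    return (pt[0], P - pt[1])

def ser_p(pt: Tuple[int, int]) -> bytes:
    return bytes([2 + (pt[1] & 1)]) + pt[0].to_bytes(32, "big")

def lift_x(x: int) -> Point:
    """Even-y point for an x-only key, or None if x is not on the curve."""
    y_sq = (pow(x, 3, P) + 7) % P
    y = pow(y_sq, (P + 1) // 4, P)
    if y * y % P != y_sq:
        return None
    return (x, y if y % 2 == 0 else P - y)

def scan(b_scan: int, B_spend: Tuple[int, int], A: Tuple[int, int],
         outpoints_hash: bytes, output_xs: List[int],
         labels: Dict[bytes, int]) -> List[Tuple[int, int, Optional[int]]]:
    """Returns (output x, n, label m or None) for each output found."""
    h = int.from_bytes(outpoints_hash, "big")  # assumption: big-endian scalar
    shared = mul(h * b_scan % N, A)
    remaining, found, n = list(output_xs), [], 0
    while True:
        t_n = int.from_bytes(
            hashlib.sha256(ser_p(shared) + n.to_bytes(4, "big")).digest(), "big")
        p_n = add(B_spend, mul(t_n, G))
        match = None
        for x in remaining:
            if x == p_n[0]:
                match = (x, n, None)
                break
            out = lift_x(x)
            if out is None or not labels:
                continue
            # The output key is x-only, so try it and its negation
            for cand in (out, neg(out)):
                m = labels.get(ser_p(add(cand, neg(p_n))))
                if m is not None:
                    match = (x, n, m)
                    break
            if match:
                break
        if match is None:
            return found
        found.append(match)
        remaining.remove(match[0])
        n += 1
```

The label lookup is a single subtraction and dictionary access per candidate, which is the reason for precomputing m·G for every label in use.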

==== Backup and Recovery ====

Since each silent payment output address is derived independently, regular backups are recommended. When recovering from a backup, the wallet will need to scan since the last backup to detect new payments.

If using a seed/seed phrase only style backup, the user can recover the wallet's unspent outputs from the UTXO set (i.e. only scanning transactions with at least one unspent taproot output) and can recover the full wallet history by scanning the blockchain starting from the wallet birthday. If a wallet uses labels or generates its change addresses using the change label, this information SHOULD be included in the backup. If the user does not know whether labels or the change label were used, it is strongly recommended they always check for the change label when recovering from backup and precompute a large number of labels (e.g. 100k labels) to use when re-scanning. This ensures that the wallet can recover all funds from only a seed/seed phrase backup.

== Backward Compatibility ==

Silent payments introduces a new address format and protocol for sending and as such is not compatible with older wallet software or wallets which have not implemented the silent payments protocol.

== Rationale and References ==

Why not use out-of-band notifications? Out-of-band notifications (e.g. using something other than the Bitcoin blockchain) have been proposed as a way of addressing the privacy and cost concerns of using the Bitcoin blockchain as a messaging layer. This, however, simply moves the privacy and cost concerns somewhere else and increases the risk of losing money due to a notification not being reliably delivered, or even censored, and makes this notification data critical for backup to recover funds.
Why allow for more than one output? Allowing Alice to break her payment to Bob into multiple amounts opens up a number of privacy improving techniques for Alice, making the transaction look like a CoinJoin or better hiding the change amount by splitting both the payment and change outputs into multiple amounts. It also allows for Alice and Carol to both have their own unique output paying Bob in the event they are in a collaborative transaction and both paying Bob's silent payment address.
What about inputs without public keys? Inputs without public keys can still be spent in the transaction but are simply ignored in the Silent Payments protocol.
How does using all inputs help light clients? If Alice uses a random input for the tweak, Bob necessarily has to have access to and check all transaction inputs, which requires performing an ECC multiplication per input. If instead Alice performs the tweak with the sum of the input public keys, Bob only needs the summed 33-byte public key per transaction and only does one ECC multiplication per transaction. Bob can then use BIP158 block filters to determine if any of the outputs exist in a block and thus avoids downloading transactions which don't belong to him. It is still an open question as to how Bob can source the 33 bytes per transaction in a trustless manner, see Appendix A: Light Client Support for more details.
Why does using all inputs matter for CoinJoin? If Alice uses a random input to create the output for Bob, this necessarily reveals to Bob which input Alice has control of. If Alice is paying Bob as part of a CoinJoin, this would reveal which input belongs to her, degrading the anonymity set of the CoinJoin and giving Bob more information about Alice. If instead all inputs are used, Bob has no way of knowing which input(s) belong to Alice. This comes at the cost of increased complexity as the CoinJoin participants now need to coordinate to create the silent payment output and would need to use Blind-Diffie-Hellman to prevent the other participants from learning who Alice is paying.
Why are outpoints little-endian? Despite using big-endian throughout the rest of the BIP, outpoints are sorted and hashed matching their transaction serialization, which is little-endian. This allows a wallet to parse a serialized transaction for use in silent payments without needing to re-order the bytes when computing the outpoints hash. Note: despite outpoints being stored and serialized as little-endian, the transaction hash (txid) is always displayed as big-endian.
Why only taproot outputs? Providing too much optionality for the protocol makes it difficult to implement and can be at odds with the goal of providing the best privacy. Limiting to taproot outputs helps simplify the implementation significantly while also putting users in the best eventual anonymity set.
Why recommend SIGHASH_[DEFAULT|ALL]? Since the output address for the receiver is derived from the sum of the Inputs For Shared Secret Derivation public keys, the inputs must not change once the sender has signed the transaction. If the inputs are allowed to change after the fact, the receiver will not be able to calculate the shared secret needed to find and spend the output. It is currently an open question on how a future version of silent payments could be made to work with new sighash flags such as SIGHASH_GROUP and SIGHASH_ANYPREVOUT.
Why skip transactions that spend unknown output scripts? Skipping transactions that spend unknown output scripts allows us to have a clean upgrade path for Silent Payments by avoiding the need to scan the same transaction multiple times with different rule sets. If a fancy new output type is added in the future and Silent Payments v1 is released with support, we would want to avoid having to first scan the transaction with the silent payment v0 rules and then again with the silent payment v1 rules. Note: this restriction only applies to the inputs of a transaction.
Why do silent payment addresses need at least 117 characters? A silent payment address is a bech32m encoding comprised of the following parts:
- HRP [2-3]
- separator [1]
- version [1-2]
- payload, 66 bytes concatenated pubkeys [ceil(66*8/5)]
- checksum [6]
For a silent payments v0 address, this results in a 117 character address when using a 3 character HRP. Future versions of silent payment addresses may add to the payload, which is why a 1023 character limit is suggested.
Why are outpoints sorted before hashing? This way the silent payment output does not need to be recalculated if the wallet changes the order of inputs, e.g. at signing time or during an RBF bump.
Are key aggregation techniques like FROST and MuSig2 supported? Any taproot output able to do a key path spend is supported. While a full specification of how to do this securely is outside the scope of this BIP, in theory any offline key aggregation technique can be used, such as FROST or MuSig2. This would require participants to perform the ECDH step collaboratively e.g. ECDH = a0·Bscan + a1·Bscan + ... + at·Bscan and P = Bspend + hash(outpoints_hash·ECDH || 0)·G. Additionally, it may be necessary for the participants to provide a DLEQ proof to ensure they are not acting maliciously.
Why not skip all taproot script path spends? This causes malleability issues for CoinJoins. If the silent payments protocol skipped taproot script path spends, this would allow an attacker to join a CoinJoin round, participate in deriving the silent payment address using the tweaked private key for a key path spend, and then broadcast their own version of the transaction using the script path spend. If the receiver were to only consider key path spends, they would skip the attacker's script path spend input when deriving the shared secret and not be able to find the funds. Additionally, there may be scenarios where a sender has access to the key path private key but spends the output using the script path.
Why skip outputs with H as the internal taproot key? If use cases get popularized where the taproot key path cannot be used, these outputs can still be included without getting in the way of making a silent payment, provided they specifically use H as their internal taproot key.
Why do taproot private keys need to be checked? Recall from BIP340 that each x-only public key has two corresponding private keys, d and n - d. To maintain parity between sender and receiver, it is necessary to use the private key corresponding to the even y-value when performing the ECDH step since the receiver will assume the even y-value when summing the taproot x-only public keys.
Why not re-use tn when paying different labels to the same receiver? If paying the same entity but to two separate labeled addresses in the same transaction without incrementing n, the two outputs would be Bspend + tn·G + i·G and Bspend + tn·G + j·G. The attacker could subtract the two values and observe that the distance between i and j is small. This would allow them to deduce that this transaction is a silent payment transaction and that a single entity received two outputs, but won't tell them who the entity is.
Why use BIP32 hardened derivation? Using BIP32 derivation allows users to add silent payments to an existing master seed. It also ensures that a user's silent payment funds are recoverable in any wallet which supports BIP32 derivation. Using hardened derivation ensures that it is safe to export the scan private key without exposing the master key or spend private key.
Why precompute labels? Precomputing the labels is not strictly necessary: a wallet could track the max number of labels it has used (call it M) and scan for labels by adding m·G to P0 for each label m up to M and comparing to the transaction outputs. This is more performant than precomputing the labels and checking via subtraction in cases where the number of eligible outputs exceeds the number of labels in use. In practice this will mainly apply to users that choose never to use labels, or users that use a single label for generating silent payment change outputs. If using a large number of labels, the wallet would need to add all possible labels to each output. This ends up being n·M additions, where n is the number of outputs in the transaction and M is the number of labels in the wallet. By precomputing the labels, the wallet only needs to compute m·G once when creating the labeled address and can determine if a label was used via a lookup, rather than adding each label to each output.
Data for Appendix A These numbers are based on data from January 2023 until June 2023 (the last 6 months of data at the time of writing). See Silent payments light client data for the full analysis.