* [bitcoin-dev] BIP proposal - addrv2 message
From: Wladimir J. van der Laan @ 2019-02-18 7:56 UTC (permalink / raw)
To: bitcoin-dev
See https://gist.github.com/laanwj/4fe8470881d7b9499eedc48dc9ef1ad1 for a formatted version.
Look under "Considerations" for topics that might still need to be discussed.
<pre>
BIP: ???
Layer: Peer Services
Title: addrv2 message
Author: Wladimir J. van der Laan <laanwj@gmail.com>
Comments-Summary: No comments yet.
Comments-URI:
Status: Draft
Type: Standards Track
Created: 2018-06-01
License: BSD-2-Clause
</pre>
==Introduction==
===Abstract===
This document proposes a new P2P message to gossip longer node addresses over the P2P network.
This is required to support new-generation Onion addresses, I2P, and potentially other networks
that have longer endpoint addresses than fit in the 128 bits of the current <code>addr</code> message.
===Copyright===
This BIP is licensed under the 2-clause BSD license.
===Motivation===
Tor v3 hidden services have been part of the stable release of Tor since version 0.3.2.9. They have
various advantages compared to the old hidden services, including better encryption and privacy
<ref>[https://gitweb.torproject.org/torspec.git/tree/rend-spec-v3.txt Tor Rendezvous Specification - Version 3]</ref>.
These services have 256-bit addresses and thus do not fit in the existing <code>addr</code> message, which encapsulates onion addresses in OnionCat IPv6 addresses.
Other transport-layer protocols such as I2P have always used longer
addresses. This change would make it possible to gossip such addresses over the
P2P network, so that other peers can connect to them.
==Specification==
<blockquote>
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
interpreted as described in RFC 2119<ref>[https://tools.ietf.org/html/rfc2119 RFC 2119]</ref>.
</blockquote>
The <code>addrv2</code> message is defined as a message where <code>pchCommand == "addrv2"</code>.
It is serialized in the standard encoding for P2P messages.
Its format is similar to the current <code>addr</code> message format
<ref>[https://bitcoin.org/en/developer-reference#addr Bitcoin Developer Reference: addr message]</ref>, with the difference that the
fixed 16-byte IP address is replaced by a network ID and a variable-length address, and the time and services fields are encoded as VARINT.
This means that the message contains a serialized <code>std::vector</code> of the following structure:
{| class="wikitable" style="width: auto; text-align: center; font-size: smaller; table-layout: fixed;"
!Type
!Name
!Description
|-
| <code>VARINT</code> (unsigned)
| <code>time</code>
| Time that this node was last seen as connected to the network. A time in Unix epoch time format, up to 64 bits wide.
|-
| <code>VARINT</code> (unsigned)
| <code>services</code>
| Service bits. A 64-bit wide bit field.
|-
| <code>uint8_t</code>
| <code>networkID</code>
| Network identifier. An 8-bit value that specifies which network is addressed.
|-
| <code>std::vector<uint8_t></code>
| <code>addr</code>
| Network address. The interpretation depends on networkID.
|-
| <code>uint16_t</code>
| <code>port</code>
| Network port. If not relevant for the network this MUST be 0.
|}
One message can contain up to 1,000 addresses. Clients SHOULD reject messages with more addresses.
Field <code>addr</code> has a variable length, with a maximum of 32 bytes (256 bits). Clients SHOULD reject
longer addresses.
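As a rough illustration, the entry structure and the two limits stated above could be sketched as follows. The names <code>AddrV2Entry</code>, <code>MAX_ADDRV2_ENTRIES</code>, <code>MAX_ADDRV2_ADDR_SIZE</code> and <code>WithinAddrV2Limits</code> are assumptions made for this example and are not part of the proposal. The sketch only models the in-memory form and does not attempt the wire encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical in-memory form of one addrv2 entry; fields follow the table above.
struct AddrV2Entry {
    uint64_t time;              // last-seen time, Unix epoch (VARINT on the wire)
    uint64_t services;          // service bit field (VARINT on the wire)
    uint8_t network_id;         // selects how `addr` is interpreted
    std::vector<uint8_t> addr;  // variable-length network address
    uint16_t port;              // 0 if not relevant for the network
};

// Limits taken from the text; the constant names are illustrative.
static const std::size_t MAX_ADDRV2_ENTRIES = 1000;
static const std::size_t MAX_ADDRV2_ADDR_SIZE = 32;

// Returns true if the message respects both the entry-count and address-size limits.
bool WithinAddrV2Limits(const std::vector<AddrV2Entry>& entries)
{
    if (entries.size() > MAX_ADDRV2_ENTRIES) return false;
    for (const AddrV2Entry& entry : entries) {
        if (entry.addr.size() > MAX_ADDRV2_ADDR_SIZE) return false;
    }
    return true;
}
```

A receiver following the SHOULD rules would drop any message for which this check returns false.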
The list of reserved network IDs is as follows:
{| class="wikitable" style="width: auto; text-align: center; font-size: smaller; table-layout: fixed;"
!Network ID
!Enumeration
!Address length (bytes)
!Description
|-
| <code>0x01</code>
| <code>IPV4</code>
| 4
| IPv4 address (globally routed internet)
|-
| <code>0x02</code>
| <code>IPV6</code>
| 16
| IPv6 address (globally routed internet)
|-
| <code>0x03</code>
| <code>TORV2</code>
| 10
| Tor v2 hidden service address
|-
| <code>0x04</code>
| <code>TORV3</code>
| 32
| Tor v3 hidden service address
|-
| <code>0x05</code>
| <code>I2P</code>
| 32
| I2P overlay network address
|-
| <code>0x06</code>
| <code>CJDNS</code>
| 16
| Cjdns overlay network address
|}
To allow for future extensibility, clients MUST ignore address types that they do not know about.
Clients MAY store and gossip address formats that they do not know about. Further network ID numbers MUST be reserved in a new BIP document.
Clients SHOULD reject addresses whose length differs from the one specified in this table for the given network ID, as these are meaningless.
See the appendices for the address encodings to be used for the various networks.
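One possible way to express the table and the handling rules in code is sketched below. The enumeration values and lengths come from the table; the names <code>ExpectedAddrLength</code>, <code>AddrCheck</code> and <code>CheckAddr</code> are hypothetical, and using 0 to signal an unknown network ID is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reserved network IDs from the table above.
enum class NetworkID : uint8_t {
    IPV4 = 0x01,
    IPV6 = 0x02,
    TORV2 = 0x03,
    TORV3 = 0x04,
    I2P = 0x05,
    CJDNS = 0x06,
};

// Returns the required address length in bytes, or 0 for an unknown network ID.
std::size_t ExpectedAddrLength(uint8_t network_id)
{
    switch (static_cast<NetworkID>(network_id)) {
    case NetworkID::IPV4: return 4;
    case NetworkID::IPV6: return 16;
    case NetworkID::TORV2: return 10;
    case NetworkID::TORV3: return 32;
    case NetworkID::I2P: return 32;
    case NetworkID::CJDNS: return 16;
    default: return 0;
    }
}

// Hypothetical outcome of checking one entry against the table.
enum class AddrCheck { ACCEPT, REJECT_BAD_LENGTH, IGNORE_UNKNOWN };

AddrCheck CheckAddr(uint8_t network_id, std::size_t addr_len)
{
    const std::size_t expected = ExpectedAddrLength(network_id);
    if (expected == 0) return AddrCheck::IGNORE_UNKNOWN;  // unknown type: ignore, do not fail
    return addr_len == expected ? AddrCheck::ACCEPT : AddrCheck::REJECT_BAD_LENGTH;
}
```

Unknown IDs are reported separately from length errors so that a caller can ignore the former while rejecting the latter.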
==Compatibility==
Send <code>addrv2</code> messages only to peers that have a certain protocol version (or higher), and send such peers <code>addrv2</code> exclusively:
<source lang="c++">
//! gossiping using `addrv2` messages starts with this version
static const int GOSSIP_ADDRV2_VERSION = 70016;
</source>
For older peers, keep sending the legacy <code>addr</code> message, skipping addresses of the newly introduced address types.
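A minimal sketch of this gating logic follows. Only <code>GOSSIP_ADDRV2_VERSION</code> comes from the text; the helper names are hypothetical, and the choice of which network IDs fit the legacy message (IPV4, IPV6 and TORV2 via OnionCat) is an assumption of this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

//! gossiping using `addrv2` messages starts with this version
static const int GOSSIP_ADDRV2_VERSION = 70016;

// True if the peer should receive addrv2 instead of the legacy addr message.
bool PeerGetsAddrV2(int peer_version)
{
    return peer_version >= GOSSIP_ADDRV2_VERSION;
}

// Assumption: only IPV4 (0x01), IPV6 (0x02) and TORV2 (0x03) fit the legacy message.
bool FitsLegacyAddr(uint8_t network_id)
{
    return network_id == 0x01 || network_id == 0x02 || network_id == 0x03;
}

// Returns the subset of network IDs that can be announced to a peer of this version.
std::vector<uint8_t> SendableNetworks(int peer_version, const std::vector<uint8_t>& ids)
{
    if (PeerGetsAddrV2(peer_version)) return ids;
    std::vector<uint8_t> legacy;
    for (uint8_t id : ids) {
        if (FitsLegacyAddr(id)) legacy.push_back(id);
    }
    return legacy;
}
```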
==Reference implementation==
The reference implementation is available at (to be done)
==Considerations==
(to be discussed)
* ''Clients MAY store and gossip address formats that they do not know about'': does it ever make sense to gossip addresses outside a certain overlay network? Say, I2P addresses to Tor? I'm not sure, especially for networks that have no exit nodes, as there is no overlap with the globally routed internet at all.
* Lower precision of the <code>time</code> field? Seconds precision seems overkill and can even be harmful: there have been attacks that exploited high-precision timestamps to map the current network topology.
** (gmaxwell) If you care about space, the time field could easily be reduced to 16 bits. Turn it into a "time ago seen" quantized to 1 hour precision. (IIRC we quantize times to 2hrs regardless).
* Rolling <code>port</code> into <code>addr</code>, or making the port optional, would make it possible to shave off two bytes for address types that don't have ports (however, all of the currently listed formats have a concept of port). It could also be an optional data item (see below).
* (gmaxwell) Optional (per-service) data could be useful for various things:
** Node-flavors for striping (signalling which slice of the blocks the node has in selective pruning)
** Payload for alternative ports for other transports (e.g. UDP ports)
** If we want optional flags, I guess the best thing would just be a byte to include the count of them, then a byte "type" for each one where the type also encodes if the payload is 0/8/16/32 bits (using the two MSB of the type to encode the length). And then bound the count of them so that the total is still reasonably sized.
==Acknowledgements==
- Jonas Schnelli: change <code>services</code> field to VARINT, to make the message more compact in the likely case instead of always using 8 bytes.
- Luke-Jr: change <code>time</code> field to VARINT, for post-2038 compatibility.
- Gregory Maxwell: various suggestions regarding extensibility
==Appendix A: Tor v2 address encoding==
The new message introduces a separate network ID for <code>TORV2</code>.
Clients MUST send Tor v2 hidden service addresses with this network ID, with the 80-bit hidden service ID in the address field. This is the same as the representation in the legacy <code>addr</code> message, minus the 6-byte prefix of the OnionCat wrapping.
Clients SHOULD ignore OnionCat (<code>fd87:d87e:eb43::/48</code>) addresses on receive if they come with the <code>IPV6</code> network ID.
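The relationship between the legacy OnionCat form and the <code>TORV2</code> address field can be sketched as below. The prefix bytes follow from <code>fd87:d87e:eb43::/48</code>; the names <code>ONIONCAT_PREFIX</code>, <code>IsOnionCat</code> and <code>UnwrapOnionCat</code> are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// The 6-byte OnionCat prefix fd87:d87e:eb43::/48.
static const std::array<uint8_t, 6> ONIONCAT_PREFIX = {{0xFD, 0x87, 0xD8, 0x7E, 0xEB, 0x43}};

// True if the 16-byte IPv6 address lies in the OnionCat range.
bool IsOnionCat(const std::array<uint8_t, 16>& ipv6)
{
    return std::equal(ONIONCAT_PREFIX.begin(), ONIONCAT_PREFIX.end(), ipv6.begin());
}

// Strips the OnionCat prefix, leaving the 80-bit hidden service ID that goes
// into the TORV2 address field. Returns false for addresses outside the range.
bool UnwrapOnionCat(const std::array<uint8_t, 16>& ipv6, std::array<uint8_t, 10>& torv2_out)
{
    if (!IsOnionCat(ipv6)) return false;
    std::copy(ipv6.begin() + ONIONCAT_PREFIX.size(), ipv6.end(), torv2_out.begin());
    return true;
}
```

A receiver could use <code>IsOnionCat</code> to spot entries that arrive with the <code>IPV6</code> network ID but should have been sent as <code>TORV2</code>.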
==Appendix B: Tor v3 address encoding==
According to the spec <ref>[https://gitweb.torproject.org/torspec.git/tree/rend-spec-v3.txt Tor Rendezvous Specification - Version 3: Encoding onion addresses]</ref>, next-gen <code>.onion</code> addresses are encoded as follows:
<pre>
onion_address = base32(PUBKEY | CHECKSUM | VERSION) + ".onion"
CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2]
where:
- PUBKEY is the 32 bytes ed25519 master pubkey of the hidden service.
- VERSION is an one byte version field (default value '\x03')
- ".onion checksum" is a constant string
- CHECKSUM is truncated to two bytes before inserting it in onion_address
</pre>
Tor v3 addresses MUST be sent with the <code>TORV3</code> network ID, with the 32-byte PUBKEY part in the address field. As VERSION will always be '\x03' in the case of v3 addresses, this is enough to reconstruct the onion address.
==Appendix C: I2P address encoding==
Like Tor, I2P naming uses a base32-encoded address format<ref>[https://geti2p.net/en/docs/naming#base32 I2P: Naming and address book]</ref>.
I2P uses 52 characters (256 bits) to represent the full SHA-256 hash, followed by <code>.b32.i2p</code>.
I2P addresses MUST be sent with the <code>I2P</code> network ID, with the decoded SHA-256 hash as address field.
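For illustration, turning the 32-byte address field back into a <code>.b32.i2p</code> hostname could look like the sketch below. It assumes the RFC 4648 base32 alphabet in lowercase with padding stripped; the function names <code>Base32Lower</code> and <code>I2PHostname</code> are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Base32 with the RFC 4648 alphabet, lowercase, without '=' padding.
std::string Base32Lower(const std::vector<uint8_t>& data)
{
    static const char ALPHABET[] = "abcdefghijklmnopqrstuvwxyz234567";
    std::string out;
    uint32_t buffer = 0;  // the low `bits` bits are pending output
    int bits = 0;
    for (uint8_t byte : data) {
        buffer = (buffer << 8) | byte;
        bits += 8;
        while (bits >= 5) {
            out.push_back(ALPHABET[(buffer >> (bits - 5)) & 0x1F]);
            bits -= 5;
        }
    }
    if (bits > 0) {
        // Left-align the remaining bits in a final 5-bit group.
        out.push_back(ALPHABET[(buffer << (5 - bits)) & 0x1F]);
    }
    return out;
}

// Builds the hostname from the 32-byte hash; returns an empty string on wrong length.
std::string I2PHostname(const std::vector<uint8_t>& hash)
{
    if (hash.size() != 32) return std::string();
    return Base32Lower(hash) + ".b32.i2p";
}
```

Since 256 bits do not divide evenly into 5-bit groups, the last of the 52 characters carries only one bit of the hash.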
==Appendix D: Cjdns address encoding==
Cjdns addresses are simply IPv6 addresses in the <code>fc00::/8</code> range<ref>[https://github.com/cjdelisle/cjdns/blob/6e46fa41f5647d6b414612d9d63626b0b952746b/doc/Whitepaper.md#pulling-it-all-together Cjdns whitepaper: Pulling It All Together]</ref>. They MUST be sent with the <code>CJDNS</code> network ID.
==References==
<references/>
* Re: [bitcoin-dev] BIP proposal - addrv2 message
From: Gregory Maxwell @ 2019-03-06 3:02 UTC (permalink / raw)
To: Wladimir J. van der Laan, Bitcoin Protocol Discussion
On Wed, Mar 6, 2019 at 12:22 AM Wladimir J. van der Laan via
bitcoin-dev <bitcoin-dev@lists.linuxfoundation.org> wrote:
> Field <code>addr</code> has a variable length, with a maximum of 32 bytes (256 bits). Clients SHOULD reject
> longer addresses.
Is 32 bytes long enough for I2P? It seems like there are two formats,
is there a reason we might want to use the longer one?
https://geti2p.net/en/docs/naming
Probably the spec should define the limit per address type (e.g.
sending a 32 byte IPv4 makes no sense). And either a maximum for ANY
type (so that 1000*largest size is reasonable), or a maximum size for
the message (e.g. regardless of the included size, an addr message
should never be over, say 100k).
> * ''Client MAY store and gossip address formats that they do not know about'': does it ever make sense to gossip addresses outside a certain overlay network? Say, I2P addresses to Tor? I'm not sure. Especially for networks that have no exit nodes as there is no overlap with the globally routed internet at all.
I think clients should be discouraged from gossiping stuff they cannot
test but not forbidden from doing so. Separately, they should be
strongly discouraged from gossiping types they don't understand at
all. We don't really want to see people doing file xfer over invalid
addr types. :)
* Re: [bitcoin-dev] BIP proposal - addrv2 message
From: Sjors Provoost @ 2019-03-06 9:05 UTC (permalink / raw)
To: Bitcoin Protocol Discussion, Wladimir J. van der Laan
Concept ACK.
> ==Considerations==
>
> (to be discussed)
>
> * ''Client MAY store and gossip address formats that they do not know about'': does it ever make sense to gossip addresses outside a certain overlay network? Say, I2P addresses to Tor? I'm not sure. Especially for networks that have no exit nodes as there is no overlap with the globally routed internet at all.
What exactly do you mean by "do not know about"? It could mean:
1. A new Network ID was recently introduced which an older node doesn't know about.
In that case the node won't even know the address length, so it can't parse the entry.
In fact it can't parse the entire address message if a single address has an unknown format. Maybe require a single address type per addrv2 message?
2. The Network ID doesn't match the network the node received this message on
The node should probably be agnostic about where it received this information from.
3. The node currently doesn't support a Network ID
But what does that mean? No connection? An explicitly disabled setting? A missing dependency? The operating system doesn't support it?
I think "MAY" is the correct choice for storing for (2).
For (3) I think it makes sense for nodes to store information even if they're disconnected, but not if they have a setting disabled or no driver. Though that implementation detail doesn't seem relevant to the standard.
I don't think it's a good idea to gossip information you can't at least in theory verify, but we already do that with Tor V2. It's useful to gossip information about other networks to help e.g. IPv4 nodes bootstrap Tor connections. On the other hand, that could also help an attacker link them. We could recommend that with addrv2 the node should make sure gossip messages were received on the correct interface, but that may not be practical.
> * Lower precision of <code>time</code> field? seconds precision seems overkill, and can even be harmful, there have been attacks that exploited high precision timestamps for mapping the current network topology.
>
> ** (gmaxwell) If you care about space time field could be reduced to 16 bits easily. Turn it into a "time ago seen" quantized to 1 hour precision. (IIRC we quantize times to 2hrs regardless).
That seems like a good idea.
> * (gmaxwell) Optional (per-service) data could be useful for various things:
> [...]
> ** If we want optional flags. I guess the best thing would just be a byte to include the count of them, then a byte "type" for each one where the type also encodes if the payload is 0/8/16/32 bits. (using the two MSB of the type to encode the length). And then bound the count of them so that the total is still reasonably sized.
Adding more information seems useful, though also creates more topology mapping opportunities.
- Sjors