The Ethereum devp2p and discv4 protocol Part I

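Description

This article comes in two parts:

  • Part I: covers Ethereum's devp2p and the discv4 node discovery protocol
  • Part II: hands-on, showing how to collect information about all Ethereum nodes (ip, port, nodeId, client, type, os)

Main Text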

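devp2p is an application-layer networking protocol and an important protocol for communication among nodes in a peer-to-peer network. Nodes may support any number of sub-protocols; devp2p handles negotiation of the sub-protocols supported on both sides and carries their messages over a single connection.

Nodes communicate by sending messages using RLPx. Nodes are free to advertise and accept connections on any TCP port they wish; the default port on which connections may be listened for and established is 30303. Although TCP provides a connection-oriented medium, devp2p nodes communicate in terms of packets. RLPx provides the facilities to send and receive packets.

devp2p nodes find peers through a discovery protocol built on a DHT. (Part II builds an application around this DHT protocol.)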

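Messages

Messages are encoded using the RLP serialization format, and many different types of payload can be encoded within a message. The "type" is always determined by the first entry of the packet RLP, interpreted as an integer.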
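
Since every message is RLP-encoded, it may help to see how small the encoding rules are. The following is a minimal sketch of an RLP encoder, assuming items are byte strings, non-negative integers, or nested lists of these; the function names are illustrative and not taken from any client.

```python
def _length_prefix(length, offset):
    """Prefix for a string (offset 0x80) or list (offset 0xC0) payload."""
    if length <= 55:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item):
    """Encode bytes, non-negative ints, or (nested) lists of them as RLP."""
    if isinstance(item, int):
        # Integers are big-endian byte strings without leading zeros;
        # zero becomes the empty byte string.
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, (bytes, bytearray)):
        data = bytes(item)
        if len(data) == 1 and data[0] < 0x80:
            return data  # a single low byte is its own encoding
        return _length_prefix(len(data), 0x80) + data
    if isinstance(item, (list, tuple)):
        payload = b"".join(rlp_encode(x) for x in item)
        return _length_prefix(len(payload), 0xC0) + payload
    raise TypeError(f"cannot RLP-encode {type(item).__name__}")
```

For example, `rlp_encode([b"cat", b"dog"])` produces a one-byte list prefix followed by the two encoded strings. Decoding is omitted to keep the sketch short.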

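devp2p supports arbitrary sub-protocols on top of the basic wire protocol. Each sub-protocol is given as much message-ID space as it needs. All such protocols must statically specify how many message IDs they require. On connecting and receiving the Hello message, both peers have equivalent information about which sub-protocols they share (including versions) and can therefore agree on the composition of the message-ID space.

Message IDs are assumed to be compact from ID 0x10 onwards (0x00-0x10 is reserved for devp2p messages) and are handed out to each shared (equal-version, equal-name) sub-protocol in alphabetical order. Sub-protocols that are not shared are ignored. If multiple versions of the same (equal-name) sub-protocol are shared, the numerically highest wins and the others are ignored.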
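
The allocation rule above can be sketched in a few lines. This is a minimal illustration, not code from any client: `assign_offsets` and its parameters are hypothetical names, and the per-protocol message counts have to be supplied by the caller because each sub-protocol declares them statically.

```python
BASE_OFFSET = 0x10  # first message ID available to sub-protocols, per the text


def assign_offsets(local_caps, remote_caps, message_counts):
    """Map each shared sub-protocol name to (version, first message ID).

    local_caps / remote_caps: iterables of (name, version) pairs from each Hello.
    message_counts: dict {(name, version): number of message IDs required}.
    """
    shared = set(local_caps) & set(remote_caps)  # equal name AND equal version
    best = {}
    for name, version in shared:
        # If several versions of one sub-protocol are shared, the highest wins.
        best[name] = max(version, best.get(name, version))
    offsets, next_id = {}, BASE_OFFSET
    for name in sorted(best):  # alphabetical order
        version = best[name]
        offsets[name] = (version, next_id)
        next_id += message_counts[(name, version)]
    return offsets
```

Because both peers run the same deterministic rule over the same shared set, they arrive at the same offsets without any further negotiation.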

p2p Sub-protocol Messages

Hello 0x00 [p2pVersion: P, clientId: B, [[cap1: B_3, capVersion1: P], [cap2: B_3, capVersion2: P], ...], listenPort: P, nodeId: B_64] First packet sent over the connection, and sent once by both sides. No other messages may be sent until a Hello is received.

  • p2pVersion Specifies the implemented version of the P2P protocol. Now must be 1.
  • clientId Specifies the client software identity, as a human-readable string (e.g. "Ethereum(++)/1.0.0").
  • cap Specifies a peer capability name as an ASCII string, e.g. "eth" for the eth subprotocol.
  • capVersion Specifies a peer capability version as a positive integer.
  • listenPort Specifies the port that the client is listening on (on the interface that the present connection traverses). If 0 it indicates the client is not listening.
  • nodeId is the unique identity of the node and specifies a 512-bit secp256k1 public key that identifies this node.

Disconnect 0x01 [reason: P] Inform the peer that a disconnection is imminent; if received, a peer should disconnect immediately. When sending, well-behaved hosts give their peers a fighting chance (read: wait 2 seconds) to disconnect before disconnecting themselves.

  • reason is an optional integer specifying one of a number of reasons for disconnect:
    • 0x00 Disconnect requested;
    • 0x01 TCP sub-system error;
    • 0x02 Breach of protocol, e.g. a malformed message, bad RLP, incorrect magic number &c.;
    • 0x03 Useless peer;
    • 0x04 Too many peers;
    • 0x05 Already connected;
    • 0x06 Incompatible P2P protocol version;
    • 0x07 Null node identity received - this is automatically invalid;
    • 0x08 Client quitting;
    • 0x09 Unexpected identity (i.e. a different identity to a previous connection/what a trusted peer told us).
    • 0x0a Identity is the same as this node (i.e. connected to itself);
    • 0x0b Timeout on receiving a message (i.e. nothing received since sending last ping);
    • 0x10 Some other reason specific to a subprotocol.

Ping 0x02 [] Requests an immediate reply of Pong from the peer.

Pong 0x03 [] Reply to peer's Ping packet.

Node Discovery Protocol v4

This specification defines the Node Discovery protocol version 4, a Kademlia-like DHT that stores information about Ethereum nodes. The Kademlia structure was chosen because it yields a topology of low diameter.

Node Identities

Every node has a cryptographic identity, a key on the elliptic curve secp256k1. The public key of the node serves as its identifier or 'node ID'.

The 'distance' between two node IDs is the bitwise exclusive or of the hashes of the public keys, interpreted as a number.

distance(n₁, n₂) = keccak256(n₁) XOR keccak256(n₂)

Node Table

Nodes in the Discovery Protocol keep information about other nodes in their neighborhood. Neighbor nodes are stored in a routing table consisting of 'k-buckets'. For each 0 ≤ i < 256, every node keeps a k-bucket for nodes of distance between 2ⁱ and 2ⁱ⁺¹ from itself.
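
To make the distance metric and the bucket ranges concrete, here is a small sketch. It assumes the keccak256 hashes have already been computed and are passed in as 32-byte values (Keccak-256 is not in the Python standard library, and `hashlib.sha3_256` uses different padding); the function names are illustrative.

```python
def distance(hash_a: bytes, hash_b: bytes) -> int:
    """XOR of two 32-byte hashes, interpreted as a big-endian integer."""
    if len(hash_a) != 32 or len(hash_b) != 32:
        raise ValueError("expected 32-byte hashes")
    return int.from_bytes(hash_a, "big") ^ int.from_bytes(hash_b, "big")


def bucket_index(dist: int) -> int:
    """Return i such that 2**i <= dist < 2**(i + 1)."""
    if dist <= 0:
        raise ValueError("distance 0 is the node itself and has no bucket")
    # The position of the highest set bit selects the bucket.
    return dist.bit_length() - 1
```

Two hashes that differ only in their lowest bit land in bucket 0, while hashes that differ in their highest bit land in bucket 255, so roughly half of all possible nodes fall into the last bucket.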

The Node Discovery Protocol uses k = 16, i.e. every k-bucket contains up to 16 node entries. The entries are sorted by time last seen — least-recently seen node at the head, most-recently seen at the tail.

Whenever a new node N₁ is encountered, it can be inserted into the corresponding bucket. If the bucket contains less than k entries N₁ can simply be added as the first entry. If the bucket already contains k entries, the least recently seen node in the bucket, N₂, needs to be revalidated by sending a ping packet. If no reply is received from N₂ it is considered dead, removed and N₁ added to the front of the bucket.
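
The bucket update rule can be sketched as follows. Two points are assumptions rather than statements from the text: the sketch keeps entries in last-seen order with the least recently seen node at index 0 and places a newly seen node at the most recently seen end, and when the old node N₂ does answer the ping it is refreshed and the newcomer is dropped (the usual Kademlia behavior). `KBucket` and the `ping` callback are hypothetical names.

```python
from collections import deque

K = 16  # bucket size used by the Node Discovery Protocol


class KBucket:
    """Entries ordered by time last seen; index 0 is least recently seen."""

    def __init__(self, k=K):
        self.k = k
        self.entries = deque()

    def add(self, node, ping):
        """Record that `node` was seen; `ping(n)` returns True if n replies.

        Returns True if `node` is in the bucket afterwards.
        """
        if node in self.entries:
            self.entries.remove(node)
            self.entries.append(node)  # refresh: now most recently seen
            return True
        if len(self.entries) < self.k:
            self.entries.append(node)
            return True
        oldest = self.entries[0]
        if ping(oldest):
            # Assumption: a live old node is kept and refreshed,
            # and the newcomer is not inserted.
            self.entries.rotate(-1)
            return False
        self.entries.popleft()  # no reply: considered dead
        self.entries.append(node)
        return True
```

Preferring long-lived nodes over newcomers is a deliberate property of this scheme: a node that has been reachable for a long time is more likely to stay reachable.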

Endpoint Proof

To prevent traffic amplification attacks, implementations must verify that the sender of a query participates in the discovery protocol. The sender of a packet is considered verified if it has sent a valid pong response with matching ping hash within the last 12 hours.

Recursive Lookup

A 'lookup' locates the k closest nodes to a node ID.

The lookup initiator starts by picking α closest nodes to the target it knows of. The initiator then sends concurrent FindNode packets to those nodes. α is a system-wide concurrency parameter, such as 3. In the recursive step, the initiator resends FindNode to nodes it has learned about from previous queries. Of the k nodes the initiator has heard of closest to the target, it picks α that it has not yet queried and resends FindNode to them. Nodes that fail to respond quickly are removed from consideration until and unless they do respond.

If a round of FindNode queries fails to return a node any closer than the closest already seen, the initiator resends the find node to all of the k closest nodes it has not already queried. The lookup terminates when the initiator has queried and gotten responses from the k closest nodes it has seen.
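
The lookup procedure can be sketched as a loop over query rounds. This is a simplified, sequential illustration under several assumptions: node IDs are already hashed and represented as integers, `find_node(node, target)` is a hypothetical callback that returns a list of node IDs or `None` when the node does not answer, and each round issues its queries one after another instead of concurrently.

```python
K = 16      # bucket size / result size
ALPHA = 3   # concurrency parameter


def lookup(target, seeds, find_node, k=K, alpha=ALPHA):
    """Return up to k nodes closest to `target` among those discovered."""

    def dist(node):
        return node ^ target

    seen, queried, dead = set(seeds), set(), set()
    widen = False  # set when a round brought no closer node
    while True:
        closest = sorted(seen - dead, key=dist)[:k]
        pending = [n for n in closest if n not in queried]
        if not pending:
            return closest  # all of the k closest seen nodes have answered
        best = dist(closest[0])
        # Normally query alpha nodes; after a round without progress,
        # query every unqueried node among the k closest.
        batch = pending if widen else pending[:alpha]
        for node in batch:
            queried.add(node)
            reply = find_node(node, target)
            if reply is None:
                dead.add(node)  # unresponsive: removed from consideration
            else:
                seen.update(reply)
        improved = min(map(dist, seen - dead), default=best) < best
        widen = not improved
```

Every round queries at least one node that has not been queried before, so the loop terminates once the k closest live nodes it knows of have all responded.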

Wire Protocol

Node discovery messages are sent as UDP datagrams. The maximum size of any packet is 1280 bytes.

packet = packet-header || packet-data

Every packet starts with a header:

packet-header = hash || signature || packet-type
hash = keccak256(signature || packet-type || packet-data)
signature = sign(packet-type || packet-data)

The hash exists to make the packet format recognizable when running multiple protocols on the same UDP port. It serves no other purpose.

Every packet is signed by the node's identity key. The signature is encoded as a byte array of length 65 as the concatenation of the signature values r, s and the 'recovery id' v.

The packet-type is a single byte defining the type of message. Valid packet types are listed below. Data after the header is specific to the packet type and is encoded as an RLP list. As per EIP-8, implementations should ignore any additional elements in the list as well as any extra data after the list.
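
The framing described above can be sketched as a pair of functions. Keccak-256 and secp256k1 signing are not part of the Python standard library, so this sketch takes them as parameters: `keccak256` and `sign` are assumed callables supplied by the caller (for instance from a third-party library), and `packet_data` is assumed to be already RLP-encoded. Recovering and checking the signer's public key is left out.

```python
MAX_PACKET_SIZE = 1280
HASH_SIZE, SIG_SIZE = 32, 65
HEADER_SIZE = HASH_SIZE + SIG_SIZE + 1  # hash || signature || packet-type


def encode_packet(packet_type, packet_data, sign, keccak256):
    """Build hash || signature || packet-type || packet-data."""
    body = bytes([packet_type]) + packet_data
    signature = sign(body)  # signs packet-type || packet-data
    if len(signature) != SIG_SIZE:
        raise ValueError("signature must be 65 bytes (r || s || v)")
    packet = keccak256(signature + body) + signature + body
    if len(packet) > MAX_PACKET_SIZE:
        raise ValueError("packet exceeds 1280 bytes")
    return packet


def decode_packet(datagram, keccak256):
    """Split a datagram into (hash, signature, packet_type, packet_data)."""
    if not HEADER_SIZE <= len(datagram) <= MAX_PACKET_SIZE:
        raise ValueError("bad packet size")
    packet_hash = datagram[:HASH_SIZE]
    signature = datagram[HASH_SIZE:HASH_SIZE + SIG_SIZE]
    # The hash covers everything that follows it.
    if keccak256(datagram[HASH_SIZE:]) != packet_hash:
        raise ValueError("hash mismatch")
    packet_type = datagram[HEADER_SIZE - 1]
    return packet_hash, signature, packet_type, datagram[HEADER_SIZE:]
```

Checking the hash first is cheap and lets a receiver discard datagrams that belong to some other protocol sharing the same UDP port before doing any signature work.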

Ping Packet (0x01)

packet-data = [version, from, to, expiration]
version = 4
from = [sender-ip, sender-udp-port, sender-tcp-port]
to = [recipient-ip, recipient-udp-port, 0]

The expiration field is an absolute UNIX time stamp. Packets containing a time stamp that lies in the past are expired and may not be processed.

When a ping packet is received, the recipient should reply with a pong packet. It may also consider the sender for addition into the node table.

If no communication with the sender has occurred within the last 12h, a ping should be sent in addition to pong in order to receive an endpoint proof.
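
As an illustration of the ping layout and the expiration rule, the sketch below builds the packet-data list (before RLP encoding) and checks a time stamp. The 20-second expiration window and the representation of IP addresses as raw 4- or 16-byte strings are assumptions made for this example; the helper names are hypothetical.

```python
import ipaddress
import time

EXPIRATION_WINDOW = 20  # seconds; an assumed value, not taken from the text


def endpoint(ip, udp_port, tcp_port):
    """[ip, udp-port, tcp-port], with the IP as raw bytes (assumption)."""
    return [ipaddress.ip_address(ip).packed, udp_port, tcp_port]


def make_ping_data(sender, recipient, now=None):
    """packet-data = [version, from, to, expiration], before RLP encoding.

    sender: (ip, udp_port, tcp_port); recipient: (ip, udp_port).
    """
    now = time.time() if now is None else now
    sender_ip, sender_udp, sender_tcp = sender
    recipient_ip, recipient_udp = recipient
    return [
        4,  # protocol version
        endpoint(sender_ip, sender_udp, sender_tcp),
        endpoint(recipient_ip, recipient_udp, 0),
        int(now) + EXPIRATION_WINDOW,
    ]


def is_expired(expiration, now=None):
    """True if the absolute UNIX time stamp lies in the past."""
    now = time.time() if now is None else now
    return expiration < now
```

A receiver would call `is_expired` on the last field of every incoming packet and drop the packet when it returns True, which limits how long a captured packet can be replayed.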

Pong Packet (0x02)

packet-data = [to, ping-hash, expiration]

Pong is the reply to ping.

ping-hash should be equal to hash of the corresponding ping packet. Implementations should ignore unsolicited pong packets that do not contain the hash of the most recent ping packet.
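
Pong matching and the 12-hour endpoint proof from the Endpoint Proof section fit together naturally, so the sketch below tracks both. `EndpointProofs` and its method names are hypothetical, and time is passed in explicitly as seconds to keep the example deterministic.

```python
PROOF_LIFETIME = 12 * 60 * 60  # 12 hours, in seconds


class EndpointProofs:
    """Tracks outstanding pings and the time of the last valid pong per node."""

    def __init__(self):
        self.last_ping_hash = {}   # node id -> hash of the most recent ping sent
        self.last_valid_pong = {}  # node id -> time a matching pong arrived

    def ping_sent(self, node_id, ping_hash):
        self.last_ping_hash[node_id] = ping_hash

    def pong_received(self, node_id, ping_hash, now):
        """Return True if the pong answers our most recent ping to node_id."""
        expected = self.last_ping_hash.get(node_id)
        if expected is None or expected != ping_hash:
            return False  # unsolicited or stale pong: ignore it
        del self.last_ping_hash[node_id]
        self.last_valid_pong[node_id] = now
        return True

    def is_verified(self, node_id, now):
        """True if a valid pong from node_id was seen within the last 12 hours."""
        seen = self.last_valid_pong.get(node_id)
        return seen is not None and now - seen <= PROOF_LIFETIME
```

A FindNode handler would consult `is_verified` before sending any Neighbors reply, so that a spoofed source address cannot be used to direct large replies at a victim.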

FindNode Packet (0x03)

packet-data = [target, expiration]

A FindNode packet requests information about nodes close to target. The target is a 64-byte secp256k1 public key. When FindNode is received, the recipient should reply with Neighbors packets containing the closest 16 nodes to target found in its local table.

To guard against traffic amplification attacks, Neighbors replies should only be sent if the sender of FindNode has been verified by the endpoint proof procedure.

Neighbors Packet (0x04)

packet-data = [nodes, expiration]
nodes = [[ip, udp-port, tcp-port, node-id], ... ]

Neighbors is the reply to FindNode.
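
Because a packet may not exceed 1280 bytes, 16 node records may not fit into a single Neighbors packet, which is consistent with the FindNode section speaking of Neighbors packets in the plural. A minimal sketch of splitting a reply follows; the default limit of 12 records per packet is an assumption chosen for illustration, and a real implementation would derive it from the encoded record size.

```python
def split_neighbors(nodes, expiration, max_per_packet=12):
    """Yield packet-data lists [nodes, expiration] with bounded node counts.

    nodes: list of [ip, udp-port, tcp-port, node-id] records.
    max_per_packet: assumed upper bound on records per packet.
    """
    if max_per_packet < 1:
        raise ValueError("max_per_packet must be at least 1")
    for start in range(0, len(nodes), max_per_packet):
        yield [nodes[start:start + max_per_packet], expiration]
```

With 16 records and the default limit this yields two packet-data lists, holding 12 and 4 records, each of which is then RLP-encoded and framed on its own.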

Posted 2019-01-25 19:14 by q兽兽