Ctorrent: Some Thoughts on NAT

aobai219

Article link: https://blog.csdn.net/aobai219/article/details/4297821

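I have been busy lately and will tidy this up when I have time; it would be good to tie my own observations more closely to the source code.

In short, I think TCP hole punching for P2P is very hard to achieve in practice, and getting it done simply as a side effect of the handshake is almost impossible. There are alternatives such as TCP relay, however. As for UDP hole punching, I have only read one source-code analysis online and understand the basics; unfortunately I have never read the source of a P2P program implemented over UDP.

NAT traversal over TCP does differ in places from NAT traversal over UDP.

The figures were not uploaded; it was too much trouble.

The appendix collects what I consider good reference material on NAT. Thanks to the original authors.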

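For a long time I have wondered how NAT traversal is implemented in ctorrent. After reading several online analyses of NAT traversal, I have a basic grasp of the principle. So how does ctorrent actually do it?

In the official BT protocol, a peer talks to the tracker over HTTP, but the protocol says nothing about NAT traversal. (The usual assumption is that traversal requires the tracker to send a hole-punching message to a peer, possibly over UDP.) Clearly that is not how ctorrent does it.

The following describes how a client obtains from the tracker the IP addresses of the peers it should contact, and how it adds those addresses to the peerlist. The aim is to show how this achieves what we think of as NAT traversal.

We go straight to the source code.

Client: the client itself.

Step 1: the client sends a request to the tracker. The tracker's reply contains the IP addresses of other peers, and the client adds those addresses to a queue.

In the function int btTracker::_UpdatePeerList(char *buf,size_t bufsiz):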

    pos = decode_query(buf, bufsiz, "peers", (const char**)0, (size_t *)0, (int64_t*)0, QUERY_POS);
    if( !pos ){ return -1; }
    if(4 > bufsiz - pos){ return -1; } // peers list too small

    buf += (pos + 1);
    bufsiz -= (pos + 1);
    ps = buf-1;

    if( *ps != 'l' ){ // binary peers section if not 'l'
      addr.sin_family = AF_INET;
      i = 0;
      while( *ps != ':' ) i = i * 10 + (*ps++ - '0'); // length prefix of the bencoded string
      i /= 6; // 6 bytes per peer: 4 for the IP, 2 for the port
      ps++;
      while( i-- > 0 ){
        memcpy(&addr.sin_addr,ps,sizeof(struct in_addr));
        memcpy(&addr.sin_port,ps+sizeof(struct in_addr),sizeof(unsigned short));
        if( !Self.IpEquiv(addr) ){
          cnt++;
          IPQUEUE.Add(&addr);
        }
        ps += 6;
      }
    }

Step 2: in the function int PeerList::IntervalCheck(fd_set *rfdp, fd_set *wfdp). In the source below, pay attention to the call to NewPeer; the rest is self-explanatory.

    // No pause check here--stay ready by continuing to acquire peers.
    if( !Tracker.IsQuitting() ){
      struct sockaddr_in addr;
      //#define NEED_MORE_PEERS() (m_peers_count < cfg_max_peers)
      for( ; NEED_MORE_PEERS() && !IPQUEUE.IsEmpty(); ){
        if(IPQUEUE.Pop(&addr) < 0) break;
        if(NewPeer(addr,INVALID_SOCKET) == -4) break;
      }
    }

Step 3: NewPeer(addr,INVALID_SOCKET) adds a peer to the peerlist. A peer is added in two situations. The first is the one shown in step 2. The second is when an incoming connection arrives on the client's listening port: we call Accept, which yields a socket, and add it to the peerlist. The two cases are distinguished by the last argument of NewPeer.

    if( INVALID_SOCKET == sk ){
      if( INVALID_SOCKET == (sk = socket(AF_INET,SOCK_STREAM,0)) ) return -1;

      if( setfd_nonblock(sk) < 0) goto err;

      if( -1 == (r = connect_nonb(sk,(struct sockaddr*)&addr)) ){
        if(arg_verbose) CONSOLE.Debug("Connect to peer at %s:%hu failed: %s",
          inet_ntoa(addr.sin_addr), ntohs(addr.sin_port), strerror(errno));
        return -1;
      }

      peer = new btPeer;
    #ifndef WINDOWS
      if( !peer ) goto err;
    #endif

      peer->SetConnect();
      peer->SetAddress(addr);
      peer->stream.SetSocket(sk);
      peer->SetStatus( (-2 == r) ? P_CONNECTING : P_HANDSHAKE );
      if(arg_verbose) CONSOLE.Debug("Connecting to %s:%hu (peer %p)",
        inet_ntoa(addr.sin_addr), ntohs(addr.sin_port), peer);
    }

Here the client connects to the peer through connect_nonb. If the connection fails, the function returns -1 and we move on to other peers. If it succeeds, or is still in progress, a new peer structure is created and filled in. Whatever the outcome of connect_nonb, the client has sent a message (the TCP connection attempt) towards the other peer. If the client is behind a NAT, this happens to punch a hole towards that peer. That is what I believe to be the NAT traversal implementation in ctorrent. Its biggest difference from the NAT traversal schemes described online is that the tracker does not send the remote peer a message telling it whom to punch a hole for. Instead, the hole is opened as a side effect of trying to connect to the peers the tracker returned.
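The excerpt tests connect_nonb for -1 and -2 but does not show the function itself. The following is only a sketch of a helper with those return codes, assuming POSIX sockets; the name connect_nonb_sketch and its exact signature are hypothetical and not taken from the ctorrent source.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper, modelled on the return codes the excerpt checks:
//    0 -> connected immediately
//   -2 -> socket is non-blocking and the TCP handshake has been started
//   -1 -> hard failure, errno is left as set by connect()
int connect_nonb_sketch(int sk, const struct sockaddr *addr, socklen_t len) {
    int r = connect(sk, addr, len);
    if (r < 0 && errno == EINPROGRESS) {
        return -2;  // completion is reported later, e.g. via select() on the write set
    }
    return r;  // 0 on success, -1 on any other error
}
```

With a helper like this, the -2 case corresponds to P_CONNECTING in the excerpt: the connection attempt is already on its way out through any NAT in front of the client, even though the handshake has not finished.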

Appended below are the tracker/client interaction protocol from the BT specification and the principles of NAT traversal.

Tracker HTTP/HTTPS Protocol

The tracker is an HTTP/HTTPS service which responds to HTTP GET requests. The requests include metrics from clients that help the tracker keep overall statistics about the torrent. The response includes a peer list that helps the client participate in the torrent. The base URL consists of the "announce URL" as defined in the metadata (.torrent) file. The parameters are then added to this URL, using standard CGI methods (i.e. a '?' after the announce URL, followed by 'param=value' sequences separated by '&').

Note that all binary data in the URL (particularly info_hash and peer_id) must be properly escaped. This means any byte not in the set 0-9, a-z, A-Z, '.', '-', '_' and '~', must be encoded using the "%nn" format, where nn is the hexadecimal value of the byte. (See RFC1738 for details.)

For a 20-byte hash of \x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a,
the correctly encoded form is %124Vx%9A%BC%DE%F1%23Eg%89%AB%CD%EF%124Vx%9A
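The escaping rule above can be sketched in a few lines. This is a minimal illustration, not code from ctorrent; the function name url_encode_bytes is made up for the example.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Percent-encode arbitrary bytes. Only 0-9, a-z, A-Z, '.', '-', '_' and '~'
// pass through unchanged; every other byte becomes "%nn" in uppercase hex.
std::string url_encode_bytes(const std::string &raw) {
    std::string out;
    for (unsigned char c : raw) {
        bool unreserved = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
                          (c >= 'A' && c <= 'Z') || c == '.' || c == '-' ||
                          c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            char buf[4];  // '%', two hex digits, terminating NUL
            std::snprintf(buf, sizeof buf, "%%%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}
```

Applied to the 20-byte hash in the example, this yields the encoded form shown above: bytes such as \x34 ('4') and \x56 ('V') are printable and unreserved, so they appear literally, while \x23 ('#') is escaped as %23.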

Tracker Request Parameters

The parameters used in the client->tracker GET request are as follows:

info_hash : urlencoded 20-byte SHA1 hash of the value of the info key from the Metainfo file. Note that the value will be a bencoded dictionary, given the definition of the info key above.
peer_id : urlencoded 20-byte string used as a unique ID for the client, generated by the client at startup. This is allowed to be any value, and may be binary data. There are currently no guidelines for generating this peer ID. However, one may rightly presume that it must at least be unique for your local machine, thus should probably incorporate things like process ID and perhaps a timestamp recorded at startup. See peer_id below for common client encodings of this field.
port : The port number that the client is listening on. Ports reserved for BitTorrent are typically 6881-6889. A client may choose to give up if it cannot establish a port within this range.
uploaded : The total amount uploaded (since the client sent the 'started' event to the tracker) in base ten ASCII. While not explicitly stated in the official specification, the consensus is that this should be the total number of bytes uploaded.
downloaded : The total amount downloaded (since the client sent the 'started' event to the tracker) in base ten ASCII. While not explicitly stated in the official specification, the consensus is that this should be the total number of bytes downloaded.
left : The number of bytes this client still has to download, encoded in base ten ASCII.
compact : Setting this to 1 indicates that the client accepts a compact response. The peers list is replaced by a peers string with 6 bytes per peer. The first four bytes are the host (in network byte order), the last two bytes are the port (again in network byte order). It should be noted that some trackers only support compact responses (for saving bandwidth) and either refuse requests without "compact=1" or simply send a compact response unless the request contains "compact=0" (in which case they will refuse the request.)
no_peer_id : Indicates that the tracker can omit peer id field in peers dictionary. This option is ignored if compact is enabled.
event : If specified, must be one of started, completed, stopped (or empty, which is the same as not being specified). If not specified, then this request is one performed at regular intervals.
- started : The first request to the tracker must include the event key with this value.
- stopped : Must be sent to the tracker if the client is shutting down gracefully.
- completed : Must be sent to the tracker when the download completes. However, must not be sent if the download was already 100% complete when the client started. Presumably, this is to allow the tracker to increment the "completed downloads" metric based solely on this event.
ip : Optional. The true IP address of the client machine, in dotted quad format or rfc3513 defined hexed IPv6 address. Notes: In general this parameter is not necessary as the address of the client can be determined from the IP address from which the HTTP request came. The parameter is only needed in the case where the IP address that the request came in on is not the IP address of the client. This happens if the client is communicating to the tracker through a proxy (or a transparent web proxy/cache.) It also is necessary when both the client and the tracker are on the same local side of a NAT gateway. The reason for this is that otherwise the tracker would give out the internal (RFC1918) address of the client, which is not routable. Therefore the client must explicitly state its (external, routable) IP address to be given out to external peers. Various trackers treat this parameter differently. Some honor it only if the IP address that the request came in on is in RFC1918 space. Others honor it unconditionally, while others ignore it completely. In case of IPv6 address (e.g.: 2001:db8:1:2::100) it indicates only that client can communicate via IPv6.
numwant : Optional. Number of peers that the client would like to receive from the tracker. This value is permitted to be zero. If omitted, typically defaults to 50 peers.
key : Optional. An additional identification that is not shared with any users. It is intended to allow a client to prove their identity should their IP address change.
trackerid : Optional. If a previous announce contained a tracker id, it should be set here.
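To make the parameter list concrete, here is a rough sketch of how a client might assemble the GET URL. It is an illustration only: AnnounceParams, escape_bytes and build_announce_url are hypothetical names, only a subset of the parameters is included, and the announce URL is assumed to carry no query string of its own.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical parameter bundle; the field names mirror the request keys listed above.
struct AnnounceParams {
    std::string info_hash;  // 20 raw bytes
    std::string peer_id;    // 20 raw bytes
    uint16_t port;          // listening port
    uint64_t uploaded;
    uint64_t downloaded;
    uint64_t left;
    std::string event;      // "started", "completed", "stopped", or empty
};

// Percent-encode every byte outside 0-9, a-z, A-Z, '.', '-', '_', '~'.
static std::string escape_bytes(const std::string &raw) {
    std::string out;
    for (unsigned char c : raw) {
        bool keep = (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
                    (c >= 'A' && c <= 'Z') || c == '.' || c == '-' ||
                    c == '_' || c == '~';
        if (keep) {
            out += static_cast<char>(c);
        } else {
            char buf[4];
            std::snprintf(buf, sizeof buf, "%%%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Assumes `announce` has no query string yet, so '?' starts the parameters.
std::string build_announce_url(const std::string &announce, const AnnounceParams &p) {
    std::string url = announce;
    url += "?info_hash=" + escape_bytes(p.info_hash);
    url += "&peer_id=" + escape_bytes(p.peer_id);
    url += "&port=" + std::to_string(p.port);
    url += "&uploaded=" + std::to_string(p.uploaded);
    url += "&downloaded=" + std::to_string(p.downloaded);
    url += "&left=" + std::to_string(p.left);
    url += "&compact=1";  // ask for the 6-bytes-per-peer response
    if (!p.event.empty()) url += "&event=" + p.event;  // omitted on regular announces
    return url;
}
```

Note that nothing in the request concerns NAT traversal: the only addressing information is the listening port and, optionally, the ip parameter.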

Tracker Response

The tracker responds with a "text/plain" document consisting of a bencoded dictionary with the following keys:

failure reason : If present, then no other keys may be present. The value is a human-readable error message as to why the request failed (string).
warning message : (new, optional) Similar to failure reason, but the response still gets processed normally. The warning message is shown just like an error.
interval : Interval in seconds that the client should wait between sending regular requests to the tracker
min interval : (optional) Minimum announce interval. If present clients must not reannounce more frequently than this.
tracker id : A string that the client should send back on its next announcements. If absent and a previous announce sent a tracker id, do not discard the old value; keep using it.
complete : number of peers with the entire file, i.e. seeders (integer)
incomplete : number of non-seeder peers, aka "leechers" (integer)
peers : (dictionary model) The value is a list of dictionaries, each with the following keys:
- peer id : peer's self-selected ID, as described above for the tracker request (string)
- ip : peer's IP address either IPv6 (hexed) or IPv4 (dotted quad) or DNS name (string)
- port : peer's port number (integer)
peers : (binary model) Instead of using the dictionary model described above, the peers value may be a string consisting of multiples of 6 bytes. First 4 bytes are the IP address and last 2 bytes are the port number. All in network (big endian) notation.
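As a sketch of the binary model, the following decodes the payload of a compact peers string into readable addresses. The names PeerAddr and parse_compact_peers are invented for this example, and the input is assumed to be the string's bytes with the bencode length prefix already stripped.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct PeerAddr {
    std::string ip;  // dotted quad
    uint16_t port;   // host byte order
};

// Decode 6-byte records: 4 bytes of IPv4 address, then 2 bytes of port,
// both big endian. Trailing bytes that do not form a full record are ignored.
std::vector<PeerAddr> parse_compact_peers(const std::string &blob) {
    std::vector<PeerAddr> peers;
    for (std::size_t off = 0; off + 6 <= blob.size(); off += 6) {
        const unsigned char *p =
            reinterpret_cast<const unsigned char *>(blob.data()) + off;
        PeerAddr a;
        a.ip = std::to_string(p[0]) + "." + std::to_string(p[1]) + "." +
               std::to_string(p[2]) + "." + std::to_string(p[3]);
        a.port = static_cast<uint16_t>((p[4] << 8) | p[5]);
        peers.push_back(a);
    }
    return peers;
}
```

The ctorrent excerpt handles the same layout differently: it copies the bytes straight into a sockaddr_in with memcpy, which works because that structure also stores the address and port in network byte order.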

As mentioned above, the list of peers is length 50 by default. If there are fewer peers in the torrent, then the list will be smaller. Otherwise, the tracker randomly selects peers to include in the response. The tracker may choose to implement a more intelligent mechanism for peer selection when responding to a request. For instance, reporting seeds to other seeders could be avoided.

Clients may send a request to the tracker more often than the specified interval, if an event occurs (i.e. stopped or completed) or if the client needs to learn about more peers. However, it is considered bad practice to "hammer" on a tracker to get multiple peers. If a client wants a large peer list in the response, then it should specify the numwant parameter.

Implementer's Note: Even 30 peers is plenty. The official client version 3 in fact only actively forms new connections if it has fewer than 30 peers and will refuse connections if it has 55. This value is important to performance. When a new piece has completed download, HAVE messages (see below) will need to be sent to most active peers. As a result the cost of broadcast traffic grows in direct proportion to the number of peers. Above 25, new peers are highly unlikely to increase download speed. UI designers are strongly advised to make this obscure and hard to change as it is very rare to be useful to do so.

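A summary of NAT traversal:

First, some conventions:

Private network A contains two users: A1 (192.168.0.8) and A2 (192.168.0.9).

Gateway X1 (a NAT device) has the public IP 1.2.3.4.

Private network B contains two users: B1 (192.168.1.8) and B2 (192.168.1.9).

Gateway Y1 (a NAT device) has the public IP 1.2.3.5.

Public servers: C (6.7.8.9) and D (6.7.8.10).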

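NAT comes in two broad classes:

- NAT (Network Address Translator): known as basic NAT

At the client

192.168.0.8:4000 <-> 6.7.8.9:8000

At the gateway

1.2.3.4:4000 <-> 6.7.8.9:8000

Server C

6.7.8.9:8000

Its essence is that it rewrites the IP address but not the port. This raises a question: once 192.168.0.8 is using port 4000, what happens when 192.168.0.9 wants to use it too?

See RFC 1631 for details.

NAT devices of this type are rare nowadays; we may never actually encounter one.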

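- NAPT (Network Address/Port Translator)

This is what we usually mean when we say NAT.

The defining feature of NAPT is that, at the gateway, the gateway's IP is used and the port is replaced by a temporary port tied to the session.

As illustrated below:

At the client

192.168.0.8:4000 <-> 6.7.8.9:8000

At the gateway

1.2.3.4:62000 <-> 6.7.8.9:8000

Server C

6.7.8.9:8000

The gateway creates and maintains a session on 1.2.3.4:62000 that carries the traffic between 192.168.0.8:4000 and 6.7.8.9:8000.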

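NAPT itself falls into two major types.

They differ in how the gateway maps one internal endpoint that talks to two different external servers, here 6.7.8.9:8000 and 6.7.8.10:8000:

1. Symmetric NAT

At the client

192.168.0.8:4000 <-> 6.7.8.9:8000
192.168.0.8:4000 <-> 6.7.8.10:8000

At the gateway: two different sessions, with different port numbers

1.2.3.4:62000 <-> 6.7.8.9:8000
1.2.3.4:62001 <-> 6.7.8.10:8000

Server C

6.7.8.9:8000

Server D

6.7.8.10:8000

This behavior breaks many P2P programs.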

2. Cone NAT

At the client

192.168.0.8:4000 <-> 6.7.8.9:8000
192.168.0.8:4000 <-> 6.7.8.10:8000

At the gateway: two different sessions, with the same port number

1.2.3.4:62000 <-> 6.7.8.9:8000
1.2.3.4:62000 <-> 6.7.8.10:8000

Server C

6.7.8.9:8000

Server D

6.7.8.10:8000
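The difference between the two mapping behaviors can be modelled with a small table. This is a toy sketch, not how any real gateway is implemented: the class NaptMapper is hypothetical, and handing out public ports sequentially from 62000 is an assumption made only to reproduce the numbers in the examples.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

using Endpoint = std::pair<std::string, uint16_t>;  // (ip, port)

class NaptMapper {
    using Key = std::pair<Endpoint, Endpoint>;  // (inside, outside or blank)

public:
    NaptMapper(bool symmetric, uint16_t first_port)
        : symmetric_(symmetric), next_port_(first_port) {}

    // Returns the public port the gateway uses for traffic from `inside` to `outside`.
    uint16_t map(const Endpoint &inside, const Endpoint &outside) {
        // A cone NAT keys the mapping on the internal endpoint alone;
        // a symmetric NAT also keys it on the destination.
        Key key{inside, Endpoint{}};
        if (symmetric_) key.second = outside;
        auto it = table_.find(key);
        if (it != table_.end()) return it->second;  // existing session
        uint16_t port = next_port_++;               // assumed allocation policy
        table_[key] = port;
        return port;
    }

private:
    bool symmetric_;
    uint16_t next_port_;
    std::map<Key, uint16_t> table_;
};
```

With 192.168.0.8:4000 contacting servers C and D in turn, a cone mapper returns 62000 both times, while a symmetric mapper returns 62000 and then 62001, matching the two examples.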

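The vast majority of NAT devices today are of this kind. Cone NAT is further divided into three types:

a) Full Cone NAT: all requests sent to the public network from the same private address and port 192.168.0.8:4000 are mapped to the same public address and port 1.2.3.4:62000, and 192.168.0.8 can receive datagrams sent to 1.2.3.4:62000 by any external host.

b) Address Restricted Cone NAT: all requests sent to the public network from the same private address and port 192.168.0.8:4000 are mapped to the same public address and port 1.2.3.4:62000, but 192.168.0.8 can receive datagrams that 6.7.8.9 sends to 1.2.3.4:62000 only after the internal host 192.168.0.8 has first sent a datagram to server C at 6.7.8.9.

c) Port Restricted Cone NAT: all requests sent to the public network from the same private address and port 192.168.0.8:4000 are mapped to the same public address and port 1.2.3.4:62000, but 192.168.0.8 can receive datagrams that 6.7.8.9:8000 sends to 1.2.3.4:62000 only after the internal host 192.168.0.8 has first sent a datagram to the external address and port 6.7.8.9:8000.

Note the differences between these descriptions!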
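The three filtering rules differ only in what the gateway compares against the list of destinations the internal host has already contacted. A minimal sketch follows, assuming the public mapping already exists; ConeType and accepts_inbound are names made up for the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <utility>

using Endpoint = std::pair<std::string, uint16_t>;  // (ip, port)

enum class ConeType { Full, AddressRestricted, PortRestricted };

// `contacted` holds every external endpoint the internal host has already
// sent a datagram to through this mapping; `from` is the sender of an
// inbound datagram addressed to the mapped public endpoint.
bool accepts_inbound(ConeType type, const std::set<Endpoint> &contacted,
                     const Endpoint &from) {
    switch (type) {
    case ConeType::Full:
        return true;  // anyone may send to the mapped port
    case ConeType::AddressRestricted:
        for (const Endpoint &e : contacted)
            if (e.first == from.first) return true;  // IP must match, port is free
        return false;
    case ConeType::PortRestricted:
        return contacted.count(from) > 0;  // IP and port must both match
    }
    return false;
}
```

After 192.168.0.8 has sent one datagram to 6.7.8.9:8000, a reply from 6.7.8.9:9000 passes an address restricted gateway but not a port restricted one, and a datagram from 6.7.8.10 passes only a full cone gateway.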

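Implementing NAT traversal:

A1, at the client

192.168.0.8:4000 <-> 6.7.8.9:8000

X1, at the gateway

1.2.3.4:62000 <-> 6.7.8.9:8000

Server C

6.7.8.9:8000

B1, at the client

192.168.1.8:4000 <-> 6.7.8.9:8000

Y1, at the gateway

1.2.3.5:31000 <-> 6.7.8.9:8000

For two users on private networks to call each other directly through their respective gateways, the following has to happen:

1. Clients A1 and B1 each reach server C through their own gateway without any problem (much like logging in).

2. Server C records the gateway-side addresses of A1 and B1 (1.2.3.4:62000 and 1.2.3.5:31000), again without any problem, and can pass this information on to A1 and B1.

3. If A1 now sends to B1's gateway address 1.2.3.5:31000, will B1 receive it? Basically no (unless Y1 is configured as a full cone NAT, which is very rare): Y1 finds that none of its live sessions has a destination IP or port related to 1.2.3.4:62000, so it drops every such packet.

4. For A1 and B1 to reach each other through X1 and Y1, server C has to tell each of them to set up a "UDP tunnel" on its own gateway. That is, it instructs A1 to send a UDP datagram 192.168.0.8:4000 -> 1.2.3.5:31000 and B1 to send a UDP datagram 192.168.1.8:4000 -> 1.2.3.4:62000. X1 and Y1 then each hold two different sessions that share the same public IP and port. (Obviously this requires the gateways to be Cone NATs. A Symmetric NAT gateway would open a different port for each session, and that port would be unknown to both the server and the other side, so the exercise would be pointless.)

5. From this point on, datagrams from A1 to Y1 or from B1 to X1 are no longer dropped and are delivered correctly to the other side.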
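The five steps can be replayed in a toy simulation. The sketch below models two port restricted cone gateways whose public mappings already exist; ConeNat and send_via are hypothetical names, and real datagrams, timeouts and retransmissions are left out entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <utility>

using Endpoint = std::pair<std::string, uint16_t>;  // (ip, port)

// Minimal port restricted cone NAT with one established public mapping.
struct ConeNat {
    Endpoint public_ep;            // e.g. 1.2.3.4:62000
    std::set<Endpoint> contacted;  // external endpoints the inside host has sent to

    // The inside host sends a datagram to `dst`; the gateway records the destination.
    void send_out(const Endpoint &dst) { contacted.insert(dst); }

    // A datagram from `src` arrives at the public endpoint; it is forwarded
    // only if the inside host has already sent to exactly that endpoint.
    bool deliver_in(const Endpoint &src) const { return contacted.count(src) > 0; }
};

// The host behind `from` sends one datagram to the public endpoint of `to`.
// Returns true if the datagram gets through `to` to the host behind it.
bool send_via(ConeNat &from, const ConeNat &to) {
    from.send_out(to.public_ep);           // opens (or refreshes) the hole on `from`
    return to.deliver_in(from.public_ep);  // filtered by `to`
}
```

Starting with X1 at 1.2.3.4:62000 and Y1 at 1.2.3.5:31000, both having contacted only server C, the first send_via(x1, y1) returns false (step 3) yet leaves a hole on X1. A following send_via(y1, x1) then succeeds, and so does a second send_via(x1, y1) (steps 4 and 5).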

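Taken together, P2P is feasible under the following conditions:

1. An intermediate server stores the address information and can issue the command to set up the UDP tunnel.

2. Both gateways must be Cone NATs. Symmetric NAT is not suitable.

3. Full cone gateways need no UDP tunnel, but they are very rare, and having them on both sides is rarer still.

4. Suppose gateway X1 is a Symmetric NAT and Y1 is an Address Restricted Cone NAT or a Full Cone NAT. After each side sets up its tunnel, A1 can send datagrams through X1 to Y1 and on to B1 (because Y1 filters on IP address at most), but datagrams that B1 sends to X1 are dropped (the port in the incoming datagram does not match the port of the session that exists on X1, even though the IP address matches), so this is equally pointless.

5. If both sides are Symmetric NATs, each opens a new port. The other side can try to guess it without knowing it, and that can work, but the success rate is very low and it adds overhead, so it is not a good solution.

6. Gateway types differ in two respects. From inside to outside, they may replace only the IP, use a different port per session, or use the same port for different sessions. From outside to inside, they may impose no restriction, restrict by IP address, or restrict by IP address and port.

7. None of this considers different users on the same private network accessing the same server at the same time. With an Address Restricted Cone NAT or Full Cone NAT gateway, one user's client might then receive packets meant for another, which is clearly undesirable.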

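Some techniques in common use today:

ALG (Application Layer Gateway): a device or plug-in, used to support the SIP protocol. It works roughly by opening a dedicated channel on the gateway for connections between the private and public networks; in other words, it is a customized gateway. It is mostly useful only within the group of applications it was built for.

UPnP: lets a host have the gateway expose a publicly routable address and port as a channel to it, which sidesteps the port-mapping problem. The device must support UPnP and have it enabled, but for security reasons it is usually switched off. Even when it is enabled, I have not tested how well it works in practice.

STUN (Simple Traversal of UDP Through NATs): this is essentially how server C works in the example above, and it is currently the most widely used approach. A real implementation is far more complex than our description; determining the gateway's NAT type alone takes a lot of work. RFC 3489 describes it in detail.

TURN (Traversal Using Relay NAT): all data is exchanged via a server, so NAT is no longer an obstacle, but server load, packet loss, and latency become serious problems. Many games currently use this approach to sidestep NAT. This approach is not P2P.

ICE (Interactive Connectivity Establishment): a combination of the techniques above, at the obvious cost of added complexity.

In short, the existence of NAT stands for a certain fashion: never aim for simple, always aim for complicated, make sure you are thoroughly confused, and none of it is my responsibility anyway.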

http://blog.csdn.net/bluniu/archive/2007/09/19/1790913.aspx
