[搜片神器] How to Implement a DHT Network Crawler in Code

 

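This continues the first article: "Implementing a BT torrent back-end management program for DHT magnet search in C#, plus database design (open source) [搜片神器]".

Thanks to the friends in the cnblogs community for their support. I have found a VPS for testing, a server hosted abroad: http://www.sosobta.com. Feedback is welcome.

Source code: https://github.com/h31h31/H31DHTMgr

Program download: H31DHT download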

 

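Yesterday's article drew less interest than the first one, so today I will keep things brief for the readers who are following along. Friends in the community asked me to teach people how to fish rather than hand out the fish, so I am not releasing this part of the code for now. I hope more of you will join in.

I also hope someone with the skills will port the C++ code to C# and add it to our 搜片神器 tool.

Yesterday's article introduced how DHT works, so you should have a rough idea of what is going on. If anything is still unclear, the following articles continue the discussion.

The code I borrowed from is the C/C++ DHT code that ships with transmission; you can download it from http://www.transmissionbt.com/

That code targets Linux, however, so you have to port it to Windows yourself.

If you want to implement DHT in C#, you can refer to mono-monotorrent. It contains a lot of framework code, though, and is not as easy to follow as transmission, where the DHT lives in just three files.

In transmission, three files are enough to implement DHT: dht.c, dht.h, and dht-example.c. The interface is simple and easy to reuse.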


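The main functional steps for joining the DHT network

The code in dht.c and dht.h falls into three parts:
1. Inserting into the routing table.
1) If the node is already in the routing table, update it and return.
2) If the bucket is not full, insert the node and return.
3) If a bad (expired) node is found in the bucket, replace it and return.
4) If a dubious node is found, save the new node in the cache and, if that dubious node has not been pinged yet, issue a ping_node operation; return.
5) At this point the bucket is full of good nodes. If our own ID does not fall within this bucket, return.
6) Split the bucket's ID range in half. Go back to step 1).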
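
The six insertion steps above can be sketched as a small self-contained routine. This is only an illustration of the control flow, not the code from dht.c: the names `RoutingTable`, `Bucket`, `Result`, and `covers` are made up for this sketch, a bucket size of 8 is assumed, newly seen nodes are simply marked good, and the ping in step 4) is reduced to remembering the candidate.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using NodeId = std::array<std::uint8_t, 20>;   // 160-bit node ID
enum class Status { Good, Dubious, Bad };
enum class Result { Updated, Inserted, ReplacedBad, PingDubious, Dropped };

struct Node { NodeId id; Status status; };

struct Bucket {                 // hypothetical layout, for illustration only
    NodeId prefix{};            // only the first `depth` bits are significant
    int depth = 0;              // number of leading ID bits fixed by earlier splits
    std::vector<Node> nodes;    // holds at most kBucketSize entries
    std::vector<NodeId> cache;  // candidates waiting for a dubious node to fail
};

constexpr std::size_t kBucketSize = 8;          // assumed bucket capacity

// Bit i of an ID, counting from the most significant bit.
inline int bit(const NodeId& id, int i) { return (id[i / 8] >> (7 - i % 8)) & 1; }

// True when `id` shares the bucket's first `depth` bits.
inline bool covers(const Bucket& b, const NodeId& id) {
    for (int i = 0; i < b.depth; ++i)
        if (bit(b.prefix, i) != bit(id, i)) return false;
    return true;
}

struct RoutingTable {
    NodeId self;
    std::vector<Bucket> buckets;

    explicit RoutingTable(const NodeId& me) : self(me), buckets(1) {}

    Bucket& find(const NodeId& id) {
        for (Bucket& b : buckets)
            if (covers(b, id)) return b;
        assert(false && "buckets always partition the ID space");
        return buckets.front();
    }

    // Step 6: halve the bucket's range; nodes whose next bit is 1 move to `hi`.
    void split(Bucket& b) {
        Bucket hi;
        hi.prefix = b.prefix;
        hi.prefix[b.depth / 8] |= static_cast<std::uint8_t>(0x80 >> (b.depth % 8));
        hi.depth = ++b.depth;
        std::vector<Node> keep;
        for (const Node& n : b.nodes) (covers(hi, n.id) ? hi.nodes : keep).push_back(n);
        b.nodes = keep;
        b.cache.clear();
        buckets.push_back(hi);  // invalidates `b`; callers must look it up again
    }

    Result insert(const NodeId& id) {
        if (id == self) return Result::Dropped;            // never store ourselves
        for (;;) {
            Bucket& b = find(id);
            for (Node& n : b.nodes)                         // 1) already known: refresh
                if (n.id == id) { n.status = Status::Good; return Result::Updated; }
            if (b.nodes.size() < kBucketSize) {             // 2) room left: insert
                b.nodes.push_back({id, Status::Good});
                return Result::Inserted;
            }
            for (Node& n : b.nodes)                         // 3) bad node: replace it
                if (n.status == Status::Bad) { n = {id, Status::Good}; return Result::ReplacedBad; }
            for (const Node& n : b.nodes)                   // 4) dubious node: cache the candidate
                if (n.status == Status::Dubious) { b.cache.push_back(id); return Result::PingDubious; }
            if (!covers(b, self) || b.depth >= 160)         // 5) full of good nodes, not our bucket
                return Result::Dropped;
            split(b);                                       // 6) split, then retry from step 1)
        }
    }
};
```

With an all-zero own ID, eight nodes whose top bit is 1 fill the first bucket; a ninth such node forces one split and is then dropped, because the bucket it lands in no longer contains our own ID.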

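2. Kademlia (KAD) remote procedure calls.
These come in three kinds:
1) ping/pong operations.
All packets use the tid pn\0\0
2) find_node operations.
All packets use the tid fn\0\0
3) get_peers/announce_peer operations.
Within one recursive lookup for a given hash, the tid stays the same.
Only kind 3) implements the recursive lookup described in the BitTorrent DHT specification; 1) and 2) are used solely to maintain the routing table and keep no state.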
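
As a rough illustration of how such a transaction ID can be put together and recognised again in a reply, here is a minimal sketch. The helper names `make_tid_sketch` and `tid_match_sketch` are hypothetical and only modelled on the `make_tid` and `tid_match` calls that appear later in this article; the layout (two prefix letters followed by a 16-bit sequence number) is an assumption drawn from the 4-byte tids listed above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

using Tid = std::array<unsigned char, 4>;

// Build a 4-byte transaction ID: two ASCII letters naming the query type,
// followed by a 16-bit sequence number (0 for ping and find_node).
inline Tid make_tid_sketch(const char* prefix, std::uint16_t seqno) {
    assert(prefix != nullptr);
    Tid tid{};
    tid[0] = static_cast<unsigned char>(prefix[0]);
    tid[1] = static_cast<unsigned char>(prefix[1]);
    std::memcpy(tid.data() + 2, &seqno, sizeof seqno);
    return tid;
}

// Returns true when the reply's tid carries the given prefix. Optionally
// hands back the sequence number so the caller can find the matching search.
inline bool tid_match_sketch(const Tid& tid, const char* prefix,
                             std::uint16_t* seqno_out = nullptr) {
    assert(prefix != nullptr);
    if (tid[0] != static_cast<unsigned char>(prefix[0]) ||
        tid[1] != static_cast<unsigned char>(prefix[1]))
        return false;
    if (seqno_out) std::memcpy(seqno_out, tid.data() + 2, sizeof *seqno_out);
    return true;
}
```

Because ping and find_node keep no state, their sequence number can stay 0; only a get_peers lookup needs the number to map a reply back to the search that sent it.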

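3. Timer handling:
To check whether nodes in the routing table are still valid (per the specification, the routing table should hold only valid nodes), the code records the start time of a KRPC operation, pinged_time, whenever that operation targets a node in the routing table, and uses that start time to decide whether the operation has timed out.

When expire_stuff_time expires, the following are performed:
1. Check the routing table for expired nodes (judged by pinged_time) and delete them.
2. Check whether the stored announce_peer entries are more than 30 minutes old (I do not plan to go into this, so no analysis is given).
3. Check for recursive lookups that have timed out.

The rotate_secrets_time timer.
Rotates the secret used to generate tokens roughly every 15 minutes or so (see the DHT specification).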
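
To make the idea of token rotation concrete, here is a minimal sketch under loud assumptions: the class `TokenKeeper` is hypothetical, the hash is FNV-1a (a simple non-cryptographic stand-in, not what a real DHT node should use), and accepting tokens from both the current and the previous secret is an assumption about how rotation is usually made tolerant of timing.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <string>

// FNV-1a: a tiny NON-cryptographic hash, used here only as a stand-in.
inline std::uint64_t fnv1a(const std::string& data) {
    std::uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : data) { h ^= c; h *= 1099511628211ULL; }
    return h;
}

class TokenKeeper {   // hypothetical helper, not part of dht.c
public:
    TokenKeeper() : rng_(std::random_device{}()) { rotate(); rotate(); }

    // Called when the rotate_secrets_time timer fires.
    void rotate() { old_secret_ = secret_; secret_ = rng_(); }

    // Token handed out in a get_peers reply to the node at `ip`.
    std::uint64_t make_token(const std::string& ip) const {
        return token_for(secret_, ip);
    }

    // An announce_peer is accepted if its token was issued under the
    // current secret or the one just before it.
    bool token_match(std::uint64_t token, const std::string& ip) const {
        return token == token_for(secret_, ip) ||
               token == token_for(old_secret_, ip);
    }

private:
    static std::uint64_t token_for(std::uint64_t secret, const std::string& ip) {
        return fnv1a(std::to_string(secret) + "|" + ip);
    }
    std::mt19937_64 rng_;
    std::uint64_t secret_ = 0;
    std::uint64_t old_secret_ = 0;
};
```

A token therefore survives exactly one rotation: it is still accepted after the first timer tick and rejected after the second.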

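The confirm_nodes_time timer.
Finds buckets that have been inactive for a long time and refreshes them by issuing a find_node KRPC operation.

The search_time timer.
It can happen that none of the outstanding get_peers operations receives a reply; in that case the search_time timer is responsible for resending the requests. (Note: a get_peers search keeps at most 3 KRPC requests pending.)

The ping/pong operations that maintain the routing table:
A ping_node operation is triggered when an insertion finds the bucket full and a dubious node is present. A node that does not respond goes from dubious to bad and is then replaced.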
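
The good, dubious, and bad states can be pictured as a pure function of a few timestamps. The sketch below is an assumption-laden illustration rather than the rule used in dht.c: the struct `NodeTimes`, the function `classify`, and all three thresholds are invented for this example.

```cpp
#include <cassert>
#include <ctime>

enum class NodeState { Good, Dubious, Bad };

struct NodeTimes {             // hypothetical bookkeeping per routing-table node
    std::time_t last_reply;    // when the node last answered one of our queries
    std::time_t pinged_time;   // when the pending query was sent (0 = none pending)
    int unanswered;            // queries sent since the last reply
};

// Assumed thresholds, chosen for illustration only.
constexpr std::time_t kFreshSeconds = 15 * 60;  // a reply this recent keeps a node good
constexpr std::time_t kPingTimeout = 15;        // seconds to wait for a pong
constexpr int kMaxUnanswered = 3;               // pings allowed before giving up

inline NodeState classify(const NodeTimes& n, std::time_t now) {
    const bool timed_out = n.pinged_time != 0 && now - n.pinged_time >= kPingTimeout;
    if (n.unanswered >= kMaxUnanswered && timed_out) return NodeState::Bad;
    if (n.unanswered == 0 && now - n.last_reply < kFreshSeconds) return NodeState::Good;
    return NodeState::Dubious;   // silent for a while, or a ping is still pending
}
```

An insertion that meets a dubious node would send one more ping and bump `unanswered`; once the limit is reached and the last ping has timed out, the node counts as bad and its slot can go to the cached candidate.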

How we join the DHT network

  1. A DHT node has to act as a server so that other nodes can reach it, which means binding a UDP port. The reference code also supports IPv6; in my opinion that can be left out. The Windows code is as follows:
      // UDP port binding
      // Create the UDP socket
      m_soListen = (int)socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP);
      if (m_soListen == INVALID_SOCKET) {
          m_iErrorNo = WSAGetLastError();
          _dout(_T("CH31CarMonitorDlg Start Error(%d): socket failed.\n"), m_iErrorNo);
          return -1;
      }
      // Fill in the local address: any interface, the chosen port
      SOCKADDR_IN addr;
      memset(&addr, 0, sizeof(addr));
      addr.sin_family = AF_INET;
      addr.sin_port = htons(port);
      addr.sin_addr.s_addr = htonl(INADDR_ANY);
      // Bind the port so other nodes can send us datagrams
      if (bind(m_soListen, (SOCKADDR*)&addr, sizeof(addr)) == SOCKET_ERROR) {
          m_iErrorNo = WSAGetLastError();
          _dout(_T("CH31CarMonitorDlg Start Error(%d): bind failed.\n"), m_iErrorNo);
          closesocket(m_soListen);   // do not leak the socket on failure
          return -2;
      }
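  2. The node needs its own 20-byte (160-bit) ID. One way is to take a random value and hash it with SHA1 to produce the 20-byte ID. The Windows code is as follows:
      // Generate the node ID with SHA1
      unsigned char p[20];
      CSHA1 sha1;
      sha1.Reset();
      // Note: if m_myID is a CString in a Unicode build, GetLength() counts
      // characters rather than bytes, so only part of the buffer is hashed.
      sha1.Update((const unsigned char *)m_myID.GetBuffer(), m_myID.GetLength());
      sha1.Final();
      sha1.GetHash(p);   // p now holds the 20-byte node ID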
  3. Resolve the addresses of well-known bootstrap servers so that we can query them for the information we need. The borrowed code is as follows:
      // Bootstrap server addresses
      rc = getaddrinfo("router.utorrent.com", "6881", &hints1, &info);
      //rc = getaddrinfo("router.bittorrent.com", "6881", &hints1, &info);
      //rc = getaddrinfo("dht.transmissionbt.com", "6881", &hints1, &info);
      if(rc != 0) {
          fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
          exit(1);
      }
      // Copy every resolved address into the bootstrap list.
      // Caution: make sure num_bootstrap_nodes stays below the capacity
      // of bootstrap_nodes before copying.
      infop = info;
      while(infop && m_bDataThread)
      {
          memcpy(&bootstrap_nodes[num_bootstrap_nodes], infop->ai_addr, infop->ai_addrlen);
          infop = infop->ai_next;
          num_bootstrap_nodes++;
      }
      freeaddrinfo(info);
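  4. Now the DHT class can be initialized. The original is written in C, so you can wrap it in a C++ class yourself.
      // Initialize the DHT class with the IPv4 socket, the IPv6 socket, and our ID
      rc = m_dht.dht_init(s, s6, m_myid, NULL);
      if(rc < 0) {
          perror("dht_init");
          exit(1);
      }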
  5. Ping the bootstrap servers. Each server answers with a pong, which shows that it is alive.
      // Ping the bootstrap servers
      for(int i = 0; i < num_bootstrap_nodes && m_bDataThread; i++)
      {
          m_dht.dht_ping_node((struct sockaddr*)&bootstrap_nodes[i], sizeof(bootstrap_nodes[i]));
          Sleep(m_dht.random() % 1000);   // random pause of up to one second between pings
      }
  6. Now the search routines can be used to look up the torrent for the info-hash we want. The borrowed code is as follows:
      // dht_search
      if(searching) {
          if(s >= 0)
              dht_search(hash, 0, AF_INET, callback, NULL);
          if(s6 >= 0)
              dht_search(hash, 0, AF_INET6, callback, NULL);
          searching = 0;
      }
  7. You can follow what dht-example.c does next with its search function, but that is not how we do it: we send find_node and get_peers requests directly to the nodes.
      // Send find_node and get_peers directly to a node that answered our ping
      unsigned char tid[16];
      m_dht.make_tid(tid, "fn", 0);
      m_dht.send_find_node(&ipRecvPingList[ipListPOS].fromaddr, sizeof(sockaddr), tid, 4, ipRecvPingList[ipListPOS].ID, 0, 0);
      Sleep(100);
      memset(tid, 0, sizeof(tid));
      m_dht.make_tid(tid, "gp", 0);
      m_dht.send_get_peers(&ipRecvPingList[ipListPOS].fromaddr, sizeof(sockaddr), tid, 4, hashList[0], 0, 0);
  8. After that, we just wait for replies and parse them; the DHT class already does all of that for us.
      // Wait for incoming DHT packets (this runs inside the receive loop)
      FD_ZERO(&readfds);
      if(m_soListen >= 0)
          FD_SET(m_soListen, &readfds);
      if(s6 >= 0)
          FD_SET(s6, &readfds);
      rc = select(m_soListen > s6 ? m_soListen + 1 : s6 + 1, &readfds, NULL, NULL, &tv);
      if(rc < 0 && m_bDataThread)
      {
          // Winsock reports socket errors through WSAGetLastError(), not errno
          int err = WSAGetLastError();
          if(err != WSAEINTR) {
              _dout(_T("select Error(%d).\n"), err);
              Sleep(1000);
          }
      }

      if(!m_bDataThread)
          break;

      if(rc > 0 && m_bDataThread)
      {
          fromlen = sizeof(from1);
          memset(buf, 0, sizeof(buf));
          // Leave one byte free so the packet can be NUL-terminated below
          if(m_soListen >= 0 && FD_ISSET(m_soListen, &readfds))
              rc = recvfrom(m_soListen, buf, sizeof(buf) - 1, 0, &from1, &fromlen);
          else if(s6 >= 0 && FD_ISSET(s6, &readfds))
              rc = recvfrom(s6, buf, sizeof(buf) - 1, 0, &from1, &fromlen);
          else
              abort();
      }

      if(rc > 0 && m_bDataThread)
      {
          // A packet arrived: hand it to the DHT code
          buf[rc] = '\0';
          rc = m_dht.dht_periodic(buf, rc, &from1, fromlen, &tosleep, DHT_callback, this);
      }
      else
      {
          // Nothing received: still let the DHT code run its timers
          rc = m_dht.dht_periodic(NULL, 0, NULL, 0, &tosleep, DHT_callback, this);
      }
  9. The DHT code already contains the logic for parsing incoming messages and for answering other nodes' requests; reading through dht.c shows how it works.
      // dht_periodic: handle one incoming packet (if any), then run the timers
      int CDHT::dht_periodic(const void *buf, size_t buflen, const struct sockaddr *fromAddr, int fromlen,
                             time_t *tosleep, dht_callback *callback, void *closure)
      {
          gettimeofday(&nowTime, NULL);

          if(buflen > 0)
          {
              int message;
              unsigned char tid[16], id[20], info_hash[20], target[20];
              unsigned char nodes[256], nodes6[1024], token[128];
              int tid_len = 16, token_len = 128;
              int nodes_len = 256, nodes6_len = 1024;
              unsigned short port;
              unsigned char values[2048], values6[2048];
              int values_len = 2048, values6_len = 2048;
              int want;
              unsigned short ttid;

              /* Used for log output only; meaningful for AF_INET senders. */
              struct sockaddr_in* tempip = (struct sockaddr_in *)fromAddr;

              if(is_martian(fromAddr))
                  goto dontread;

              if(node_blacklisted(fromAddr, fromlen)) {
                  _dout("Received packet from blacklisted node.\n");
                  goto dontread;
              }

              /* The caller must NUL-terminate the packet. */
              if(((char*)buf)[buflen] != '\0') {
                  _dout("Unterminated message.\n");
                  errno = EINVAL;
                  return -1;
              }

              message = parse_message((unsigned char *)buf, buflen, tid, &tid_len, id, info_hash,
                                      target, &port, token, &token_len, nodes, &nodes_len,
                                      nodes6, &nodes6_len, values, &values_len,
                                      values6, &values6_len, &want);

              if(message < 0 || message == ERROR || id_cmp(id, zeroes) == 0)
              {
                  _dout("Unparseable message: ");
                  debug_printable((const unsigned char *)buf, buflen);
                  _dout("\n");
                  goto dontread;
              }

              if(id_cmp(id, myid) == 0) {
                  _dout("Received message from self.\n");
                  goto dontread;
              }

              if(message > REPLY) {
                  /* Rate limit requests. */
                  if(!token_bucket()) {
                      _dout("Dropping request due to rate limiting.\n");
                      goto dontread;
                  }
              }

              switch(message)
              {
              case REPLY:
                  if(tid_len != 4)
                  {
                      _dout("Broken node truncates transaction ids: ");
                      debug_printable((const unsigned char *)buf, buflen);
                      _dout("\n");
                      /* This is really annoying, as it means that we will
                         time-out all our searches that go through this node.
                         Kill it. */
                      blacklist_node(id, fromAddr, fromlen);
                      goto dontread;
                  }
                  if(tid_match(tid, "pn", NULL))
                  {
                      /* id is 20 raw bytes, not a C string, so it is not printed with %s. */
                      _dout("Pong!From IP:%s:[%d]\n", inet_ntoa(tempip->sin_addr), ntohs(tempip->sin_port));
                      new_node(id, fromAddr, fromlen, 2);
                      (*callback)(closure, DHT_EVENT_PONG_VALUES, id, (void*)fromAddr, fromlen);
                      //send_find_node(from,fromlen,tid,4,id,0,0);
                  }
                  else if(tid_match(tid, "fn", NULL) || tid_match(tid, "gp", NULL))
                  {
                      int gp = 0;
                      struct search *sr = NULL;
                      if(tid_match(tid, "gp", &ttid))
                      {
                          gp = 1;
                          sr = find_search(ttid, fromAddr->sa_family);
                      }
                      _dout("Nodes found (%d+%d)%s!From IP:%s:[%d]\n", nodes_len/26, nodes6_len/38,
                            gp ? " for get_peers" : "", inet_ntoa(tempip->sin_addr), ntohs(tempip->sin_port));
                      if(nodes_len % 26 != 0 || nodes6_len % 38 != 0)
                      {
                          _dout("Unexpected length for node info!\n");
                          blacklist_node(id, fromAddr, fromlen);
                      }
                      /* The original "unknown search" branch is disabled on purpose: we send
                         get_peers directly, so replies without a matching search are expected. */
                      //else if(gp && sr == NULL)
                      //{
                      //    _dout("Unknown search!\n");
                      //    new_node(id, fromAddr, fromlen, 1);
                      //}
                      else
                      {
                          int i;
                          new_node(id, fromAddr, fromlen, 2);
                          /* IPv4 compact node info: 20-byte ID + 4-byte address + 2-byte port. */
                          for(i = 0; i < nodes_len / 26; i++)
                          {
                              unsigned char *ni = nodes + i * 26;
                              struct sockaddr_in sin;
                              if(id_cmp(ni, myid) == 0)
                                  continue;
                              memset(&sin, 0, sizeof(sin));
                              sin.sin_family = AF_INET;
                              memcpy(&sin.sin_addr, ni + 20, 4);
                              memcpy(&sin.sin_port, ni + 24, 2);
                              new_node(ni, (struct sockaddr*)&sin, sizeof(sin), 0);
                              (*callback)(closure, DHT_EVENT_FINDNODE_VALUES, ni, (void*)&sin, sizeof(sin));
                              if(sr && sr->af == AF_INET)
                              {
                                  insert_search_node(ni, (struct sockaddr*)&sin, sizeof(sin), sr, 0, NULL, 0);
                              }
                              //send_get_peers((struct sockaddr*)&sin,sizeof(sockaddr),tid,4,ni,0,0);
                          }
                          /* IPv6 compact node info: 20-byte ID + 16-byte address + 2-byte port. */
                          for(i = 0; i < nodes6_len / 38; i++)
                          {
                              unsigned char *ni = nodes6 + i * 38;
                              struct sockaddr_in6 sinip6;
                              if(id_cmp(ni, myid) == 0)
                                  continue;
                              memset(&sinip6, 0, sizeof(sinip6));
                              sinip6.sin6_family = AF_INET6;
                              memcpy(&sinip6.sin6_addr, ni + 20, 16);
                              memcpy(&sinip6.sin6_port, ni + 36, 2);
                              new_node(ni, (struct sockaddr*)&sinip6, sizeof(sinip6), 0);
                              if(sr && sr->af == AF_INET6)
                              {
                                  insert_search_node(ni, (struct sockaddr*)&sinip6, sizeof(sinip6), sr, 0, NULL, 0);
                              }
                          }
                          if(sr)
                              /* Since we received a reply, the number of requests in flight has decreased.  Let's push another request. */
                              search_send_get_peers(sr, NULL);
                      }
                      /* Values are reported even without a matching search (the original
                         if(sr) guard is disabled), so sr may be NULL here. */
                      {
                          // insert_search_node(id, fromAddr, fromlen, sr,1, token, token_len);
                          if(values_len > 0 || values6_len > 0)
                          {
                              /* Avoid dereferencing a NULL search; the callback must accept a NULL id. */
                              const unsigned char *search_id = sr ? sr->id : NULL;
                              _dout("Got values (%d+%d)!\n", values_len / 6, values6_len / 18);
                              if(callback) {
                                  if(values_len > 0)
                                      (*callback)(closure, DHT_EVENT_VALUES, search_id, (void*)values, values_len);

                                  if(values6_len > 0)
                                      (*callback)(closure, DHT_EVENT_VALUES6, search_id, (void*)values6, values6_len);
                              }
                          }
                      }
                  }
                  else if(tid_match(tid, "ap", &ttid))
                  {
                      struct search *sr;
                      _dout("Got reply to announce_peer.\n");
                      sr = find_search(ttid, fromAddr->sa_family);
                      if(!sr) {
                          _dout("Unknown search!\n");
                          new_node(id, fromAddr, fromlen, 1);
                      }
                      else
                      {
                          int i;
                          new_node(id, fromAddr, fromlen, 2);
                          for(i = 0; i < sr->numnodes; i++)
                          {
                              if(id_cmp(sr->nodes[i].id, id) == 0)
                              {
                                  sr->nodes[i].request_time = 0;
                                  sr->nodes[i].reply_time = nowTime.tv_sec;
                                  sr->nodes[i].acked = 1;
                                  sr->nodes[i].pinged = 0;
                                  break;
                              }
                          }
                          /* See comment for gp above. */
                          search_send_get_peers(sr, NULL);
                      }
                  }
                  else
                  {
                      _dout("Unexpected reply: ");
                      debug_printable((const unsigned char *)buf, buflen);
                      _dout("\n");
                  }
                  break;
              case PING:
                  _dout("Ping (%d)!From IP:%s:%d\n", tid_len, inet_ntoa(tempip->sin_addr), ntohs(tempip->sin_port));
                  new_node(id, fromAddr, fromlen, 1);
                  _dout("Sending pong.\n");
                  send_pong(fromAddr, fromlen, tid, tid_len);
                  break;
              case FIND_NODE:
                  _dout("Find node!From IP:%s:%d\n", inet_ntoa(tempip->sin_addr), ntohs(tempip->sin_port));
                  new_node(id, fromAddr, fromlen, 1);
                  _dout("Sending closest nodes (%d).\n", want);
                  send_closest_nodes(fromAddr, fromlen, tid, tid_len, target, want, 0, NULL, NULL, 0);
                  break;
              case GET_PEERS:
                  _dout("Get_peers!From IP:%s:%d\n", inet_ntoa(tempip->sin_addr), ntohs(tempip->sin_port));
                  new_node(id, fromAddr, fromlen, 1);
                  if(id_cmp(info_hash, zeroes) == 0)
                  {
                      _dout("Eek!  Got get_peers with no info_hash.\n");
                      send_error(fromAddr, fromlen, tid, tid_len, 203, "Get_peers with no info_hash");
                      break;
                  }
                  else
                  {
                      struct storage *st = find_storage(info_hash);
                      unsigned char token[TOKEN_SIZE];
                      make_token(fromAddr, 0, token);
                      if(st && st->numpeers > 0)
                      {
                          _dout("Sending found%s peers.\n", fromAddr->sa_family == AF_INET6 ? " IPv6" : "");
                          send_closest_nodes(fromAddr, fromlen, tid, tid_len, info_hash, want,
                                             fromAddr->sa_family, st, token, TOKEN_SIZE);
                      }
                      else
                      {
                          _dout("Sending nodes for get_peers.\n");
                          send_closest_nodes(fromAddr, fromlen, tid, tid_len, info_hash, want,
                                             0, NULL, token, TOKEN_SIZE);
                      }
                      /* Crawler hook: report the info_hash another node is asking about. */
                      if(callback)
                      {
                          (*callback)(closure, DHT_EVENT_GET_PEER_VALUES, info_hash, (void *)fromAddr, fromlen);
                      }
                  }

                  break;
              case ANNOUNCE_PEER:
                  _dout("Announce peer!From IP:%s:%d\n", inet_ntoa(tempip->sin_addr), ntohs(tempip->sin_port));
                  new_node(id, fromAddr, fromlen, 1);

                  if(id_cmp(info_hash, zeroes) == 0)
                  {
                      _dout("Announce_peer with no info_hash.\n");
                      send_error(fromAddr, fromlen, tid, tid_len, 203, "Announce_peer with no info_hash");
                      break;
                  }
                  if(!token_match(token, token_len, fromAddr)) {
                      _dout("Incorrect token for announce_peer.\n");
                      send_error(fromAddr, fromlen, tid, tid_len, 203, "Announce_peer with wrong token");
                      break;
                  }
                  if(port == 0) {
                      _dout("Announce_peer with forbidden port %d.\n", port);
                      send_error(fromAddr, fromlen, tid, tid_len, 203, "Announce_peer with forbidden port number");
                      break;
                  }
                  /* Crawler hook: report the announced info_hash. */
                  if(callback)
                  {
                      (*callback)(closure, DHT_EVENT_ANNOUNCE_PEER_VALUES, info_hash, (void *)fromAddr, fromlen);
                  }
                  storage_store(info_hash, fromAddr, port);
                  /* Note that if storage_store failed, we lie to the requestor.
                     This is to prevent them from backtracking, and hence polluting the DHT. */
                  _dout("Sending peer announced.\n");
                  send_peer_announced(fromAddr, fromlen, tid, tid_len);
              }
          }

       dontread:
          /* Timer section: runs on every call, whether or not a packet arrived. */
          if(nowTime.tv_sec >= rotate_secrets_time)
              rotate_secrets();

          if(nowTime.tv_sec >= expire_stuff_time) {
              expire_buckets(buckets);
              expire_buckets(buckets6);
              expire_storage();
              expire_searches();
          }

          if(search_time > 0 && nowTime.tv_sec >= search_time) {
              struct search *sr;
              sr = searches;
              while(sr) {
                  if(!sr->done && sr->step_time + 5 <= nowTime.tv_sec)
                  {
                      search_step(sr, callback, closure);
                  }
                  sr = sr->next;
              }

              search_time = 0;

              /* Schedule the next wake-up for the earliest unfinished search. */
              sr = searches;
              while(sr) {
                  if(!sr->done) {
                      time_t tm = sr->step_time + 15 + random() % 10;
                      if(search_time == 0 || search_time > tm)
                          search_time = tm;
                  }
                  sr = sr->next;
              }
          }

          if(nowTime.tv_sec >= confirm_nodes_time) {
              int soon = 0;

              soon |= bucket_maintenance(AF_INET);
              soon |= bucket_maintenance(AF_INET6);

              if(!soon)
              {
                  if(mybucket_grow_time >= nowTime.tv_sec - 150)
                      soon |= neighbourhood_maintenance(AF_INET);
                  if(mybucket6_grow_time >= nowTime.tv_sec - 150)
                      soon |= neighbourhood_maintenance(AF_INET6);
              }

              /* In order to maintain all buckets' age within 600 seconds, worst
                 case is roughly 27 seconds, assuming the table is 22 bits deep.
                 We want to keep a margin for neighborhood maintenance, so keep
                 this within 25 seconds. */
              if(soon)
                  confirm_nodes_time = nowTime.tv_sec + 5 + random() % 20;
              else
                  confirm_nodes_time = nowTime.tv_sec + 60 + random() % 120;
          }

          /* Tell the caller how long it may sleep before calling us again. */
          if(confirm_nodes_time > nowTime.tv_sec)
              *tosleep = confirm_nodes_time - nowTime.tv_sec;
          else
              *tosleep = 0;

          if(search_time > 0) {
              if(search_time <= nowTime.tv_sec)
                  *tosleep = 0;
              else if(*tosleep > search_time - nowTime.tv_sec)
                  *tosleep = search_time - nowTime.tv_sec;
          }

          return 1;
      }
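  10. As for how nodes are placed into buckets, stepping through the code once in a debugger makes the principle clear; the bucket-splitting logic was also described above.
  11. From here on, the steps above are simply repeated in a loop.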
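
Parts of steps 2, 7, and 9 can be tried out without any sockets. The sketch below is a portable illustration under stated assumptions, not the project's code: `random_node_id` uses random bytes instead of the SHA1 approach from step 2, `encode_find_node` builds the bencoded find_node query as laid out in the BitTorrent DHT specification, and `decode_nodes` unpacks the 26-byte IPv4 compact node entries that dht_periodic walks through. All function and type names here are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>
#include <vector>

using NodeId = std::array<std::uint8_t, 20>;

// Alternative to step 2: 20 random bytes (the article hashes a string with
// SHA1 instead; both approaches yield a 160-bit ID).
inline NodeId random_node_id() {
    std::random_device rd;
    NodeId id{};
    for (auto& b : id) b = static_cast<std::uint8_t>(rd() & 0xFF);
    return id;
}

// Bencode a byte string as "<length>:<bytes>".
inline std::string bstr(const std::string& s) { return std::to_string(s.size()) + ":" + s; }
inline std::string raw(const NodeId& id) { return std::string(id.begin(), id.end()); }

// KRPC find_node query; bencoded dictionary keys appear in sorted order.
inline std::string encode_find_node(const NodeId& self, const NodeId& target,
                                    const std::string& tid) {
    std::string msg = "d1:ad2:id" + bstr(raw(self)) + "6:target" + bstr(raw(target)) + "e";
    msg += "1:q9:find_node1:t" + bstr(tid) + "1:y1:qe";
    return msg;
}

struct CompactNode { NodeId id; std::string ip; std::uint16_t port; };

// Decode the "nodes" value of a reply: 26 bytes per entry (20-byte ID,
// 4-byte IPv4 address, 2-byte port; address and port in network byte order).
inline std::vector<CompactNode> decode_nodes(const std::string& nodes) {
    std::vector<CompactNode> out;
    if (nodes.size() % 26 != 0) return out;            // malformed length: reject
    for (std::size_t off = 0; off < nodes.size(); off += 26) {
        const auto* p = reinterpret_cast<const unsigned char*>(nodes.data()) + off;
        CompactNode n;
        for (int i = 0; i < 20; ++i) n.id[i] = p[i];
        n.ip = std::to_string(p[20]) + "." + std::to_string(p[21]) + "." +
               std::to_string(p[22]) + "." + std::to_string(p[23]);
        n.port = static_cast<std::uint16_t>((p[24] << 8) | p[25]);
        out.push_back(n);
    }
    return out;
}
```

The string returned by `encode_find_node` is what would go into a UDP datagram to a bootstrap node, and every node returned by `decode_nodes` is a candidate for the next ping or find_node, which is the loop step 11 refers to.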

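With the flow above you should now understand how DHT works. How to get more information returned is a technical question for the next article. I hope you will help improve our open-source program.

If anything is unclear, let's discuss it together.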

 

 

Your recommendations are the motivation for the next article...

Reposted from: https://www.cnblogs.com/miao31/p/3216425.html
