P2P system: GNUTELLA

weixin_30389003

Original article: http://www.cnblogs.com/yan2015/p/4912617.html

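Gnutella was the first demonstrated fully distributed peer-to-peer system.

A major problem for Napster was its liability for indirect copyright infringement, so Gnutella eliminates the servers altogether and lets the clients search and retrieve among themselves. Because each client also acts as a server, Gnutella clients are called servents (a blend of "server" and "client").

A peer's neighbors are the peers whose IP address and port number it knows, so that it can send them messages (for example, over TCP).

The peers form a graph called an overlay graph, so named because it is a graph overlaid on top of the Internet.

Each edge in the graph is really an Internet path in the underlying network. For the overlay, however, the actual path taken through the underlying Internet is irrelevant, as long as the two peers can talk to each other.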

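How to find a given file

There are five main message types:

Query: a search request carrying keywords

QueryHit: the response to a Query

Ping: together with Pong, used to keep the list of neighbors up to date

Pong: the reply to a Ping, used to keep the list of neighbors up to date

Push: used for file transfer

All fields of the messages discussed here, except IP addresses, are stored in little-endian byte order.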

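The header shared by all five kinds of Gnutella message

Descriptor ID: the first 16 bytes are the ID of this transaction. The ID is unique in the system (it can be built, for example, from IP address + port number + an incrementing sequence number). The Descriptor ID is useful because intermediate nodes use it to recognize the messages that belong to one particular search, so they are not confused with those of other searches.

Payload descriptor: the kind of message

TTL: time to live. Each time a peer passes a message on to a neighbor, the TTL is decremented by 1, and once the TTL reaches 0 the message is not forwarded any further. The TTL is set to a finite number so that messages do not circulate around the overlay graph forever. The initial TTL is usually set between 7 and 10.

Hops: the number of hops the message has traveled so far. It is incremented each time the message is forwarded. Why keep Hops when there is already a TTL? Because each peer may start with a different initial TTL (some use 7, others 10), so the TTL alone does not say how far a message has traveled. In practice, though, most implementations do not really use the Hops field and rely on the TTL field instead.

Payload length: the length of the payload, which depends on the kind of message and so can differ from one message to the next. Its purpose is to let the receiver find the end of the message.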
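The header layout can be sketched with Python's `struct` module. The 16-byte Descriptor ID and the little-endian byte order come from the notes above; the 1-byte widths for the payload descriptor, TTL, and Hops, the 4-byte payload length, and the payload descriptor values are taken from the Gnutella 0.4 protocol document rather than from this article. The helper `make_descriptor_id` and its 10-byte sequence number are hypothetical, chosen only to illustrate the "IP + port + sequence number" idea.

```python
import socket
import struct

# Payload descriptor values as listed in the Gnutella 0.4 protocol document.
PING, PONG, PUSH, QUERY, QUERY_HIT = 0x00, 0x01, 0x40, 0x80, 0x81

# 16-byte Descriptor ID, three 1-byte fields, then a 4-byte little-endian
# payload length: 23 bytes in total ("<" disables padding).
HEADER = struct.Struct("<16sBBBI")


def make_descriptor_id(ip_address, port, sequence):
    """Hypothetical ID scheme: 4-byte IP + 2-byte port + 10-byte counter."""
    return (socket.inet_aton(ip_address)
            + struct.pack("<H", port)
            + sequence.to_bytes(10, "little"))


def pack_header(descriptor_id, payload_type, ttl, hops, payload_length):
    return HEADER.pack(descriptor_id, payload_type, ttl, hops, payload_length)


def unpack_header(data):
    descriptor_id, payload_type, ttl, hops, payload_length = HEADER.unpack(
        data[:HEADER.size])
    return {
        "descriptor_id": descriptor_id,
        "payload_type": payload_type,
        "ttl": ttl,
        "hops": hops,
        "payload_length": payload_length,
    }


if __name__ == "__main__":
    did = make_descriptor_id("10.0.0.1", 6346, 1)
    raw = pack_header(did, QUERY, ttl=7, hops=0, payload_length=12)
    print(len(raw), unpack_header(raw))
```

Because the payload length sits at a fixed offset, a receiver can read the 23 header bytes first and then read exactly `payload_length` more bytes to reach the end of the message.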

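Format of the payload section described above

A Query can state the minimum speed it expects from responders. If you are connected over a 100 MBps line, for example, you might say "I need peers that are at least 100 MBps", or at least 10 MBps.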

How a query is sent out

When an immediate neighbor receives the search message, it first checks whether any of its local files match the query. At the same time it floods the Query message out to its own immediate neighbors (except the peer it just received the message from). How does a peer recognize a message it has already received? By the Descriptor ID, which stays unchanged while the message travels. Once the TTL drops to 0, the message is not forwarded any further.

Each peer forwards the same message only once. How? It keeps track of the Query messages it has recently forwarded and uses the Descriptor ID to tell whether an incoming message is a duplicate.
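The flooding rules above (forward to every neighbor except the sender, drop duplicates by Descriptor ID, stop when the TTL runs out) can be simulated in a few lines. This is a minimal sketch, not a network implementation: the overlay is a plain dictionary, and the function name `flood_query` and the peer names are made up for illustration.

```python
from collections import deque


def flood_query(overlay, origin, descriptor_id, ttl):
    """Simulate flooding one Query; return {peer: hops} for each peer reached."""
    # Per-peer set of recently seen Descriptor IDs (the duplicate filter).
    seen = {origin: {descriptor_id}}
    reached = {}
    # Each queue entry is one message in flight: (sender, receiver, ttl, hops).
    queue = deque((origin, neighbor, ttl, 1) for neighbor in overlay[origin])
    while queue:
        sender, peer, ttl_in_message, hops = queue.popleft()
        if descriptor_id in seen.setdefault(peer, set()):
            continue  # duplicate: this peer already handled the query
        seen[peer].add(descriptor_id)
        reached[peer] = hops
        # A real peer would search its local files here.
        new_ttl = ttl_in_message - 1
        if new_ttl <= 0:
            continue  # TTL exhausted: do not forward any further
        for neighbor in overlay[peer]:
            if neighbor != sender:  # never send back to where it came from
                queue.append((peer, neighbor, new_ttl, hops + 1))
    return reached


if __name__ == "__main__":
    overlay = {
        "A": ["B", "C"],
        "B": ["A", "C", "D"],
        "C": ["A", "B"],
        "D": ["B", "E"],
        "E": ["D"],
    }
    print(flood_query(overlay, "A", "query-1", ttl=2))
```

With a TTL of 2 the query started at A reaches B, C, and D but never E, which is three hops away. The copy that C forwards to B is dropped as a duplicate.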

When a file matches the query, the peer sends a QueryHit

Num. hits: the number of files that match

How a QueryHit gets back to the querying peer

QueryHits are reverse routed. Each peer remembers the Query messages it has recently received, along with the peer each one arrived from. So when a peer creates a QueryHit, or receives one, it sends the QueryHit back to the peer from which it received the corresponding Query.
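Reverse routing amounts to a small table per peer that maps a Descriptor ID to the neighbor the Query arrived from. The sketch below uses direct method calls in place of network messages; the class `Peer` and its attribute names are hypothetical.

```python
class Peer:
    def __init__(self, name):
        self.name = name
        self.route_back = {}  # Descriptor ID -> neighbor the Query arrived from
        self.results = []     # QueryHits delivered to this peer as originator

    def on_query(self, descriptor_id, from_peer):
        """Record where a Query came from; return False for a duplicate."""
        if descriptor_id in self.route_back:
            return False
        # from_peer is None when this peer originated the query itself.
        self.route_back[descriptor_id] = from_peer
        return True

    def on_query_hit(self, descriptor_id, hit):
        if descriptor_id not in self.route_back:
            return  # no record of the matching Query: drop the QueryHit
        previous = self.route_back[descriptor_id]
        if previous is None:
            self.results.append(hit)  # we asked the question: keep the answer
        else:
            previous.on_query_hit(descriptor_id, hit)  # one hop back


if __name__ == "__main__":
    a, b, c = Peer("A"), Peer("B"), Peer("C")
    a.on_query("query-1", None)  # A originates the query
    b.on_query("query-1", a)     # B received it from A
    c.on_query("query-1", b)     # C received it from B
    c.on_query_hit("query-1", "song.mp3 at C")
    print(a.results)
```

The QueryHit created at C retraces C, B, A without any peer needing to know the originator's address. A QueryHit whose Descriptor ID is missing from the table is simply dropped.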

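How to avoid transmitting duplicate messages

Because the structure of the overlay may change while messages are in flight, a peer may receive a message it has no record of. Such messages are dropped. This can slightly affect the results returned to the querying peer, but the situation is rare and the impact is small.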

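What the querying peer does when the QueryHits come back

The requesting peer sends a GET message.

Range specifies which part of the file to download. For example, if you request only the first 512 bytes, the transfer stops once 512 bytes have been received. A second download can then specify a range starting at byte offset 512, so that it resumes where the previous one stopped instead of starting again from the beginning of the file.

When the responder peer receives the GET message, it replies with an HTTP message containing the requested file.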
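To make the Range mechanism concrete, here is a sketch that builds the request text. The `/get/<file index>/<file name>/` path shape follows the Gnutella 0.4 protocol document and should be treated as an assumption relative to this article; the function name, file index, and file name are made up. Note that HTTP byte ranges are inclusive, so the first 512 bytes are `bytes=0-511`.

```python
def build_get_request(file_index, file_name, start=0, end=None):
    """Build a GET request for part of a file.

    start and end are byte offsets; end is inclusive, and None means
    "through the end of the file".
    """
    if end is None:
        byte_range = f"bytes={start}-"
    else:
        byte_range = f"bytes={start}-{end}"
    return (
        f"GET /get/{file_index}/{file_name}/ HTTP/1.0\r\n"
        "Connection: Keep-Alive\r\n"
        f"Range: {byte_range}\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    # First attempt: only the first 512 bytes (offsets 0 through 511).
    print(build_get_request(2468, "song.mp3", start=0, end=511))
    # Resume: everything from offset 512 onward.
    print(build_get_request(2468, "song.mp3", start=512))
```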

A firewall does not block outgoing messages, or at least most of them, but it does block some kinds of messages from coming in.

If the responder is behind a firewall, the HTTP GET message described above never reaches it. How can the file be transferred then? This is where the Push message comes in.

The requesting peer first sends the HTTP message to the responder. When that fails, it assumes the responder is behind a firewall and sends a Push message, which is routed along the reverse QueryHit path. When the responder receives the Push message, it opens an outgoing TCP connection to the requester.

Push message

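fileindex: identifies the requested file, so the entire file name does not have to be sent

ip_address and port belong to the requesting peer that sends the Push message, so that the responder can open an outgoing TCP connection to it.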
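A Push payload can be packed along these lines. The file index, IP address, and port fields are the ones named above; the leading 16-byte servent identifier and the exact field widths (4, 4, and 2 bytes) are taken from the Gnutella 0.4 protocol document and are an assumption relative to this article. The IP address is kept in network order while the other fields are little-endian, matching the byte-order rule stated earlier.

```python
import socket
import struct


def pack_push(servent_id, file_index, ip_address, port):
    """Pack a Push payload: 16 + 4 + 4 + 2 = 26 bytes."""
    if len(servent_id) != 16:
        raise ValueError("servent_id must be exactly 16 bytes")
    return (servent_id
            + struct.pack("<I", file_index)   # little-endian file index
            + socket.inet_aton(ip_address)    # IP stays in network order
            + struct.pack("<H", port))        # little-endian port


def unpack_push(payload):
    servent_id = payload[:16]
    (file_index,) = struct.unpack("<I", payload[16:20])
    ip_address = socket.inet_ntoa(payload[20:24])
    (port,) = struct.unpack("<H", payload[24:26])
    return servent_id, file_index, ip_address, port


if __name__ == "__main__":
    payload = pack_push(b"\x01" * 16, 2468, "192.168.1.5", 6346)
    print(len(payload), unpack_push(payload))
```

On receiving this payload, the firewalled responder knows which file is wanted and where to connect, so it can open the outgoing TCP connection itself.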

Dealing with firewall

Ping and Pong messages

Like Query messages, Ping messages are flooded out (they also use a TTL, though usually a smaller one than Query messages).

Num. files shared and Num. KB shared let the pinging peer choose neighbors that have more files and more data to share.
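As an illustration of how a pinging peer might use those two fields, the sketch below ranks the Pongs it has collected and keeps the best few. The ranking rule (most KB shared first, then most files) is a hypothetical policy, not something the protocol prescribes, and the `Pong` tuple here is a simplified stand-in for the real message.

```python
from typing import NamedTuple


class Pong(NamedTuple):
    ip_address: str
    port: int
    files_shared: int
    kb_shared: int


def pick_neighbors(pongs, how_many):
    """Return the peers sharing the most data, best first."""
    ranked = sorted(
        pongs,
        key=lambda pong: (pong.kb_shared, pong.files_shared),
        reverse=True,
    )
    return ranked[:how_many]


if __name__ == "__main__":
    pongs = [
        Pong("10.0.0.2", 6346, files_shared=3, kb_shared=900),
        Pong("10.0.0.3", 6346, files_shared=120, kb_shared=480000),
        Pong("10.0.0.4", 6346, files_shared=0, kb_shared=0),
    ]
    for pong in pick_neighbors(pongs, how_many=2):
        print(pong.ip_address, pong.kb_shared)
```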

Summary

Different peers have different numbers of neighbors.

Gnutella was found to follow a power-law distribution: the probability that a peer has L neighbors is proportional to L^-k, for some constant k.
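Written out, with C standing for a normalizing constant and k > 0 (both symbols are introduced here only for illustration), the power law and its logarithmic form are:

```latex
P(L) = C \, L^{-k}
\quad\Longrightarrow\quad
\log P(L) = \log C - k \log L
```

The second form is why a power law shows up as a straight line with slope -k when the neighbor counts are plotted on log-log axes: a few peers have very many neighbors, while most have only a handful.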

Problems

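The original version of Gnutella had the following problems (since addressed)

Ping/Pong traffic: multiplex the messages (when several Pings or Pongs arrive, send out a single Ping/Pong), and use a cache so that several Pings/Pongs are sent together.

Repeated searches with the same keywords (for popular songs, say): cache QueryHit messages. If you receive the same query again, you can answer with the QueryHit stored in the cache instead of forwarding the Query message further.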
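The QueryHit caching idea can be sketched as a small keyword-indexed cache consulted before forwarding. Everything here is illustrative: the class and function names are made up, treating keyword order and letter case as irrelevant is an assumption, and the eviction rule (drop the oldest entry) is just the simplest choice.

```python
class QueryHitCache:
    def __init__(self, max_entries=100):
        self.max_entries = max_entries
        self.hits = {}  # normalized keywords -> list of cached QueryHits

    @staticmethod
    def _key(keywords):
        # Assumption: word order and letter case do not matter.
        return " ".join(sorted(keywords.lower().split()))

    def store(self, keywords, hit):
        key = self._key(keywords)
        if key not in self.hits and len(self.hits) >= self.max_entries:
            self.hits.pop(next(iter(self.hits)))  # evict the oldest entry
        self.hits.setdefault(key, []).append(hit)

    def lookup(self, keywords):
        return self.hits.get(self._key(keywords), [])


def handle_query(cache, keywords, forward):
    """Answer from the cache when possible; otherwise forward the Query."""
    cached = cache.lookup(keywords)
    if cached:
        return cached
    forward(keywords)
    return []


if __name__ == "__main__":
    cache = QueryHitCache()
    forwarded = []
    handle_query(cache, "Hey Jude", forwarded.append)    # miss: forwarded
    cache.store("Hey Jude", "hey_jude.mp3 at peer C")    # a QueryHit came back
    print(handle_query(cache, "hey jude", forwarded.append), forwarded)
```

The second lookup is answered locally, so the popular query is forwarded only once.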

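Some peers do not have enough bandwidth: use a central server to act as a proxy for such peers, or use the FastTrack system (which takes advantage of powerful nodes or powerful appliances in the system).

The two problems above are shared by all peer-to-peer systems.

Large number of freeloaders: this is a problem of human nature rather than a technical one, and technology cannot solve it.

Flooding causes excessive traffic: academic work on this problem led to structured peer-to-peer systems. One of the first structured peer-to-peer systems to come out of academia is the Chord system.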
