Introduction to Networking
To introduce networked computer systems in general, and the Internet in particular: the basic principles that govern their operation, the design and organisation principles of successful computer networks, and the key protocols and technologies that are used in the contemporary Internet.
Table of contents
- 1. Introduction to Networking
- 1.1 What’s the Internet?
- 1.2 Network edge and core
- 1.4 Protocol layers and service models
- 1.5 Network security
- 2. Application Layer
- 2.2 Web application
- 3. Application Layer (2)
- 3.1 Domain Name System (DNS)
- 3.1.1 DNS: domain name system
- 3.1.2 DNS: services, structure
- 3.1.3 DNS: a distributed, hierarchical database
- 3.1.4 root name servers
- 3.1.5 TLD, authoritative servers
- 3.1.6 Local DNS name server
- 3.1.7 DNS name resolution example
- 3.1.8 DNS: caching, updating records
- 3.1.9 DNS records
- 3.1.10 DNS protocol, messages
- 3.2 P2P Applications
- 3.3 BitTorrent
- 3.3.3 BitTorrent – Internal Mechanism
- 3.4 Socket programming
1. Introduction to Networking
1.1 What’s the Internet?
1.1.1 Protocols
Protocols define the format and order of messages sent and received among network entities, and the actions taken on message transmission and receipt.
1.1.2 How to build the Internet world?
-
Infrastructure
- Communication Channel
- Provided by ISP (ChinaTel, ChinaMobile, ChinaUnicom, 3, T-mobile)
-
Computing Service
- Servers or Cloud
- Provided by institutions or cloud service providers (Aliyun, Amazon…)
-
Applications
- A variety of applications with nice GUI
- Provided by many companies and developers
1.1.3 Standards
IETF
Internet Engineering Task Force
https://www.ietf.org/
https://ietf.org/standards/rfcs/
https://www.rfc-editor.org/in-notes/rfc-index.txt
1.2 Network edge and core
1.2.1 network structure
- Network edge = Host:
- Clients: PCs, Mobile phones, Smart Devices
- Servers: normally hosted in data centers
- Physical media to access networks:
- Wired or wireless communication links
- Network core:
- Interconnected routers
- Network of networks
Access networks and physical media
How to connect end systems to edge router?
- Residential access nets
- Institutional access networks (school, company)
- Mobile access networks
1.2.2 Different ways to access internet
1.2.2.1 Dial-up Internet access (PSTN)
Dial-up Internet access is a form of Internet access that uses the facilities of the public switched telephone network (PSTN) to establish a connection to an Internet service provider (ISP) by dialing a telephone number on a conventional telephone line. Dial-up connections use modems to decode audio signals into data to send to a router or computer, and to encode signals from the latter two devices to send to another modem. Bandwidth: up to 56 Kbps, so downloading a single MP3 song takes roughly 10 minutes.
1.2.2.2 Digital subscriber line (DSL)
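DSL (Digital Subscriber Line) is a technology for high-speed Internet access. It carries digital data over ordinary copper telephone lines while keeping the line usable for traditional voice calls, so users can be online and on the phone at the same time without switching between the two.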
-
Telephone line based:
- To central office DSL Access Multiplexer (DSLAM)
- Data over DSL phone line goes to Internet
- Voice over DSL phone line goes to telephone net
-
Bandwidth
- Upstream transmission rate < 2.5 Mbps (typically < 1Mbps)
- Downstream transmission rate < 24 Mbps (typically < 10Mbps)
- ADSL = Asymmetric Digital Subscriber Line
1.2.2.3 cable network (TV net based)
This is Internet access over the cable television network, usually called cable broadband or cable Internet.
Key technology: Frequency division multiplexing (FDM)
- Different channels transmitted in different frequency bands
HFC: hybrid fiber coax
- Asymmetric: 30Mbps downstream transmission rate, 2 Mbps upstream transmission rate
1.2.2.4 fiber to the home
FTTH commonly uses the passive optical network (PON) distribution architecture.
1.2.2.5 home network
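A home network is a local area network (LAN) set up inside a household. It usually connects several devices and computers so that family members can share an Internet connection, files, printers, multimedia content, and other resources.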
1.2.2.6 Enterprise access networks (Ethernet)
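Usually built on Ethernet, this is the local area network (LAN) inside a company or other organization. It connects the organization’s computing devices, such as computers, servers, printers, and network equipment, to support data sharing, communication, and access to resources.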
- Typically used in companies, universities, etc.
- 10 Mbps, 100Mbps, 1Gbps, 10Gbps transmission rates
- Today, end systems typically connect into Ethernet switch
1.2.2.7 Wireless access networks
Shared wireless access network connects end system to router
- Via base station, aka “access point”
Wireless LANs:
- within building (100 ft.)
- 802.11b/g/n (WiFi): 11, 54, 450 Mbps transmission rate
- 802.11ax : 9.6 Gbps (MIMO)
Li-Fi (in the future)
- up to 10Gbps
- over the visible light, ultraviolet, and infrared spectrums
Wide-area wireless access
- provided by telco
- Up to 1Gbps or More
- 3G, 4G-LTE, 5G, Starlink
1.2.3 Different medium
1.2.3.1 Physical media
- Guided media:
- signals propagate in solid media: copper, fiber, coaxial cable
- Unguided media:
- signals propagate freely, e.g., radio
twisted pair, coax, fiber
Twisted pair (TP)
- two insulated copper wires
- Category 5: 100 Mbps, 1 Gbps Ethernet
- Category 6: 10Gbps
Coaxial cable:
- two concentric copper conductors
- bidirectional
- broadband:
- multiple channels on cable
Fiber optic cable:
- glass fiber carrying light pulses, each pulse a bit
- high-speed operation:
- high-speed point-to-point transmission (e.g., tens to hundreds of Gbps transmission rate)
- low error rate:
- repeaters spaced far apart
- immune to electromagnetic noise
radio
- Signal carried in electromagnetic spectrum
- No physical “wire”
- bidirectional
- Propagation environment effects:
- Reflection
- Obstruction by objects
- Interference
Radio link types:
- Wireless LAN (e.g., WiFi)
- 54 Mbps – 9.6Gbps
- Wide-area (e.g., cellular)
- 4G cellular: ~ 100 Mbps
- 5G cellular: ~ 1Gbps
- Satellite
- Kbps to 45Mbps channel (or multiple smaller channels)
- Starlink ~ 1440Mbps
- 270 msec end-end delay (geostationary satellites)
1.2.4 How to link with others?
1.2.4.1 Network Core
- Mesh of interconnected routers
- Packet-switching:
- Hosts break application-layer messages into small packets
- Packets are forwarded from one router to the next, across links on path from source to destination
1.2.4.2 Two key network-core functions
Routing: determines source-destination route taken by packets
- routing algorithms
Forwarding: move packets from router’s input to appropriate router output
1.2.4.3 Packet Switching
store-and-forward
- Takes L/R seconds to transmit (push out) L-bit packet into link at R bps
- Store and forward: entire packet must arrive at router before it can be transmitted on next link
- End-end delay across two links (source, one router, destination) = 2L/R (assuming zero propagation delay)
One-hop example:
- L = 7.5 Mbits
- R = 1.5 Mbps
- one-hop transmission delay = L/R = 5 sec
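The arithmetic above can be checked with a short sketch. The function names are illustrative, and propagation, queueing, and processing delays are assumed to be zero, as in the text.

```python
def transmission_delay(packet_bits: float, rate_bps: float) -> float:
    """Seconds needed to push a packet of packet_bits onto a link of rate_bps."""
    return packet_bits / rate_bps


def store_and_forward_delay(packet_bits: float, rate_bps: float, num_links: int) -> float:
    """End-to-end delay for one packet over num_links equal-rate links.

    Each router must receive the whole packet before forwarding it, so the
    packet is transmitted once per link. Propagation delay is assumed zero.
    """
    return num_links * transmission_delay(packet_bits, rate_bps)


L = 7.5e6  # packet length: 7.5 Mbits
R = 1.5e6  # link rate: 1.5 Mbps

one_hop = transmission_delay(L, R)            # L/R
two_links = store_and_forward_delay(L, R, 2)  # 2L/R
print(one_hop, two_links)
```

With these values the one-hop delay is 5 seconds and the two-link end-end delay is 10 seconds.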
queueing delay, loss
- If arrival rate (in bits) to link exceeds transmission rate of link for a period of time:
- packets will queue, wait to be transmitted on link
- packets can be dropped (lost) if memory (buffer) fills up
Circuit switching
- Dedicated resources: no sharing
- circuit-like (guaranteed) performance
- Circuit segment is idle if not used by call (no sharing)
- Commonly used in traditional telephone networks
FDM versus TDM
FDM(Frequency Division Multiplexing)
TDM(Time Division Multiplexing)
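The main difference between FDM and TDM is how the resource is divided:
- FDM divides the resource by frequency: multiple signals are transmitted at the same time, each occupying a different frequency band.
- TDM divides the resource by time: multiple signals take turns, each transmitting in its own time slot.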
Packet switching VS circuit switching
Packet switching allows more users to use network!
- PS advantages:
- resource sharing
- simpler, no call setup
- PS drawbacks:
- excessive congestion possible: delay and loss
- protocols needed for reliable data transfer, congestion control
- How to provide circuit-like behavior with packet switching?
- Bandwidth guarantees
- New methods should be developed
1.2.4.4 Internet: Network of networks
- End systems connect to Internet via access ISPs (Internet Service Providers)
- residential, company and university ISPs
- Access ISPs in turn must be interconnected.
- so that any two hosts can send packets to each other
- Resulting network of networks is very complex
- evolution was driven by economics and national policies
1.3 Network performance
Packet loss
Delay
Bandwidth
1.3.1 How do loss and delay occur?
- Packets queue in router buffers
- packet arrival rate to link (temporarily) exceeds output link capacity
- then, packets queue, wait for turn
1.3.2 Four sources of packet delay
d_proc: nodal processing delay
- check bit errors
- determine output link
- typically < 1 msec
d_queue: queueing delay
- time waiting at output link for transmission
- depends on congestion level of router
d_trans: transmission delay
- L: packet length (bits)
- R: link bandwidth (bps)
- d_trans = L/R
d_prop: propagation delay
- d: length of physical link
- s: propagation speed (~2.9 × 10^8 m/sec)
- d_prop = d/s
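The four components add up to the total delay at one node: d_nodal = d_proc + d_queue + d_trans + d_prop. A minimal sketch of that sum follows; the packet size, link rate, link length, and processing time are assumed example values, not figures from these notes.

```python
def nodal_delay(d_proc, d_queue, packet_bits, rate_bps, link_m, speed_mps=2.9e8):
    """Return (d_trans, d_prop, d_nodal) in seconds for one hop."""
    d_trans = packet_bits / rate_bps  # time to push all bits onto the link
    d_prop = link_m / speed_mps       # time for one bit to travel the link
    return d_trans, d_prop, d_proc + d_queue + d_trans + d_prop


# Assumed example: a 12,000-bit packet on a 100 Mbps link that is 1,000 km long,
# 20 microseconds of processing, and an empty queue.
d_trans, d_prop, d_nodal = nodal_delay(
    d_proc=20e-6, d_queue=0.0, packet_bits=12_000, rate_bps=100e6, link_m=1_000e3
)
print(f"d_trans={d_trans * 1e3:.3f} ms, d_prop={d_prop * 1e3:.3f} ms, total={d_nodal * 1e3:.3f} ms")
```

On a long, fast link like this one the propagation delay dominates; on a short, slow link the transmission delay would dominate instead.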
1.3.3 Queueing delay
- R: link bandwidth (bps)
- L: packet length (bits)
- a: average packet arrival rate
- La/R ~ 0: avg. queueing delay small
- La/R -> 1: avg. queueing delay large
- La/R > 1: more “work” arriving than can be serviced, average delay infinite!
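The ratio La/R is often called the traffic intensity. The sketch below computes it for three arrival rates; the packet length, link rate, and arrival rates are assumed values chosen to land in each of the three regimes listed above.

```python
def traffic_intensity(packet_bits: float, arrivals_per_sec: float, rate_bps: float) -> float:
    """Return La/R: bits arriving per second divided by bits the link can send per second."""
    return packet_bits * arrivals_per_sec / rate_bps


L_BITS = 12_000  # assumed packet length in bits
R_BPS = 10e6     # assumed link bandwidth: 10 Mbps

for a in (100, 800, 1000):  # assumed average arrival rates in packets/sec
    intensity = traffic_intensity(L_BITS, a, R_BPS)
    overloaded = intensity > 1  # more work arriving than the link can serve
    print(f"a={a} pkt/s  La/R={intensity:.2f}  overloaded={overloaded}")
```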
1.3.4 Packet loss
- Queue (aka buffer) preceding a link has finite capacity
- Packet arriving to full queue dropped (aka lost)
- Lost packet may be retransmitted by previous node, by source end system, or not at all
1.3.5 Throughput
- Throughput: rate (bits/time unit) at which bits transferred between sender/receiver
- instantaneous: rate at given point in time
- average: rate over longer period of time
bottleneck link
link on end-end path that constrains end-end throughput
Internet scenario
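- Per-connection end-end throughput when 10 connections fairly share a backbone link of rate R: min{Rc, Rs, R/10}, where Rc is the client access rate and Rs is the server access rate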
- In practice: Rc or Rs is often bottleneck
1.4 Protocol layers and service models
1.4.1 Internet protocol stack
Internet protocol stack
- application: supporting network applications
- FTP, SMTP, HTTP
- transport: process-process data transfer
- TCP, UDP
- network: routing of datagrams from source to destination
- IP, routing protocols
- link: data transfer between neighboring network elements
- Ethernet, 802.11 (WiFi), PPP
- physical: bits “on the wire”
ISO/OSI reference model
- ISO/OSI = International Organization for Standardization / Open Systems Interconnection
- presentation: allows applications to interpret the meaning of data, e.g., encryption, compression, machine-specific conventions
- session: synchronization, checkpointing, recovery of data exchange
- Internet stack “missing” these layers!
- these services, if needed, must be implemented in application
Why layering?
- Divide complex systems to simple components
- Easy for maintenance
- Flexible for updating
1.5 Network security
- Field of network security:
- how bad guys can attack computer networks
- how we can defend networks against attacks
- how to design architectures that are immune to attacks
- Internet not originally designed with (much) security in mind
- Original vision: “a group of mutually trusting users attached to a transparent network”
- Internet protocol designers playing “catch-up”
- Security considerations in all layers!
Types
- Malware: virus, worm, spyware
- DDoS: Distributed denial of service attack
- Packet “sniffing”
2. Application Layer
2.1.1 Principles of network applications
How to create a network app?
- Write programs that:
- Run on different end systems
- Communicate over network
- e.g., a browser (Edge, Chrome, Firefox…) communicates with web servers (www.xxx.com)
- No need to write software for network-core devices
- Network-core devices do not run user applications (some smart routers excepted)
- Keeping applications on end systems allows for rapid app development and propagation
Architectures for applications
Client-server
Peer-to-peer
2.1.2 Client-server architecture
- Server
- Always-on host
- Permanent IP address
- High performance / Distributed computing
- Server process: waits to be contacted (Listen)
- Clients
- Link to the server for service
- May be intermittently connected to the Internet
- Dynamic IP address
- Do not communicate directly with each other
- Client process: initiates communication
2.1.3 P2P architecture
- No always-on server is needed
- End systems directly exchange data
- Client process / server process on the same host
- Peers request service from other peers, provide service in return to other peers
- Self scalability – new peers bring new service capacity, as well as new service demands
- Peers are intermittently connected
- Dynamic IP addresses
2.1.4 How to communicate over the network?
Sockets
- Process sends/receives messages to/from its socket
- Socket analogous to door
- Sending process shoves message out door
- Sending process relies on transport infrastructure on other side of door to deliver message to socket at receiving process
2.1.5 Addressing processes
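To receive messages, every process needs a unique identifier. The identifier usually has two parts, a host identifier and a port number, which together locate a specific process on a specific host. An IPv4 or IPv6 address identifies the host, and the port number identifies the process.
- To receive messages, process must have identifier
- Host device has unique 32-bit IPv4 and/or 128-bit IPv6 address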
- Process network identifier:
- IPv4:port 192.168.1.100:80
- [IPv6]:port [240e:3a1:4cb1:69d0:f40c:4269:74a2:7ea3]:80
2.1.6 App-layer protocol defines
- Types of messages exchanged
- e.g., request, response
- Message syntax:
- what fields are in messages and how the fields are delineated
- Message semantics:
- meaning of information in fields
- Message timing:
- when and how processes send and respond to messages
Open protocols:
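These protocols are public standards, usually defined by international bodies or developer communities and published as documents such as RFCs (Requests for Comments). Anyone can implement an open protocol, which promotes interoperability. Examples include HTTP (web communication), SMTP (e-mail transfer), and FTP (file transfer).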
- Defined in RFCs
- Allows for interoperability
- e.g., HTTP, SMTP, FTP
Proprietary protocols:
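These protocols are developed by a specific organization or company and are usually not published as standards, so only particular software or systems can use them. Examples include Skype (instant messaging), the communication protocols of some online games, and protocols used internally by some companies.
- e.g., Skype, games, your own protocols…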
2.1.7 Transport service requirements
Data Integrity
- 100% reliable data transfer
or
- Tolerate some loss
Throughput
- Some apps require minimum amount of throughput to be “effective”
- Others do not require
Timing
- Some apps require low delay
Security
- Some apps require encryption
- Data integrity check
common apps
Application | Data loss | Throughput | Time sensitive |
---|---|---|---|
File transfer | No loss | Elastic | No |
E-mail | No loss | Elastic | No |
Web documents | No loss | Elastic | No |
Real-time video/audio | Loss-tolerant | Based on quality* | Yes, 100 ms |
Stored video/audio | Loss-tolerant | Based on quality* | Yes, a few seconds |
Interactive games | Loss-tolerant | A few kbps | Yes, 100 ms |
Text messaging | No loss | Elastic | Yes or no |
*Audio: 5 kbps to 1 Mbps; video: 10 kbps to 10 Mbps or more
2.1.8 Internet transport protocols services
TCP service:
- Reliable transport between sending and receiving process
- Flow control: sender won’t overwhelm receiver
- Congestion control: throttle sender when network overloaded
- Does not offer: timing, minimum throughput guarantee, security
- Connection-oriented: setup required between client and server processes
UDP service:
- Unreliable data transfer between sending and receiving process
- Does not offer: reliability, flow control, congestion control, timing, throughput guarantee, security, or connection setup
Application vs transport protocols
Application | App layer protocol | Underlying transport protocol |
---|---|---|
E-mail | SMTP [RFC 2821], POP3, IMAP | TCP |
Remote terminal access | Telnet [RFC 854], SSH | TCP |
Web | HTTP [RFC 2616], HTTPS | TCP |
File transfer | FTP [RFC 959], SFTP | TCP |
Multimedia | HTTP / RTP [RFC 1889] | TCP or UDP |
VoIP | SIP, RTP or proprietary | TCP or UDP |
2.1.9 Securing TCP - Secure Sockets Layer - SSL
TCP & UDP
- No encryption
- Cleartext passwords sent into a socket cross the Internet in cleartext
SSL
- Provides encrypted TCP connection
- Data integrity
- End-point authentication
SSL is at app layer
- Apps use SSL libraries that “talk” to TCP
SSL socket API
- Cleartext passwords sent into a socket cross the Internet encrypted
- Lectures 11 and 12 cover this in more detail
2.2 Web application
2.2.1 Web, HTTP and WWW
-
WWW: World Wide Web
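- The World Wide Web is a vast collection of information resources and services on the Internet. It connects documents and resources to one another through hyperlinks, so users can easily browse and access information online.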
-
HTTP: Hypertext Transfer Protocol
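- HTTP is the communication protocol used to transfer hypertext documents (such as web pages) on the World Wide Web. It governs the data exchange between client and server and lets a browser request and fetch web pages.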
-
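A web page consists of a base HTML file that references several objects, each addressable by a URL (uniform resource locator)
-
Web pages are the basic building blocks of a website. A page is usually built from one HTML (HyperText Markup Language) file that holds its text content, structure, and formatting, and it can contain text, images, video, links, and other media.
-
A web page usually also includes other objects referenced by URL, such as images, style sheets, and JavaScript files. The browser requests and loads these resources by URL in order to display the complete page.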
-
URL:
https:(protocol)//www.google.com(hostname[:port])/images/branding/googlelogo/1x/googlelogo_color_272x92dp.png(path to a file)
2.2.2 HTTP
HTTP: hypertext transfer protocol
- Application layer protocol
- Client/server model
- Client: browser that requests, receives (using the HTTP protocol), and renders Web objects
- Server: Web server sends (using HTTP protocol) objects in response to requests
Uses TCP:
- Client initiates TCP connection (creates socket) to server, port 80 (443 for HTTPS)
- Server accepts TCP connection from client
- HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server)
- TCP connection closed
HTTP is “stateless”
- server maintains no information about past client requests
aside
Protocols that maintain “state” are complex!
- past history (state) must be maintained
- if server/client crashes, their views of “state” may be inconsistent, must be reconciled
HTTP connections
Non-persistent HTTP
- At most one object sent over TCP connection
- connection then closed
- Downloading multiple objects requires multiple connections
Persistent HTTP
- Multiple objects can be sent over single TCP connection between client, server
Non-persistent HTTP: response time
RTT (Round Trip Time): time for a small packet to travel from client to server and back
HTTP response time:
- One RTT to initiate TCP connection
- One RTT for HTTP request and first few bytes of HTTP response to return
- File transmission time
- Non-persistent HTTP response time = 2RTT+ file transmission time
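As a rough worked example of the formula above, the sketch below adds up the response time for a base page plus its referenced objects when every object needs its own connection. The RTT, per-object transmission time, and object count are assumed values, and the objects are fetched one after another with no parallel connections.

```python
def non_persistent_response_time(rtt: float, file_time: float) -> float:
    """2 RTT + file transmission time for one object over its own TCP connection."""
    return 2 * rtt + file_time


def serial_page_time(rtt: float, file_time: float, referenced_objects: int) -> float:
    """Base HTML file plus referenced objects, fetched one connection at a time."""
    return (1 + referenced_objects) * non_persistent_response_time(rtt, file_time)


RTT = 0.100        # assumed round-trip time: 100 ms
FILE_TIME = 0.050  # assumed transmission time per object: 50 ms

single = non_persistent_response_time(RTT, FILE_TIME)
page = serial_page_time(RTT, FILE_TIME, referenced_objects=10)
print(f"one object: {single:.2f} s, page with 10 objects: {page:.2f} s")
```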
Persistent HTTP
Non-persistent HTTP issues:
- Requires 2 RTTs per object
- OS overhead for each TCP connection
- Browsers often open parallel TCP connections to fetch referenced objects
Persistent HTTP:
- Server leaves connection open after sending response
- Subsequent HTTP messages between same client/server sent over open connection
- Client sends requests as soon as it encounters a referenced object
- As little as one RTT for all the referenced objects
2.2.3 HTTP request message: general format
[HTTP Method] [Request-URI] [HTTP Version]
[Headers]
[Blank Line]
[Message Body]
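- HTTP Method: the first element of the request line. It states the operation the client wants to perform. Common methods include GET (retrieve the specified resource), POST (submit data to the server, typically to create a new resource), PUT (update the specified resource), and DELETE (delete the specified resource).
- Request-URI: identifies the resource the client wants to access. In most requests it is the resource path plus an optional query string, for example /example/resource; the server host name and optional port (default 80) then travel in the Host header. A full URL (Uniform Resource Locator) is also permitted.
- HTTP Version: the protocol version the client is using. Common versions are HTTP/1.0 and HTTP/1.1.
- Headers: metadata about the request. Each header occupies one line and is a key-value pair separated by a colon. Examples are Host (the server host name), User-Agent (the client software), Accept (acceptable media types), and Cookie (the client’s cookie data). A request can carry multiple headers.
- Blank Line: an empty line separates the headers from the body.
- Message Body: optional. It usually carries the request data; in a POST request, for example, it holds the data submitted to the server. GET requests normally have no body.
In short, an HTTP request message consists of the method, the Request-URI, the protocol version, the headers, a blank line, and an optional body. Together they form the request a client sends to obtain or operate on a resource. The server acts on the information in the request and returns a response message.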
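To make the layout concrete, here is a minimal sketch that assembles a request message by hand. The helper name `build_request` and the host `www.example.com` are illustrative; real clients normally rely on an HTTP library rather than formatting the bytes themselves.

```python
def build_request(method: str, path: str, host: str, headers=None, body: bytes = b"") -> bytes:
    """Assemble an HTTP/1.1 request: request line, headers, blank line, optional body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        # Tell the server how many body bytes follow the blank line.
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"  # the final CRLF pair is the blank line
    return head.encode("ascii") + body


request = build_request("GET", "/example/resource", "www.example.com", {"Accept": "text/html"})
print(request.decode("ascii"))
```

Each line ends with a carriage return and line feed, and the empty line marks the end of the headers even when no body follows.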
2.2.4 HTTP response status codes
- Status code appears in 1st line in server-to-client response message.
- Some sample codes:
- 200 OK
- 301 Moved Permanently
- 400 Bad Request
- 404 Not Found
- 505 HTTP Version Not Supported
2.2.5 cookies
- Many Web sites use cookies
- Four components:
- cookie header line of HTTP response message
- cookie header line in next HTTP request message
- cookie file kept on user’s host, managed by user’s browser
- back-end database at Web site
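Cookies are small pieces of text used to pass information between a web browser and a web server. They are a common mechanism in web development for tracking and maintaining a user’s session state, authentication information, and other user-related data.
Cookies work as follows:
- Server creates the cookie: when a user first visits a site, the server can include one or more cookies in the Set-Cookie header of its HTTP response. These cookies usually carry information about the user’s session, such as a session ID, user preferences, or shopping-cart contents, expressed as key-value pairs.
- Browser stores the cookie: the browser saves each received cookie in a cookie file on the user’s computer. Every cookie has a name and a value, plus attributes such as expiry time, path, and domain.
- Browser sends the cookie: on later requests to the same site, the browser automatically adds a Cookie header that returns the stored cookie data to the server.
- Server reads the cookie: on receiving a request that contains cookies, the server reads them and uses the information to identify the user, maintain state, or personalize the response.
The main uses of cookies are:
- Session management: cookies track the user’s session, so state stays consistent across pages without re-authenticating on every request.
- User authentication: cookies often store authentication information so that the user stays logged in for a period of time.
- Personalization: cookies can store preferences, shopping-cart contents, and other user-related data to tailor the experience.
Cookies are widely used, but they have limitations and security implications. Because they are stored on the client, they can be modified or deleted. To improve security, sites typically use HTTPS to encrypt cookies in transit and take further measures against abuse. Some users also disable or restrict cookies in their browsers, so applications need to be designed to cope with that.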
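The exchange described above can be imitated with Python’s standard `http.cookies` module. This is only a sketch of the header handling: the cookie name `session_id`, its value, and the attributes are made-up examples, and no real browser or server is involved.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header line for the HTTP response.
response_cookie = SimpleCookie()
response_cookie["session_id"] = "1678"           # hypothetical session identifier
response_cookie["session_id"]["path"] = "/"
response_cookie["session_id"]["max-age"] = 3600  # seconds the browser should keep it
set_cookie_line = response_cookie.output(header="Set-Cookie:")

# Browser side: later requests return only the stored name=value pairs.
cookie_header_value = "; ".join(
    f"{name}={morsel.value}" for name, morsel in response_cookie.items()
)

# Server side again: parse the Cookie header of the next request.
request_cookie = SimpleCookie()
request_cookie.load(cookie_header_value)
session_id = request_cookie["session_id"].value

print(set_cookie_line)
print("Cookie:", cookie_header_value)
```

The server would then look up `session_id` in its back-end database to recover the state for this user.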
2.2.6 Web Caches
Goal: satisfy client request without involving origin server
-
User sets browser: Web accesses via cache
-
Browser sends all HTTP requests to cache
- object in cache: cache returns object
- else cache requests object from origin server, then returns object to client
-
Cache acts as both client and server
- server for original requesting client
- client to origin server
-
Typically cache is installed by ISP (university, company, residential ISP)
Why Web caching?
- reduce response time for client request
- reduce traffic on an institution’s access link
- Internet dense with caches: enables “poor” content providers to effectively deliver content (so too does P2P file sharing)
3. Application Layer (2)
3.1 Domain Name System (DNS)
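The Domain Name System (DNS) is a distributed naming system that maps human-readable domain names (for example www.example.com) to IP addresses (for example 192.168.1.1). DNS is a key Internet technology: it lets people use memorable names to visit websites, send e-mail, transfer files, and perform other network activities without remembering the numeric IP address of every target machine.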
Internet hosts:
- IP address (32 bit for IPv4) - used for addressing datagrams
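- An IP address is a 32-bit (IPv4) or 128-bit (IPv6) number that uniquely identifies a computer or device on the Internet. It is used to address packets so that they reach the correct destination. Every computer connected to the Internet must have a unique IP address.
- “Name”, e.g., www.xjtlu.edu.cn - used by humans
- On the Internet, a name usually means a domain name such as www.xjtlu.edu.cn. Domain names are easier for people to remember and understand than IP addresses. DNS maps each domain name to its IP address so that resources can be located and accessed.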
3.1.1 DNS: domain name system
- Application-layer protocol:
- C/S architecture
- UDP (port 53)
- hosts, name servers communicate to resolve names (name / address translation)
- Distributed database implemented in hierarchy of many name servers
3.1.2 DNS: services, structure
- DNS services
- Hostname to IP address translation (A record)
- Host aliasing (CNAME record)
- canonical, alias names
- Mail server aliasing (MX record)
- Load distribution
- Replicated Web servers: many IP addresses correspond to one name
Why not centralize DNS?
- Single point of failure
- Traffic volume
- Distant centralized database
- Maintenance
3.1.3 DNS: a distributed, hierarchical database
3.1.4 root name servers
- Contacted by local name server that can not resolve name
- Root name server:
- Contacts authoritative name server if name mapping not known
- Gets mapping
- Returns mapping to local name server
3.1.5 TLD, authoritative servers
- Top-level domain (TLD) servers:
- Responsible for com, org, net, edu, aero, jobs, museum, and all top-level country domains, e.g., cn, uk, fr, ca, jp
- Eg.:
- Network Solutions maintains servers for .com TLD
- Educause for .edu TLD (https://net.educause.edu/)
- Authoritative DNS servers:
- Organization’s own DNS server(s), providing authoritative hostname to IP mappings for organization’s named hosts
- Can be maintained by organization or service provider
3.1.6 Local DNS name server
- Does not strictly belong to hierarchy
- Each ISP (residential ISP, company, university) has one
- Also called “default name server”
- When host makes DNS query, query is sent to its local DNS server
- Has local cache of recent name-to-address translation pairs (but may be out of date!)
- Acts as proxy, forwards query into hierarchy
3.1.7 DNS name resolution example
- Host at XJTLU wants IP address for www.feimax.com
- Iterated query:
- contacted server replies with name of server to contact
- “I don’t know this name, but ask this server”
- Recursive query:
- Puts burden of name resolution on contacted name server
- Heavy load at upper levels of hierarchy
3.1.8 DNS: caching, updating records
- Once (any) name server learns mapping, it caches mapping
- Cache entries timeout (disappear) after some time (TTL)
- TLD servers typically cached in local name servers
- thus root name servers not often visited
- Cached entries may be out-of-date
- If a named host changes its IP address, the change may not be known Internet-wide until all TTLs expire
- Update/notify mechanisms proposed IETF standard
- RFC 2136
3.1.9 DNS records
DNS: distributed database storing resource records (RR)
RR format: (name, value, type, ttl)
type=A
- name is hostname
- value is IP address
type=NS
- name is domain (e.g., foo.com)
- value is hostname of authoritative name server for this domain
type=CNAME
- name is alias name for some “canonical” (the real) name
- www.taobao.com is really www.taobao.com.danuoyi.tbcache.com
- value is canonical name
type=MX
- value is name of mailserver associated with name
3.1.10 DNS protocol, messages
- Query and reply messages, both with same message format
Message header
- identification: 16 bit # for query, reply to query uses same #
- flags:
- query or reply
- recursion desired
- recursion available
- reply is authoritative
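The header fields listed above can be packed into bytes with the standard `struct` module. The sketch below assumes the 12-byte header layout and flag bit positions of RFC 1035 (identification, flags, then four 16-bit counts for questions, answers, authority records, and additional records); the function names are illustrative.

```python
import struct

QR_REPLY = 0x8000             # set in replies, clear in queries
AUTHORITATIVE = 0x0400        # reply is authoritative
RECURSION_DESIRED = 0x0100    # client asks for recursion
RECURSION_AVAILABLE = 0x0080  # server supports recursion


def pack_header(ident, is_reply=False, authoritative=False,
                recursion_desired=False, recursion_available=False,
                qdcount=1, ancount=0) -> bytes:
    """Build the fixed 12-byte DNS header in network byte order."""
    flags = 0
    flags |= QR_REPLY if is_reply else 0
    flags |= AUTHORITATIVE if authoritative else 0
    flags |= RECURSION_DESIRED if recursion_desired else 0
    flags |= RECURSION_AVAILABLE if recursion_available else 0
    return struct.pack("!HHHHHH", ident, flags, qdcount, ancount, 0, 0)


def unpack_header(header: bytes) -> dict:
    """Read back the identification and the four flags described in the text."""
    ident, flags = struct.unpack("!HH", header[:4])
    return {
        "id": ident,
        "is_reply": bool(flags & QR_REPLY),
        "authoritative": bool(flags & AUTHORITATIVE),
        "recursion_desired": bool(flags & RECURSION_DESIRED),
        "recursion_available": bool(flags & RECURSION_AVAILABLE),
    }


query = pack_header(0x1234, recursion_desired=True)
print(query.hex(), unpack_header(query))
```

A reply to this query would reuse the identification 0x1234 so the client can match it to the outstanding query.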
3.2 P2P Applications
3.2.1 Pure P2P architecture
- No always-on server
- Arbitrary end systems directly communicate
- Peers change IP addresses
Examples:
- File distribution (BitTorrent)
- Streaming (KanKan)
- VoIP (Skype)
3.2.2 File distribution: client-server vs P2P
File distribution time: client-server
- server transmission: must sequentially send (upload) N file copies:
- time to send one copy: $F/u_s$
- time to send N copies: $NF/u_s$
- client: each client must download file copy
- $d_{min}$ = min client download rate
- max client download time: $F/d_{min}$
time to distribute F to N clients using client-server approach:
$D_{c-s} \geq \max\{NF/u_s,\ F/d_{min}\}$
↑ increases linearly in N
File distribution time: P2P
- server transmission: must upload at least one copy of the file:
- time to send one copy: $F/u_s$
- client: each client must download file copy
- min client download time: $F/d_{min}$
- clients: in aggregate must download NF bits
- max upload rate (limiting max download rate) is $u_s + \sum u_i$
time to distribute F to N clients using P2P approach:
$D_{P2P} \geq \max\{F/u_s,\ F/d_{min},\ NF/(u_s + \sum u_i)\}$
Client-server vs. P2P: example
$\text{client upload rate} = u,\ F/u = 1\text{ hour},\ u_s = 10u,\ d_{min} \geq u_s$
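The two lower bounds can be evaluated for the example parameters with a short sketch. Time is measured in hours with F/u = 1, so F = 1 and u = 1 are used as normalized values; taking d_min equal to u_s is an assumption consistent with d_min ≥ u_s.

```python
def client_server_time(F, N, u_s, d_min):
    """Lower bound on distribution time when the server uploads all N copies."""
    return max(N * F / u_s, F / d_min)


def p2p_time(F, N, u_s, d_min, peer_upload_rates):
    """Lower bound on distribution time when peers also upload."""
    total_upload = u_s + sum(peer_upload_rates)
    return max(F / u_s, F / d_min, N * F / total_upload)


F = 1.0        # file size, normalized so that F/u = 1 hour
u = 1.0        # client upload rate
u_s = 10 * u   # server upload rate
d_min = u_s    # assumed: slowest download rate equals the server upload rate

for N in (5, 10, 30):
    cs = client_server_time(F, N, u_s, d_min)
    p2p = p2p_time(F, N, u_s, d_min, [u] * N)
    print(f"N={N:2d}  client-server >= {cs:.2f} h  P2P >= {p2p:.2f} h")
```

The client-server bound grows linearly with N, while the P2P bound N/(10 + N) stays below one hour however many peers join.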
3.3 BitTorrent
- Efficient content distribution system using file swarming.
- The throughput increases with the number of downloaders via the efficient use of network bandwidth.
Peers in torrent send/receive file pieces (chunks)
- To share a file or group of files, the initiator first creates a .torrent file, a small file that contains:
- Metadata about the files to be shared
- Information about the tracker, the computer that coordinates the file distribution
- Downloaders first obtain a .torrent file, and then connect to the specified tracker, which tells them from which other peers to download the pieces of the file.
Metadata of .torrent
- SHA-1 hashes of all pieces
- A mapping of the pieces to files
- Piece size
- Length of the file
- A tracker reference
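The per-piece SHA-1 hashes let a downloader check each piece as it arrives. The sketch below shows only that idea with the standard `hashlib` module; it does not produce a real .torrent file, and the function names and tiny piece size are illustrative.

```python
import hashlib


def piece_hashes(data: bytes, piece_size: int) -> list[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[start:start + piece_size]).digest()
        for start in range(0, len(data), piece_size)
    ]


def verify_piece(index: int, piece: bytes, hashes: list[bytes]) -> bool:
    """Check a downloaded piece against the digest published in the metadata."""
    return hashlib.sha1(piece).digest() == hashes[index]


content = b"abcdefghij"            # stand-in for the shared file
hashes = piece_hashes(content, 4)  # pieces: b"abcd", b"efgh", b"ij"

print(len(hashes), verify_piece(0, b"abcd", hashes), verify_piece(0, b"abcx", hashes))
```

A piece that fails the check can be discarded and requested again from another peer.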
Peer joining torrent:
- has no pieces, but will accumulate them over time from other peers
- registers with tracker to get list of peers, connects to subset of peers (“neighbors”)
- While downloading, peer uploads pieces to other peers
- Peer may change peers with whom it exchanges pieces
- Peers may come and go
- Once peer has entire file, it may (selfishly) leave or remain in torrent
Leecher -> Seeder
- As soon as a leecher has a complete piece, it can potentially share it with other downloaders.
- Eventually each leecher obtains all the pieces, assembles the file, verifies the “checksum” of the file, and becomes a seeder.
3.3.1 Piece selection policy
- The order in which pieces are selected by different peers is critical for good performance.
- If an inefficient policy is used, peers may end up all holding the same set of easily available pieces and none of the missing ones.
- If the original seed is prematurely taken down, then the file cannot be completely downloaded!
3.3.2 Piece selection
- Rarest First
- General rule
- Random First Piece
- Special case, at the beginning
- Endgame Mode
- Special case
Random First Piece
- Initially, a peer has nothing to trade
- Important to get a complete piece ASAP
- Select a random piece of the file and download it
Rarest First
- Determine the pieces that are most rare among your peers, and download those first.
- This ensures that the most commonly available pieces are left till the end to download.
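Rarest-first selection can be sketched as a count over the neighbors’ piece sets. This is a simplified illustration: the data structures and the tie-break by lowest piece index are assumptions made to keep the example deterministic.

```python
from collections import Counter


def rarest_first(have: set, neighbor_pieces: dict):
    """Return the missing piece held by the fewest neighbors, or None if nothing is available."""
    counts = Counter()
    for pieces in neighbor_pieces.values():
        counts.update(pieces - have)  # count only pieces we still need
    if not counts:
        return None
    # Fewest copies first; ties broken by lowest piece index (an assumption).
    return min(counts, key=lambda piece: (counts[piece], piece))


have = {0}
neighbor_pieces = {
    "peer_a": {0, 1, 2},
    "peer_b": {1, 2},
    "peer_c": {1, 3},
}
next_piece = rarest_first(have, neighbor_pieces)
print(next_piece)
```

Here piece 3 is held by only one neighbor, so it is requested before pieces 1 and 2.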
Endgame Mode
- Near the end, missing pieces are requested from every peer containing them.
- This ensures that a download is not prevented from completion due to a single peer with a slow transfer rate.
- Some bandwidth is wasted, but in practice, this is not too much.
3.3.3 BitTorrent – Internal Mechanism
Built-in incentive mechanism (where all the magic happens):
- Choking Algorithm
- Optimistic Unchoking
Choking
Choking is a temporary refusal to upload. It is one of BT’s most powerful ideas for dealing with free riders (those who only download but never upload).
- For avoiding free riders and avoiding network congestion
Tit-for-tat strategy is based on game-theoretic concepts.
Optimistic unchoking
- A peer (call her Alice) sends chunks to the four peers currently sending her chunks at the highest rate
- other peers are choked by Alice (do not receive chunks from her)
- re-evaluate top 4 every 10 secs
- Every 30 secs: randomly select another peer and start sending it chunks
- “optimistically unchoke” this peer
- newly chosen peer may join top 4
- Reasons:
- To discover currently unused connections that are better than the ones being used
- To provide minimal service to new peers
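One round of this selection can be sketched as follows. The peer names and rates are invented, the timers (10 and 30 seconds) are left to the caller, and real clients track more state than this.

```python
import random


def choose_unchoked(rates_from_peers: dict, rng: random.Random, regular_slots: int = 4):
    """Pick the peers to upload to in one round.

    rates_from_peers maps each peer to the rate at which it currently sends us chunks.
    Returns (top_peers, optimistic_peer); everyone else stays choked.
    """
    ranked = sorted(rates_from_peers, key=rates_from_peers.get, reverse=True)
    top_peers = ranked[:regular_slots]  # tit-for-tat: reward the best uploaders
    choked = ranked[regular_slots:]
    # Optimistic unchoke: give one randomly chosen choked peer a chance.
    optimistic_peer = rng.choice(choked) if choked else None
    return top_peers, optimistic_peer


rates = {"A": 50, "B": 40, "C": 30, "D": 20, "E": 10, "F": 5}  # assumed rates in KB/s
top, optimistic = choose_unchoked(rates, random.Random(0))
print(top, optimistic)
```

If the optimistically unchoked peer starts sending chunks faster than one of the current top four, it takes that slot at the next re-evaluation.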
3.3.4 Upload-Only mode
- Once download is complete, a peer can only upload. The question is, which nodes to upload to?
- Policy: Upload to those with the best upload rate. This ensures that pieces get replicated faster, and new seeders are created fast
3.4 Socket programming
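Socket:
-
A socket is a communication mechanism that acts as the interface, or door, between an application process and the underlying end-to-end transport protocol. Sockets let processes on different computers, or different processes on the same computer, communicate over the network.
-
Sockets are the core concept of network programming. They give applications a standardized way to create connections and to send and receive data.
-
A socket can be viewed as one endpoint of a communication between two hosts; it holds the address and port information the communication needs.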
-
Two socket types for two transport services:
- UDP: unreliable datagram
- TCP: reliable, byte stream-oriented
-
Application Example:
- client reads a line of characters (data) from its keyboard and sends data to server
- server receives the data and converts characters to uppercase
- server sends modified data to client
- client receives modified data and displays line on its screen
3.4.1 Socket programming with UDP
- UDP: no “connection” between client & server
- No handshaking before sending data
- Sender explicitly attaches IP destination address and port# to each packet
- Receiver extracts sender IP address and port# from received packet
- UDP: transmitted data may be lost or received out-of-order
- Application viewpoint:
- UDP provides unreliable transfer of groups of bytes (“datagrams”) between client and server
Client/server socket interaction: UDP
Example app: UDP server
Example app: UDP client
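The original example code is not reproduced in these notes, so what follows is a minimal sketch of the uppercase application over UDP, not the course’s own listing. It assumes a loopback address and an OS-assigned port, handles a single datagram, and runs the server in a thread only so that both sides fit in one script.

```python
import socket
import threading


def udp_serve_once(server_sock: socket.socket) -> None:
    """Receive one datagram, uppercase it, and send it back to the sender."""
    data, client_addr = server_sock.recvfrom(2048)  # sender address arrives with the data
    server_sock.sendto(data.upper(), client_addr)


def udp_uppercase(message: str, server_addr) -> str:
    """Send message to the server in one datagram and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client_sock:
        client_sock.settimeout(2.0)  # UDP may lose data, so do not wait forever
        client_sock.sendto(message.encode(), server_addr)  # destination attached to each packet
        reply, _ = client_sock.recvfrom(2048)
    return reply.decode()


def udp_demo(message: str = "hello") -> str:
    """Run server and client on the loopback interface and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server_sock.settimeout(2.0)
        worker = threading.Thread(target=udp_serve_once, args=(server_sock,))
        worker.start()
        reply = udp_uppercase(message, server_sock.getsockname())
        worker.join()
    return reply


if __name__ == "__main__":
    print(udp_demo("hello"))
```

No connection is set up: the client simply addresses each datagram, and the server learns where to reply from the datagram it received.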
3.4.2 Socket programming with TCP
Client must contact server
- Server process must first be running
- Server must have created socket (door) that welcomes client’s contact
Client contacts server by:
- Creating TCP socket, specifying IP address, port number of server process
- When client creates socket: client TCP establishes connection to server TCP
When contacted by client, server TCP creates new socket for server process to communicate with that particular client
- Allows server to talk with multiple clients
- Source port numbers used to distinguish clients
Application viewpoint:
TCP provides reliable, in-order byte-stream transfer (“pipe”) between client and server
Client/server socket interaction: TCP
Example app: TCP server
Example app: TCP client
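The original example code is not reproduced in these notes, so the following is a minimal sketch of the same uppercase application over TCP, not the course’s own listing. It assumes a loopback address, an OS-assigned port, one client, and a message short enough to arrive in a single `recv` call.

```python
import socket
import threading


def tcp_serve_once(welcome_sock: socket.socket) -> None:
    """Accept one client, uppercase what it sends, and reply on the connection socket."""
    conn, client_addr = welcome_sock.accept()  # new socket dedicated to this client
    with conn:
        conn.settimeout(2.0)
        line = conn.recv(1024)  # byte stream: a long message could need several recv calls
        conn.sendall(line.upper())


def tcp_uppercase(message: str, server_addr) -> str:
    """Connect to the server, send message, and return the reply."""
    with socket.create_connection(server_addr, timeout=2.0) as client_sock:
        client_sock.sendall(message.encode())  # no address needed: the connection fixes it
        return client_sock.recv(1024).decode()


def tcp_demo(message: str = "hello") -> str:
    """Run server and client on the loopback interface and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as welcome_sock:
        welcome_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        welcome_sock.listen(1)               # welcoming socket waits for contact
        welcome_sock.settimeout(2.0)
        worker = threading.Thread(target=tcp_serve_once, args=(welcome_sock,))
        worker.start()
        reply = tcp_uppercase(message, welcome_sock.getsockname())
        worker.join()
    return reply


if __name__ == "__main__":
    print(tcp_demo("hello"))
```

The welcoming socket only accepts contacts; the data exchange happens on the separate socket returned by `accept`, which is how one server can talk with several clients.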