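From Distributed Systems: Concepts and Design (5th Edition)
The book defines a distributed system as one in which hardware or software components
located at networked computers communicate and coordinate their actions only by
passing messages.
Definition of a distributed system:
A distributed system is made up of hardware and software components on computers connected by a network, and those components can communicate and coordinate their actions only by sending messages.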
|
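Characteristics of distributed systems:
(1) Concurrency: programs run independently on different computers.
(2) No global clock: no precise, globally consistent clock can be provided.
(3) Independent failures: any software or hardware component may fail on its own, for example one computer's power supply, a switch connecting some of the computers, or one computer's operating system crashing.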
|
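Purpose of distributed systems:
Resource sharing. (This is an abstract notion: on the hardware side, sharing disk storage; on the software side, sharing files or databases; also video sharing, and so on.)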
|
Challenges of distributed systems:
(1) Heterogeneity
Different networks, computer hardware, operating systems, programming languages, and implementations by different developers.
Middleware
The term middleware applies to a software layer that provides a
programming abstraction as well as masking the heterogeneity of the underlying
networks, hardware, operating systems and programming languages.
Heterogeneity and mobile code
The term mobile code is used to refer to program code
that can be transferred from one computer to another and run at the destination – Java
applets are an example. Virtual machines and JavaScript also fall into this category.
(2) Openness
The openness of distributed
systems is determined primarily by the degree to which new resource-sharing services
can be added and be made available for use by a variety of client programs.
Open systems are characterized by the fact that their key interfaces are published,
for example in RFCs and the HTTP specification, or as interfaces that developers define themselves.
Open distributed systems are based on the provision of a uniform communication
mechanism and published interfaces for access to shared resources.
Open distributed systems can be constructed from heterogeneous hardware and
software, possibly from different vendors.
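(3) Security
Security for information resources has three components:
confidentiality (protection against disclosure to unauthorized individuals),
integrity (protection against alteration or corruption),
Cryptography covers both of these: encryption keeps data confidential, and message authentication codes or digital signatures detect alteration. ssh and https are examples; both combine public/private-key authentication with symmetric encryption.
availability (protection against interference with the means to access the resources).
Denial of service attacks: controlling a large number of zombie machines to attack a website (Chapter 3)
Security of mobile code: malicious programs carried in e-mail attachments (Chapter 11)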
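As a small illustration of the integrity component, the sketch below uses a message authentication code from Python's standard `hmac` module to detect alteration. The key and the messages are made up for the example, and this is only one possible mechanism, not what ssh or https do in full.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


key = b"shared-secret"  # hypothetical key, for illustration only
tag = sign(key, b"transfer 10")

print(verify(key, b"transfer 10", tag))  # message unchanged
print(verify(key, b"transfer 99", tag))  # message altered in transit
```

Only a party holding the key can produce a tag that verifies, so a changed message is rejected. The message itself is still readable, which is why confidentiality needs encryption as well.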
(4) Scalability
A system is described as scalable if it will remain
effective when there is a significant increase in the number of resources and the number
of users.
The design of scalable distributed systems presents the following challenges:
a> Controlling the cost of physical resources
Resources should grow in linear proportion, O(n), to the number of users.
In general, for a system with n users to be scalable, the
quantity of physical resources required to support them should be at most O(n) – that
is, proportional to n. For example, if a single file server can support 20 users, then
two such servers should be able to support 40 users.
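The file-server example can be worked through as a short calculation. The figure of 20 users per server comes from the text; the function name and the larger user counts are assumptions made for illustration.

```python
import math

USERS_PER_SERVER = 20  # figure taken from the file-server example


def servers_needed(users: int, users_per_server: int = USERS_PER_SERVER) -> int:
    """Servers required if each one supports a fixed number of users."""
    return math.ceil(users / users_per_server)


for n in (20, 40, 400, 4000):
    print(n, "users ->", servers_needed(n), "servers")
```

The server count grows in step with the user count, which is the O(n) bound described above.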
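b> Controlling the performance loss
As the number of users grows, the performance loss should be kept to O(log n).
Name lookup was originally served from a single central source (the predecessor of DNS kept the whole name table in one master file); DNS later replaced this with a hierarchical tree structure.
The time taken to access
hierarchically structured data is O(log n), where n is the size of the set of data. For a
system to be scalable, the maximum performance loss should be no worse than this.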
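To see why a hierarchy keeps the loss near O(log n), the sketch below counts how many levels a tree needs to hold n entries. The fan-out of 10 is an arbitrary assumption and is not a property of DNS.

```python
def tree_depth(n_items: int, fanout: int) -> int:
    """Levels needed for a tree with the given fan-out to hold n_items leaves."""
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    depth, capacity = 0, 1
    while capacity < n_items:
        capacity *= fanout
        depth += 1
    return depth


for n in (1_000, 1_000_000, 1_000_000_000):
    print(f"{n:,} entries: flat scan up to {n:,} steps, tree depth {tree_depth(n, 10)}")
```

Multiplying the data set by a thousand adds only three levels to the lookup path, while a flat scan grows a thousandfold.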
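c> Preventing software resources running out
Example: IPv4 addresses being used up. That said, over-provisioning for predicted future growth can be worse than adapting incrementally.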
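d> Avoiding performance bottlenecks
(Global) Distributed algorithms need to be decentralized to avoid performance bottlenecks, as DNS is.
(Local) In addition, frequently accessed hot resources can be replicated and cached to improve performance under heavy concurrent use.
The issue of scale is a dominant
theme in the development of distributed systems.
(Chapter 18: replication; Chapters 2 and 12: caching)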
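The effect of caching a hot resource can be sketched with `functools.lru_cache` from the standard library. The `fetch_from_origin` function and the call counter are hypothetical stand-ins for a slow origin server.

```python
from functools import lru_cache

calls = {"origin": 0}  # counts how often the (pretend) origin server is hit


def fetch_from_origin(name: str) -> str:
    """Stand-in for an expensive request to the origin server."""
    calls["origin"] += 1
    return f"content of {name}"


@lru_cache(maxsize=128)
def fetch(name: str) -> str:
    """Serve repeated requests from the cache, going to the origin only on a miss."""
    return fetch_from_origin(name)


for _ in range(1000):
    fetch("hot-page")

print("origin requests:", calls["origin"])
print(fetch.cache_info())
```

A real cache also has to decide when an entry is stale, which is the consistency problem the replication and caching chapters deal with.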
(5) Failure handling
Process and network failures (Chapter 2)
Failures in a distributed system are partial – that is, some components fail while
others continue to function. Therefore the handling of failures is particularly
difficult.
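a> Detecting failures
The challenge is to manage in the presence of failures that
cannot be detected but may be suspected, for example a remote system crash, network congestion, or a busy OS.
b> Masking failures
For example, TCP message retransmission, writing a file to multiple disks as a means of recovery, IP failover, and active/standby switchover.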
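Retransmission masks a lost message by trying again until it gets through. The sketch below shows the idea at application level; `FlakyLink` is a hypothetical channel that drops its first few sends, and real TCP adds timers, sequence numbers and backoff that are omitted here.

```python
def call_with_retry(operation, attempts: int = 3):
    """Call operation(), retrying on ConnectionError up to `attempts` times."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return operation()
        except ConnectionError as err:
            last_error = err  # remember the failure and try again
    raise last_error


class FlakyLink:
    """Hypothetical channel that loses the first `drops` messages."""

    def __init__(self, drops: int):
        self.drops = drops
        self.sends = 0

    def send(self) -> str:
        self.sends += 1
        if self.sends <= self.drops:
            raise ConnectionError("message lost")
        return "ack"


link = FlakyLink(drops=2)
print(call_with_retry(link.send), "after", link.sends, "sends")
```

The caller sees a single successful call. The failure is only exposed when every attempt fails.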
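c> Tolerating failures
For example, failures can be tolerated on the client side (a web browser, reconnecting), while the server side tolerates failures through redundancy.
d> Recovery from failures
Recovering corrupted data such as files (using checksums); rolling back after a failed software upgrade.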
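A checksum lets a reader tell an intact copy from a corrupted one. The sketch below uses SHA-256 from `hashlib`; the two-copy `recover` helper and the sample data are assumptions for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Hex SHA-256 digest used as the file's checksum."""
    return hashlib.sha256(data).hexdigest()


def recover(primary: bytes, backup: bytes, expected: str) -> bytes:
    """Return the first copy whose checksum matches the recorded one."""
    for copy in (primary, backup):
        if checksum(copy) == expected:
            return copy
    raise ValueError("no intact copy available")


original = b"important file contents"
expected = checksum(original)            # recorded when the file was written
corrupted = b"important f1le contents"   # one byte damaged

print(recover(corrupted, original, expected) == original)
```

The checksum only detects the damage. The intact data still has to come from a second copy, which is where redundancy comes in.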
e> Redundancy
Replication and redundancy: multiple routers, DNS data replicated on multiple servers, databases replicated on multiple servers.
The design of effective techniques for keeping replicas of rapidly changing data up-to-date
without excessive loss of performance is a challenge. Approaches are
discussed in Chapter 18.
MTBF formula ?? (99.999%)
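The notes leave the formula open. One commonly used steady-state definition is availability = MTBF / (MTBF + MTTR), where MTTR (mean time to repair) is a term not in the notes. The sketch below applies that definition and converts an availability figure such as 99.999% into downtime per year. Treat it as an assumption about what was meant, not as the book's formula.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is up: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(avail: float) -> float:
    """Expected minutes of downtime in a 365-day year."""
    return (1.0 - avail) * MINUTES_PER_YEAR


print(availability(mtbf_hours=999, mttr_hours=1))
print(round(downtime_minutes_per_year(0.99999), 2), "minutes per year at 99.999%")
```

At "five nines" the yearly downtime budget is only a few minutes, which is why such targets usually rely on redundancy rather than on fast repair alone.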
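(6) Concurrency
Multiple users access resources at the same time. Synchronization (locks) is needed so that concurrent modifications of a shared resource, for example by multiple threads, remain correct.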
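A minimal sketch of lock-based synchronization, using `threading.Lock` from the standard library. The `Counter` class, the thread count and the iteration count are arbitrary choices for the example.

```python
import threading


class Counter:
    """Shared resource whose updates are serialized by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        with self._lock:  # only one thread at a time runs the update
            self.value += 1


counter = Counter()


def worker() -> None:
    for _ in range(10_000):
        counter.increment()


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)
```

The read-modify-write in `increment` is the critical section. Without the lock, updates from different threads could interleave and some increments could be lost.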
(7) Transparency
Transparency is defined as the concealment from the user and the application
programmer of the separation of components in a distributed system, so that the system
is perceived as a whole rather than as a collection of independent components.
In other words, the individual components of a distributed system are hidden from users and application programmers.
Access transparency enables local and remote resources to be accessed using
identical operations.
Local calls and remote calls use the same operations (RPC).
Location transparency enables resources to be accessed without knowledge of their
physical or network location (for example, which building or IP address).
Accessing a resource requires no knowledge of the physical network (no need to know the IP address or which machine holds it).
Concurrency transparency enables several processes to operate concurrently using
shared resources without interference between them.
Multiple accesses to a shared resource need no explicit coordination by the user.
Replication transparency enables multiple instances of resources to be used to
increase reliability and performance without knowledge of the replicas by users or
application programmers.
Users and application programmers do not need to know whether the resource instance they access is a replica.
Failure transparency enables the concealment of faults, allowing users and
application programs to complete their tasks despite the failure of hardware or
software components.
Faults are hidden so that users and applications can still complete their tasks when hardware or software fails.
Mobility transparency allows the movement of resources and clients within a system
without affecting the operation of users or programs.
Resources can be moved without affecting the operation of users or programs.
Performance transparency allows the system to be reconfigured to improve
performance as loads vary.
The system can be reconfigured to suit the current load.
Scaling transparency allows the system and applications to expand in scale without
change to the system structure or the application algorithms.
Capacity can be expanded without modifying the system structure or the application algorithms.
The two most important transparencies are access and location transparency; their
presence or absence most strongly affects the utilization of distributed resources. They
are sometimes referred to together as network transparency.
Example of access transparency: accessing files on a local disk and files on an SMB/NFS share.
Example of location transparency: using URLs to access a web server (several IP addresses can map to the same web address, as in DNS load balancing).
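Access transparency can be sketched as one interface with a local and a remote implementation behind it. All class and function names below are hypothetical, and the "remote" store only simulates a network hop in memory; a real NFS or SMB client would marshal the request and send it to a server.

```python
from abc import ABC, abstractmethod


class FileStore(ABC):
    """The single interface that application code programs against."""

    @abstractmethod
    def read(self, path: str) -> bytes:
        ...


class LocalStore(FileStore):
    """Files held locally (an in-memory dict stands in for a disk)."""

    def __init__(self, files: dict):
        self._files = dict(files)

    def read(self, path: str) -> bytes:
        return self._files[path]


class RemoteStore(FileStore):
    """Simulated remote file service with the same read() operation."""

    def __init__(self, server: LocalStore):
        self._server = server

    def read(self, path: str) -> bytes:
        request = {"op": "read", "path": path}  # would be sent over the network
        return self._server.read(request["path"])


def word_count(store: FileStore, path: str) -> int:
    """Application code: identical for local and remote stores."""
    return len(store.read(path).split())


files = {"/notes.txt": b"one two three"}
print(word_count(LocalStore(files), "/notes.txt"))
print(word_count(RemoteStore(LocalStore(files)), "/notes.txt"))
```

`word_count` never learns which kind of store it was given, which is the property the SMB/NFS example describes.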
(8) Quality of service
The main nonfunctional properties of systems that affect the quality of the
service experienced by clients and users are reliability, security and performance.
Adaptability to meet changing system configurations and resource availability has been
recognized as a further important aspect of service quality.
Performance:
The performance aspect of quality of service was originally defined in terms of
ability to meet timeliness guarantees,
for example real-time video playback.
QoS applies to operating
systems as well as networks. Each critical resource must be reserved by the applications
that require QoS, and there must be resource managers that provide guarantees.
Reservation requests that cannot be met are rejected. These issues will be addressed
further in Chapter 20.
Resources are reserved in advance to guarantee that they are available.
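The reserve-or-reject behaviour of a resource manager can be sketched as simple admission control. The `ResourceManager` class, the capacity of 100 units and the client names are assumptions for illustration and are not taken from the book.

```python
class ResourceManager:
    """Hypothetical manager for one critical resource, e.g. bandwidth units."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.reservations = {}  # client name -> reserved amount

    def available(self) -> int:
        return self.capacity - sum(self.reservations.values())

    def reserve(self, client: str, amount: int) -> bool:
        """Grant the reservation only if it can be fully guaranteed."""
        if amount <= 0 or client in self.reservations or amount > self.available():
            return False  # requests that cannot be met are rejected
        self.reservations[client] = amount
        return True

    def release(self, client: str) -> None:
        self.reservations.pop(client, None)


manager = ResourceManager(capacity=100)
print(manager.reserve("video-stream", 60))  # fits within capacity
print(manager.reserve("backup-job", 50))    # would exceed capacity
manager.release("video-stream")
print(manager.reserve("backup-job", 50))    # fits after the release
```

Rejecting a request outright, rather than granting a degraded share, is what lets the manager keep the guarantees it has already given.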
|
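Example of a distributed system (WWW)
The Web is an open system.
(1) Its operation is based on the HTTP communication protocol and the HTML standard (with many browser and web server implementations).
(2) It is open in the types of resource that can be shared and published, for example media files (plug-ins handle the different file types).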
The Web has moved beyond these simple data resources to encompass services,
such as electronic purchasing of goods. It has evolved without changing its basic
architecture. The Web is based on three main standard technological components:
a> HTML,
the HTML language for display.
JavaScript
!!link is king!! whether it points to a person or a resource
b> URLs,
scheme : scheme-specific-identifier
(ftp://172.24.12.11/delivery/xxx.rpm, http://www.sina.com.cn, https://www.alibaba.com)
c> HTTP
Request-reply interactions: GET/POST
Content types,
MIME -> text/html, image/GIF, application/zip
One resource per request (multiple resources can be requested concurrently)
Simple access control (access rights)
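The scheme : scheme-specific-identifier structure of the example URLs above can be seen by splitting them with `urllib.parse.urlsplit` from the standard library. The URLs are the ones listed in the notes; nothing is fetched over the network.

```python
from urllib.parse import urlsplit

urls = [
    "ftp://172.24.12.11/delivery/xxx.rpm",
    "http://www.sina.com.cn",
    "https://www.alibaba.com",
]

for url in urls:
    parts = urlsplit(url)
    # scheme selects the protocol; the rest identifies the host and the resource
    print(parts.scheme, parts.hostname, parts.path or "/")
```

An empty path is shown as "/" because a browser requests the root resource when the URL names only a host.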
The following kinds of application are built on these interfaces:
Dynamic pages: CGI programs generate the "html" content dynamically instead of serving an "html file" from the local file system.
Downloaded code: JavaScript & AJAX, ActiveX and applets
Web services: replace HTML with XML (REST design scheme)
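Problems with the Web
1. Deleting or moving a resource leaves broken links. (Information is found through search engines, and the results shown can confuse users; one attempted solution is the semantic web.)
2. It faces problems of scale. Chapter 2 introduces browser caches and proxy servers as ways to improve responsiveness.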
Reposted from: https://blog.51cto.com/usdaydayup/1384051