Distributed & Mobile Systems 1

An automatic teller machine network enables bank customers to withdraw cash from their bank account.
Banks and building societies maintain large networks of teller machines.
Customers have high security, privacy and reliability requirements.
Customers may want to withdraw cash from their account through a 'foreign' teller machine.
A front-end computer controls one or several tellers. It
transfers withdrawal requests to the computer of the account holder's bank,
waits for the bank to grant the request, and
therefore has to be interoperable with heterogeneous computer systems (Hang Seng Bank may have different account management systems from HongKong Bank and Bank of China).
Each bank has fault-tolerant systems to quickly recover from failures of its account-holding computers. An example is the 'hot standby' computer, which maintains a copy of the account database and can replace the main computer within seconds.

A Web browser is a user interface to the world's biggest distributed system, the Internet.
A Web page includes links to other Web pages. These links are specified as URLs.
A URL consists of the name of a protocol (ftp, http, etc.), the name of a site (gateway1.cse.cuhk.edu.hk), and the name of a file.
To follow a link to a remote Web page, your Web browser
talks to the local name server to resolve the symbolic site name into an IP address (137.189.89.153).
talks to the http daemon running on that web site and requests the delivery of the Web page addressed by the URL.
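A minimal sketch of these two steps in Python, using only the standard library; the site name is taken from the example above, while the path /index.html is a hypothetical placeholder:

import socket
import http.client

# Step 1: resolve the symbolic site name into an IP address, as the local
# name server (resolver) does on behalf of the browser.
host = "gateway1.cse.cuhk.edu.hk"        # site name from the example above
ip_address = socket.gethostbyname(host)  # e.g. "137.189.89.153"
print("Resolved", host, "to", ip_address)

# Step 2: talk to the http daemon on that site and request delivery of the
# Web page addressed by the URL (the path below is a made-up example).
connection = http.client.HTTPConnection(host, 80, timeout=5)
connection.request("GET", "/index.html")
response = connection.getresponse()
print(response.status, response.reason)
page = response.read()                   # the delivered Web page
connection.close()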
To obtain a file from a remote ftp site, your Web browser
resolves the site name with the local name server
talks to the ftp daemon running on that site and performs an anonymous login.
switches the daemon into an appropriate transfer mode and
obtains the file addressed by the URL.
To send an e-mail, your Web browser
opens a new dialog window where you can enter the addressee(s) and the e-mail text
talks to the local sendmail daemon to have it deliver the e-mail to the sendmail daemons on the sites of your addressees.

Why do we bother about constructing distributed systems? Constructing a centralized system appears to be much easier!
Some properties of a distributed system cannot be achieved by a centralized system. It is worthwhile to keep these properties in mind during the design or assessment of a distributed system.
Heterogeneity: I can access all the documents that are available on the Internet, even though the documents are located on different types of machines.
Openness: I have credit cards from Hang Seng Bank and Wells Fargo Bank in the U.S.A. and can use them at each other's tellers. These banks, however, would never develop a common centralized teller system. It is because their systems are open and interoperable that I have this flexibility.
Security: I want to purchase products via e-commerce. I don't want other people to steal my credit card number.
Scalability: Distributed systems, such as the Internet, grow each day to accommodate more users and to withstand higher load. (Hong Kong stock brokers are on-line, and you can open accounts and trade from a home PC.)
Failure Handling: Two (distributed) account databases are managed by the bank so that it can quickly recover from a breakdown.
Concurrency: Multiple database users can concurrently access and update data in a distributed database system. The database system preserves integrity against concurrent updates, and users perceive the database as their own copy. They are, however, able to see each other's changes after they have been completed.
Transparency: To its users, a distributed system appears as if it were centralized.

The Internet enables users to access services and run applications over a heterogeneous collection of computers and networks.
Heterogeneity applies to
Networks
Computer hardware
Operating systems
Programming languages
Implementations by different developers
Differences between the heterogeneous components of a distributed system have to be resolved:
Differences in data type representation regarding, for example, the byte ordering of integers (see the sketch after this list)
Different APIs of different operating systems for the implementation of the Internet protocols
Different programming languages use different representations for characters and for data structures such as arrays and records
Different programmers have to agree on common standards so that their components can communicate
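As a small illustration of the byte-ordering difference mentioned above, this Python sketch shows how the same 16-bit integer is laid out in big-endian and little-endian order; communication standards fix one order (network byte order is big-endian) so that heterogeneous machines interpret the bytes consistently:

import struct

value = 0x1234  # a 16-bit integer

# The same value packed in the two byte orders used by different machines.
big_endian = struct.pack(">H", value)     # b'\x12\x34' -- network byte order
little_endian = struct.pack("<H", value)  # b'\x34\x12' -- e.g. x86 byte order
print(big_endian.hex(), little_endian.hex())

# A receiver must unpack with the agreed order; otherwise it reconstructs
# a different number from exactly the same bytes.
wrong = struct.unpack("<H", big_endian)[0]  # 0x3412 -- misinterpretation
right = struct.unpack(">H", big_endian)[0]  # 0x1234 -- agreed convention
print(hex(wrong), hex(right))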
Middleware helps to solve the problems of heterogeneity. It also provides a uniform computational model for use by the programmers of servers and distributed applications.
Mobile code has to run on heterogeneous computers. The virtual machine approach provides a way of making code executable on any hardware: the compiler for a particular language generates code for a virtual machine instead of for a particular hardware instruction set.
Mobile apps, running on mobile operating systems (iOS, Android), are distributed to a vast number of smartphones.

Openness tries to address the following question: how difficult is it to extend and improve a system?
Most often functional extensions and improvements require new components to be added.
These components may have to use the services provided by existing components.
Hence, the static and dynamic properties of services provided by components have to be published in detailed interfaces.
The new components have to be integrated into existing components, so that the added functionality becomes accessible from the distributed system as a whole.
Components may not always be running on the same platforms. Hang Seng Bank, HongKong Bank, and Bank of China almost certainly do not have the same type of hosts; it is quite likely they use different programming languages and have different networks. Still, their automatic teller machines have to be integrated.
To achieve such a heterogeneous integration, different data representation formats often have to be reconciled. If components running on a Windows 3.x PC have to be integrated with components running on a Sun SPARCstation, integers on the PC are only 16 bits wide, while those on the Sun are wider. Machines also differ in the order in which the bytes of multi-byte numbers are stored (big-endian versus little-endian).

Distributed systems have many security needs, for example data secrecy and personal privacy.
Security is also required for authentication purposes.
Many security challenges exist in modern distributed systems; denial-of-service attacks and mobile-code security (e.g., of mobile apps) are two examples.

Centralized systems often create bottlenecks as soon as a certain number of users is reached.
Distributed systems can be built in a way that these bottlenecks are avoided.
Then new processors can be added to accommodate new users. The Internet grows every day by adding new sites.
Other internet sites are not affected by these additions. They do not have to be changed.
However, components in distributed systems have to be designed in a way that the overall system remains scalable.
Sometimes it is necessary to relocate components, i.e., to migrate them to new processors. Relocation is required to populate new processors with components and to remove some of the load from existing processors.
It is then important that few or no assumptions are made about the location of components, both within the component itself and within other components that use it. Otherwise, components that contain explicit location information have to be changed whenever a component is relocated.

Hardware, software and networks are not free of failures. They fail because of software errors, failures in the supporting infrastructure (power supply or air-conditioning), misuse by their users, or simply because of aging hardware. The average lifetime of a hard disk is between two and five years, much less than the average lifetime of a distributed system.
Given that there are many processors in a distributed system, it is much more likely that one of them fails than it is that a centralized system fails.
Distributed systems, therefore, have to be built in a way that they continue to operate even in the presence of failures of some of their components.
A distributed system can even achieve higher reliability than a centralized system if distribution and replication are exploited properly.
Two different means have to be deployed to achieve fault tolerance: recovery and redundancy.
Components that are able to recover from failures are built so that they react in a controlled way when components whose services they rely on have failed.
Redundant hardware, software and data decreases the time that is needed after a failure to bring a system up again.
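A minimal client-side sketch of combining redundancy with controlled recovery, assuming a hypothetical balance-query service offered by a primary host and a hot-standby replica; the host names and the line protocol are made up for illustration:

import socket

# Hypothetical service endpoints: a primary server and a hot-standby replica.
PRIMARY = ("primary.bank.example", 9000)
STANDBY = ("standby.bank.example", 9000)

def fetch_balance(endpoint, account_id, timeout=2.0):
    """Ask one server for an account balance (hypothetical line protocol)."""
    with socket.create_connection(endpoint, timeout=timeout) as sock:
        sock.sendall(f"BALANCE {account_id}\n".encode())
        return float(sock.makefile().readline())

def fetch_balance_fault_tolerant(account_id):
    """React in a controlled way if the primary has failed: try the standby."""
    for endpoint in (PRIMARY, STANDBY):
        try:
            return fetch_balance(endpoint, account_id)
        except OSError as error:       # connection refused, timeout, ...
            print("endpoint", endpoint, "failed:", error)
    raise RuntimeError("both primary and standby are unavailable")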

Components in distributed systems are executed concurrently. There may be many different people at different teller machines. Likewise, there are many different users working in a local area network.
When these components access shared resources, the resources have to be protected against integrity violations that may be introduced through concurrency.
As an example of a lost update, consider that you withdraw 50 dollars. This requires the bank's account database to compute:
debitbalance = balance - 50;   /* Op1 */
balance = debitbalance;        /* Op2 */
If a clerk in the bank credits a check of 100 dollars, the following computation has to be done:
creditbalance = balance + 100; /* Op3 */
balance = creditbalance;       /* Op4 */
If these two modifications to your account are done concurrently the integrity of the account data may be violated in two ways:

  1. Your debit may not be recorded (bad luck for the bank) if the schedule is (Op1, Op3, Op2, Op4).
  2. The credit of your check may not be recorded (bad luck for you) if the schedule is (Op3, Op1, Op4, Op2).
These situations must be avoided by all means.
Concurrency control facilities (such as locking) are needed in almost any concurrent system; a minimal locking sketch follows below.
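The following is a minimal sketch of such a facility in Python: a single lock makes each of the two transactions (Op1/Op2 and Op3/Op4) atomic, so neither the debit nor the credit can be lost, whatever the schedule:

import threading

balance = 1000
balance_lock = threading.Lock()

def withdraw(amount):
    global balance
    with balance_lock:                 # Op1 and Op2 run as one atomic unit
        debitbalance = balance - amount
        balance = debitbalance

def credit_check(amount):
    global balance
    with balance_lock:                 # Op3 and Op4 run as one atomic unit
        creditbalance = balance + amount
        balance = creditbalance

# Run the withdrawal and the credit concurrently; with the lock the final
# balance is always 1000 - 50 + 100 = 1050, regardless of the schedule.
threads = [threading.Thread(target=withdraw, args=(50,)),
           threading.Thread(target=credit_check, args=(100,))]
for t in threads: t.start()
for t in threads: t.join()
print(balance)

A real database system uses more elaborate concurrency control than one global lock, but the principle is the same: conflicting updates are serialized.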

The complexity of distributed systems should be hidden from their users. They should not have to be aware whether the system they are using is centralized or distributed. Thus, it is transparent for the user that s/he is using a distributed system.
For the administrators of the system, however, this is not true. For them, it may well be important (e.g. during load balancing) to know which component resides on which machine.
To make life easier for an application programmer, s/he should also not have to be aware that s/he is using distributed components.
You have certainly developed a program on a CSE machine where you had to use the file system. You were able to use the same library for file access regardless of whether the files were stored on local or remote file systems. Most likely, however, you were storing files on remote disks, and you may not even have been aware of that. Thus, distribution was both access and location transparent for you as an application programmer.
There are many aspects of transparency, which we will discuss now. Transparency is, in fact, orthogonal to the other characteristics that we have discussed so far and applies to most of them. We will, therefore, have a closer look now at access transparency, location transparency, concurrency transparency, replication transparency, failure transparency, mobility transparency, performance transparency and scaling transparency.

Access transparency means that the operations or commands used for accessing objects are identical regardless of whether local or remote data is being accessed.
Examples of access transparency are:
Users of UNIX NFS can use the same commands for copying, moving and deleting files regardless of whether the accessed files are local or remote.
Application programmers can use the same library calls to manipulate files on NFS.
Users of a web browser can navigate to another page by clicking on a hyperlink, regardless of whether the hyperlink leads to a local or a remote page.
Programmers of a database application can use the same SQL commands regardless of whether they are accessing a local or a remote database in a distributed relational database management system.

Location transparency means that data can be accessed without knowing the physical position of the data.
Examples
Users of the network file system can access files by name and do not need to know whether the file resides on a local or a remote disk.
The same applies to application programmers, who can pass file names to library functions and need not worry about the physical location of the files.
Users of a Web browser need not be aware of where a page physically resides. They can initially access pages by a URL and can then navigate further via URLs that are embedded in the web page.
Programmers of a relational database application do not need to worry about where the tables physically reside. They can access tables by table name, and their local database monitor will translate the names into physical locations and have remote monitors transfer tables if a remote table is accessed.

(Concurrency transparency) Enables several processes to concurrently access and update shared information without having to be aware that other processes may be accessing the information at the same time.
Examples:
Multiple users can access and update files on the same file system and they do not know about each other.
Concurrency is, however, not transparent for an application programmer using the file system. To avoid lost updates and inconsistent analysis, s/he has to explicitly lock files.
Users of an automatic teller machine need not be aware of the fact that other customers are using tellers at the same time and that bank clerks may be concurrently manipulating account balances.
Programmers of relational database applications typically need not worry about concurrency, because integrity against concurrent updates is preserved by the database management system (e.g., by two-phase locking; a minimal sketch of the discipline follows below).
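A minimal sketch of the two-phase locking discipline: a transaction acquires every lock it needs before doing its work (growing phase) and releases them only once it has finished (shrinking phase). The lock manager below is a simplified illustration, not the implementation of any particular database system:

import threading

# One lock per data item; a real DBMS manages such locks internally.
item_locks = {"balance": threading.Lock(), "history": threading.Lock()}

def run_transaction(items, operations):
    """Two-phase locking: acquire all needed locks, do the work, then release."""
    acquired = []
    for name in sorted(items):   # growing phase (fixed order avoids deadlock)
        item_locks[name].acquire()
        acquired.append(name)
    try:
        operations()             # all reads and updates happen under the locks
    finally:
        for name in acquired:    # shrinking phase: release only at the end
            item_locks[name].release()

database = {"balance": 1000, "history": []}

def debit_50():
    database["balance"] -= 50
    database["history"].append("debit 50")

run_transaction({"balance", "history"}, debit_50)
print(database)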

Replication is the duplication of data on other hosts
Replication is used to increase the reliability of data accesses as well as the performance with which data is accessed and updated.
Replication transparency denotes the fact that neither users nor application programmers have to be aware about the replication of data.
Examples:
Tables in a distributed relational database may be replicated. However, neither users nor application programmers are aware that the tables are replicated and that updates have to be propagated to the other replicas as well.
Web pages are often replicated to increase the performance of their retrieval and to keep them available in the presence of network failures. The SuperJanet gateway, for instance, replicates pages from the US that are frequently accessed. Replication, however, is transparent to both Web surfers and Web page designers. A Web surfer does not see that the page is not being brought over the Atlantic (s/he may be surprised by the speed, however). A Web page designer can still refer to the US URL and need not take the mirror site into account.

Even though components in distributed systems may fail, it is important that users of the systems are not aware of these failures. Failure Transparency denotes this concealment of failures.
Failure transparency is rather difficult to achieve. It involves complete fault recovery.
As an example, consider the distributed database again. If the database has kept local replicas of remote data, users can continue to use the database even though the remote data monitors have failed. The local data monitor has to detect the failure of the remote monitors. Updates of local data then have to be buffered in the local replica, and inconsistencies have to be tolerated temporarily (as multiple sites may temporarily buffer updates). After the remote monitor has come up again, the buffered updates have to be incorporated into the remote databases and inconsistencies (if any) have to be reconciled.
Another example is cloud computing, such as Amazon Web Services (AWS), where millions of servers serve a massive number of users. Faults and failures are inevitable, but customers should not notice them.
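A small sketch of the buffering idea from the database example, assuming a hypothetical remote data monitor that offers a put operation: the wrapper keeps a local replica, conceals a remote failure from its user by buffering updates, and replays them once the remote side has recovered (conflict reconciliation is omitted):

class BufferedStore:
    """Keep a local replica and buffer updates while the remote store is down."""

    def __init__(self, remote):
        self.remote = remote        # hypothetical remote data monitor (has .put)
        self.local_replica = {}     # local copies of remote data
        self.buffered = []          # updates made while the remote was down

    def put(self, key, value):
        self.local_replica[key] = value         # users keep working locally
        try:
            self.remote.put(key, value)         # normal case: update remote too
        except ConnectionError:
            self.buffered.append((key, value))  # failure concealed from the user

    def remote_recovered(self):
        """Replay buffered updates after the remote monitor comes up again."""
        while self.buffered:
            key, value = self.buffered.pop(0)
            self.remote.put(key, value)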

(Mobility transparency) Migration denotes the fact that software and/or data is moved to other processors.
Migration is transparent to users and application programmers if they do not have to be aware of the fact that software and/or data has moved.
Migration transparency is dependent on location transparency.
Examples:
If CSE decides to move file systems of the NFS (or parts thereof) to a different disk, you will not notice it.
If CSE moves the department Web site to a different location in the file system, you will not notice it, because the URL http://www.cse.cuhk.edu.hk will still be interpreted by the local http daemon.

Performance transparency denotes the fact that users and application programmers are not aware of how the performance of a distributed system is actually achieved.
Example: distributed configurations such as distributed ‘make’.
Example: Hadoop, which implements MapReduce (a minimal sketch of the model follows below).
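For illustration, here is a minimal single-process sketch of the MapReduce programming model applied to word counting; it is not Hadoop's actual API, and the point of performance transparency is precisely that Hadoop spreads the map, shuffle and reduce phases over many machines without the programmer having to deal with that distribution:

from collections import defaultdict

documents = ["the quick brown fox", "the lazy dog", "the quick dog"]

def map_phase(document):
    # Emit a (word, 1) pair for every word in the document.
    for word in document.split():
        yield word, 1

def reduce_phase(word, counts):
    # Aggregate all counts emitted for one word.
    return word, sum(counts)

# Shuffle: group the intermediate pairs by key, as the framework would do.
groups = defaultdict(list)
for document in documents:
    for word, count in map_phase(document):
        groups[word].append(count)

result = dict(reduce_phase(word, counts) for word, counts in groups.items())
print(result)   # {'the': 3, 'quick': 2, 'brown': 1, ...}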

Scalability denotes the fact that the distributed system can be adjusted to accommodate a growing load / number of users.
Scaling the distributed system up is transparent if users and application programs do not have to change.
Examples:
New Web sites can be added to the Internet, thus scaling the Internet up without existing sites having to change their set-up.
New network connections can be added in the Internet or existing connections are replaced with higher bandwidth connections to improve throughput. Existing Web sites do not have to be changed to benefit from this improvement.
In a distributed database, new hosts can be added to accommodate parts of the database. The allocation tables maintained by database monitors will have to be adjusted. Existing database schemas and applications, however, need not be changed.

Name: names that can be interpreted by users or by programs
Identifier: names that can be interpreted or used only by programs. At each name translation step, a name or identifier is mapped to a lower-level identifier that can be used to specify a resource when communicating with some software component, until a communication id is produced that is acceptable to the communication subsystem, and that is used to transmit a request to a resource manager.
Names may have a hierarchical structure,
representing an internal hierarchical name space (/etc/passwd) or
an organizational hierarchy (cse.cuhk.edu.hk),
or they may form a flat set of numeric or symbolic identifiers.
Advantages of hierarchical names: each part of a name is resolved relative to a separate context, and the same name may be used with different meanings in different contexts.
Names are always resolved relative to some context. Contexts are represented by name tables or databases; in the case of file systems, each directory represents a context. To resolve a name, we must supply both the context and the name. A name service accepts requests for the translation of names or identifiers in one name space to identifiers in some other name space. It also handles name registration and deletion, and provides up-to-date information.
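A small sketch of resolving a hierarchical name relative to nested contexts, with each context represented by a name table (here a plain dictionary), just as each directory of a file system is a context; the name space and the identifiers below are made up for illustration:

# Each context maps one name component either to a nested context
# (another name table) or to a lower-level identifier.
root_context = {
    "etc": {"passwd": "inode-4711", "hosts": "inode-4712"},      # made-up ids
    "home": {"alice": {"report.txt": "inode-9001"}},
}

def resolve(name, context):
    """Resolve each part of the name relative to the context reached so far."""
    for component in name.strip("/").split("/"):
        context = context[component]    # a KeyError means "name not found"
    return context

print(resolve("/etc/passwd", root_context))             # -> inode-4711
print(resolve("/home/alice/report.txt", root_context))  # -> inode-9001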
Naming schemes can be designed to protect resources from unauthorized access. Each identifier is chosen so that it is hard to reproduce, and the client's authority is checked by the naming service. Identifiers that meet this requirement are known as capabilities.

Synchronization prevents a sending or receiving process from continuing until the other process performs an action that frees it.
Each message-passing action involves the transmission by the sending process of a set of data values (a message) through a specified communication mechanism (a channel or port) and the acceptance by the receiving process of a message.
Synchronous (blocking) means that the sender waits after transmitting a message until the receiver has performed a receive operation.
Asynchronous (non-blocking) means that the message is placed in a queue of messages waiting for the receiver to accept them and the sending process can proceed immediately.
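A minimal sketch of both variants using a Python queue as the channel: the asynchronous send merely enqueues the message and returns, while the synchronous send additionally blocks until the receiver has accepted the message. The acknowledgement event is an illustration device, not part of any particular message-passing API:

import queue
import threading

channel = queue.Queue()            # the communication channel (message queue)

def send_asynchronous(message):
    channel.put(message)           # non-blocking: the sender proceeds at once

def send_synchronous(message):
    accepted = threading.Event()
    channel.put((message, accepted))
    accepted.wait()                # blocking: wait until the receiver accepts

def receiver():
    while True:
        item = channel.get()       # blocks until a message is available
        if isinstance(item, tuple):            # synchronous: message + event
            message, accepted = item
            print("received:", message)
            accepted.set()                     # acknowledge the acceptance
        else:
            print("received:", item)

threading.Thread(target=receiver, daemon=True).start()
send_asynchronous("hello")         # returns immediately
send_synchronous("world")          # returns only after the receiver took it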
Distributed systems can be designed entirely in terms of message passing, but certain communication patterns (collections of primitives for higher-level operations) are particularly useful.
Client-server communication model is for service provision: 1. Transmission of a request from a client to a server; 2. Execution of the request by the server; 3. Transmission of a reply to the client.
Function shipping: the server acts as an execution environment and interpreter for programs, and clients transmit sequences of instructions for interpretation (e.g., PostScript files sent to printer).
Multicasting: sending a message to the members of a specified group of processes. Multicasting examples: locating an object, fault tolerance, and multiple update.
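A minimal sketch of the three-step client-server pattern described above, using a TCP socket on the local machine; the "service" simply upper-cases the request text, standing in for real request execution:

import socket
import threading

listener = socket.create_server(("127.0.0.1", 0))   # port 0: pick a free port
address = listener.getsockname()

def server():
    connection, _ = listener.accept()
    with connection:
        request = connection.recv(1024)     # request arrives at the server
        reply = request.upper()             # 2. the server executes the request
        connection.sendall(reply)           # 3. the reply is transmitted back

threading.Thread(target=server, daemon=True).start()

# 1. the client transmits a request to the server and waits for the reply
with socket.create_connection(address) as client:
    client.sendall(b"what is my balance?")
    print(client.recv(1024))                # b'WHAT IS MY BALANCE?'
listener.close()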

(Centralized software structure) Middleware provides run-time support for programming languages, such as interpreters and libraries.
The OS is the main system software that manages basic resources and provides user and application services.
Basic resource management:

  • memory allocation and protection
  • process creation and processor scheduling
  • peripheral device handling

User and application services:

  • user authentication and access control (e.g., login facilities)
  • file management and file access facilities
  • clock facilities
(Software structure in a distributed system) The kernel only performs basic resource management, with the addition of inter-process communication.
A new class of software components, called open services, provides all other shared resources and services.
Kernels are not designed to be modified routinely. Any services that do not require privileged access to the kernel's code and data or to the hardware of the computer need not be included in the kernel.
In the layered software structure, a shared horizontal line signifies that the services provided by the box below the line are directly used by the components above it; e.g., application programs may use OS kernel services, distributed programming support and open services.
Micro-kernels are kernels that provide the smallest possible set of services and resources on which the remaining required services can be built. Typically this basic set includes a process abstraction and a basic communication service.
Open services: the distinction between kernel services and open services is made because kernels cannot be open, as they must enforce their own protection against run-time modification. Openness means that distributed systems can be configured to the particular needs of a given community of users or set of applications.
Distributed programming support includes run-time support for language facilities that allow programs written in conventional languages to work together.
