CPT101 - Xiaohao's Notes

小豪GO!

Last modified 2023-01-11 00:29:32


First published 2022-10-08 00:01:51

Article link: https://blog.csdn.net/qq_62123793/article/details/127201335


1. Overview

1.1 Overview and history of computer architecture

Types of Computers

Mainframe computers (1960s)
Supercomputers (1970s)
Workstations (1980s)
Microcomputers (1980s)
Personal computers (1980s)
Microcontrollers (1980s)
Servers (1980s)
Chip computer (?)

Computer Generations

First generation (1944 to 1958): vacuum tube
Second generation (1959 to 1963): transistor
Third generation (1964 to 1970): IC (integrated circuit)
Fourth generation (1971 to now): VLSI (very large-scale integration)

1.2 Computer Systems

Computer Hardware

5 categories
- input
- processing
- output
- storage
- communications

Computer Software

System software
Applications software

Backward (Downward) Compatibility for new hardware

Most software written for computers with old hardware can be run on computers with newer hardware

VHDL (Very High Speed Integrated Circuit Hardware Description Language)

A programming language to be used to specify both the structure and function of hardware circuits.
Supports computer simulations as well as providing input to automatic layout packages which arranges the final circuits.

Hierarchy of Systems

Operating System

Functionalities of hardware systems can be brought out by operating systems and thus offered to the user.
The user’s programs interact with hardware systems through the functionalities provided by operating systems.

Hardware evolution: Moore’s Law

A circuit designed 24 months ago can now be shrunk to fit into an area of half the size.
- It is sometimes quoted as every 18 months.

2. Input-Process-Output Model

2.1 Input-Process-Output

There are three components required for the implementation of Input-Process-Output and von Neumann model(s):

Hardware.
Software.
Data that is being manipulated.

Hardware

Central Processing Unit (CPU) is an active part which performs calculations and other operations.
The main memory (primary storage or working storage), or RAM (for random access memory) holds data and programs for access by CPU.
Memory is volatile…
The secondary storage.
- Long-term storage.
- Holds programs and data.
- Hard disk, CDs, DVD, etc.
Input devices: keyboard, mouse, scanner, etc.
Output devices: monitor, speaker, printer, etc.

Software

The hardware of a computer (e.g. CPU) can carry out only very simple operations like adding numbers (very quickly).
To make it perform useful tasks, these simple steps are combined in the form of programs, which are collectively known as software.

Machine instructions

The CPU performs the execution of machine instructions.
Every CPU has its own instruction set (typically 100-200 instructions).
- For a particular machine, this set is fixed.
Although the instruction sets of different CPUs are similar, there is no standard instruction set.

Machine Instruction Categories

Input-output: IN, OUT (Intel x86 and Pentium, but does not exist in some CPUs), …
Data transfer and manipulation: MOV, ADD, MUL, AND, OR, …
Transfer of program control: JMP, JC, …
Machine control: can halt processing, reset the hardware, INT, HLT …

Machine Instructions and HLL

High Level Programming Languages (HLLs) are more suitable for programming than the languages of machine instructions.
The programs in HLL still have to be translated to the machine codes.

2.2 The von Neumann Model

The idea was formulated by von Neumann (late 1940s).
- The computer is a general-purpose machine controlled by an executable program.
In this context:
- A program is a list of instructions used to direct a task.
- Both program and data are held in computer’s memory (store) and both represented by binary codes.
- The fact that memory is re-writeable makes a von Neumann machine especially powerful.
- A processor is an active part of the machine that executes the program instructions.

Input device is for transmitting information from a user into the computer’s memory.
Output device enables a user to see results of the program being performed.
Von Neumann bottleneck.
- CPU is continuously forced to wait for vital data (and instructions) to be transferred to or from memory.

Harvard architecture

Separates data from programs.
Requires different memories and access buses for programs and data.
The intention is to increase transfer rates, improving throughput.

3. Machine instructions and HLL

Semantic gap

The term expresses the enormous difference between the way human languages express ideas and actions and the way computer instructions represent data-processing activities.

3.1 Translation

Translation is done by special programs such as:
- Compilers, translating HLL instructions into machine code (sequence of instructions) before the code can be run on the machine.
- Assemblers, translating mnemonic form of machine instructions (like MOV, ADD, etc) into their binary codes.
- Interpreters, translating HLL instructions into machine code on-the-fly (while the program is running).

compilers and assemblers

Linking

Big programs usually are divided into several separate parts or modules.
Each module has to be designed, coded and compiled.
There are frequent occasions when code in one module needs to reference data or subroutines in another module.
A compiler can translate a module into binary codes, but it cannot resolve those references to other modules.
Those external references remain symbolic after the compilation, until the linker gets to work.
- The linker is to join together all the binary parts.
- The linker will report errors if it cannot find the module or code referred to by those external references.
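
The linker's job of resolving symbolic references can be sketched roughly as follows. This is only an illustration under simplifying assumptions: each compiled module is modelled as a dictionary listing the symbols it defines and the symbols it references, and the module names, symbol names and the function `link` are all hypothetical.

```python
# Toy model of linking: match every external reference to a definition.
# Real linkers also relocate code and merge sections; that is omitted here.

def link(modules):
    """Return a table mapping each symbol to the module that defines it.

    Raises LookupError if some external reference cannot be resolved.
    """
    symbol_table = {}
    # Pass 1: collect every symbol that some module defines.
    for name, module in modules.items():
        for symbol in module["defines"]:
            symbol_table[symbol] = name
    # Pass 2: check that every reference points at a known symbol.
    unresolved = [
        (name, symbol)
        for name, module in modules.items()
        for symbol in module["references"]
        if symbol not in symbol_table
    ]
    if unresolved:
        raise LookupError(f"unresolved external references: {unresolved}")
    return symbol_table


# Hypothetical modules: main.o calls area(), which lives in geometry.o.
modules = {
    "main.o": {"defines": ["main"], "references": ["area"]},
    "geometry.o": {"defines": ["area"], "references": []},
}
table = link(modules)
print(table)
```

If `geometry.o` were left out of the dictionary, `link` would raise an error naming the missing symbol, which mirrors the error a linker reports for an unresolved external reference.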

Library files

Translated object code.
Provide many functions for programmers, but are only usable if linked into your code.
In Unix:
- Directories /lib and /usr/lib/.
In Windows:
- DLL files.

3.2 Interpreters

An alternative way of running HLL programs.

Instructions are converted into an intermediate form, consisting of tokens.
- In Java, tokens such as: static, boolean, file, string, void, return
Tokens are then passed to the decoder, which selects appropriate routines for execution.
Compilers.
- Take a program and translate it as a whole into machine code.
- The processes of translation and execution are separate.
Interpreters.
- Take an instruction, one at a time, translate and execute it.
- The processes of translation and execution are interlaced.
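
The interlacing of translation and execution can be illustrated with a toy interpreter. This is a minimal sketch, not how any real interpreter is built: the three-instruction stack language (PUSH, ADD, MUL) and the function `interpret` are invented for the example.

```python
# Toy interpreter: each line is tokenized, decoded and executed in turn,
# so translation and execution are interlaced rather than separate.

def interpret(source):
    """Run a tiny stack-based program and return the final stack."""
    stack = []
    # Decoder table: maps a token to the routine that carries it out.
    routines = {
        "ADD": lambda a, b: a + b,
        "MUL": lambda a, b: a * b,
    }
    for line in source.splitlines():
        tokens = line.split()              # translation: text -> tokens
        if not tokens:
            continue                       # skip blank lines
        op = tokens[0]
        if op == "PUSH":
            stack.append(int(tokens[1]))
        elif op in routines:
            b = stack.pop()
            a = stack.pop()
            stack.append(routines[op](a, b))   # execution of this one line
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


program = """
PUSH 2
PUSH 3
ADD
PUSH 4
MUL
"""
result = interpret(program)
print(result)  # [20], i.e. (2 + 3) * 4
```

A compiler would instead translate the whole program to machine code first and only then run the result.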

C program compilation, linking & execution

C language source code --> compiler (program) --> assembly language --> assembler --> machine code
Once we have machine code:
machine code --> linking and loading (program) --> program code execution (program)

Java

Java source code --> compiler (program) --> Java “byte codes” --> Java interpreter (program)

Interpreters vs. Compilers

Execution of compiled code is much faster than execution of interpreted code.
Interpreters are more suitable for rapid prototyping and for other situations when a program is frequently modified.
- Interpreters are more accurate in terms of error reporting.
- Interpretation can provide uniform execution environment across several diverse computers. (Portable)

Interpreters as Virtual Machines

Interpreters are somewhat similar to the computer hardware (CPU)
- take one instruction at a time and execute it.
Because of that sometimes they are referred to as a virtual machine
- Example: JVM, Java Virtual Machine

3.3 Code sharing and reuse

How to reuse existing proven software when developing new systems?
- Source-level subroutines and macro libraries.
- Pre-translated re-locatable binary libraries.
- Dynamic libraries and dynamic linking.

How code can be shared

Source-level subroutines and macro libraries

Intention.
- Take copies of the library routines.
- Edit them into your new code.
- Translate the whole together.
Disadvantages.
- Who owned the code?
- Who should maintain it?

Pre-translated relocatable binary libraries

Intention.
- Libraries are pre-translated into relocatable binary code.
- Can be linked into your new code, but not altered.
Acceptance.
- Successful, and still essential for all software development undertaken today.
Disadvantage.
- Each program is to have a private copy of the subroutines, wasting valuable memory space, and swapping time, in a multitasking system.

Dynamic libraries and dynamic linking

Intention.
- Load a program which uses “public” routines already loaded into memory.
- The memory-resident libraries are mapped, through the memory management system to control access and avoid multiple code copies.
Acceptance.
- Successful through Microsoft’s ActiveX standard.

4. Data, Information and Knowledge

Data – raw facts, figures, measurements, …
- 1.00001, 1.00000010, 2.0000101, 3.0000102, 5.000…
Information
- data organized into useful representation
- 1.000, 1.000, 2.000, 3.000, 5.000, …
Knowledge
- application of reasoned analysis of information
- ‘data are in increasing order’, ‘data can be derived based on Fibonacci sequencing’, etc

4.1 Alphanumeric codes

The majority of the data originally comes in the form of letters in alphabet, numbers and punctuation (alphanumeric data).
- They are represented in computers by binary numbers.

bit

A bit is the most basic unit of information possible: it contains the information necessary to distinguish two alternatives (1 or 0, YES or NOT, etc.).

ASCII code (American Standard Code for Information Interchange): a 7-bit code, with 8-bit extensions (well-established).
EBCDIC code (Extended Binary Coded Decimal Interchange Code): an 8-bit code (IBM mainframe computers).
Unicode: a more recent 16-bit standard (up to 2^16 = 65,536 characters can be encoded).

ASCII code table

Only half of the possible byte (8-bit) patterns are used.
The table is divided into two classes of codes:
- Printing characters.
- Control characters.
Printing characters produce output on the screen or on a printer.
Control characters are used:
- To control the position of output on the screen or paper (e.g. ‘HT’).
- To cause some action to occur (e.g. ‘BEL’).
- To communicate status between the computer and an I/O device (e.g. ‘Control-C’ combination).
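
The split between printing and control characters can be checked directly from the code values. The sketch below assumes the usual 7-bit ASCII layout (codes 0-31 and 127 are control characters); the function name `classify` is made up for the example.

```python
# Classify characters by their 7-bit ASCII code value.

def classify(ch):
    """Return 'control', 'printing' or 'not ASCII' for a single character."""
    code = ord(ch)
    if code > 127:
        return "not ASCII"        # outside the 7-bit table
    if code < 32 or code == 127:
        return "control"          # e.g. HT (9), BEL (7), DEL (127)
    return "printing"


for ch in ["A", "a", "0", "\t", "\a"]:
    # Show the character, its decimal code and its 7-bit pattern.
    print(repr(ch), ord(ch), format(ord(ch), "07b"), classify(ch))
```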

Limitations of ASCII code

The limitations of the well-established 8 bit ASCII codes.
- Too limited for the display requirements of modern Windows-based word-processors.
- The requirement of global software market for handling international character sets.

Unicode

Even the 8-bit extensions of the ASCII code table can encode only up to 256 characters.
The Unicode Standard (1991) is a 16-bit international encoding system for information interchange.
- Code values are available for more than 65,000 characters.

Representation of numbers

Two’s complement is a method of representing and manipulating negative integers.

5 = 00000101
-5 = 11111011
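
The two bit patterns above can be reproduced with a short sketch. It assumes an 8-bit word by default, and the helper names `to_twos_complement` and `from_twos_complement` are invented for illustration.

```python
# Two's complement encoding and decoding for a fixed word size.

def to_twos_complement(value, bits=8):
    """Return the two's complement bit pattern of value as a string."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking with 2**bits - 1 wraps negative values into the upper half.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def from_twos_complement(pattern):
    """Decode a two's complement bit string back to an integer."""
    value = int(pattern, 2)
    if pattern[0] == "1":              # sign bit set: value is negative
        value -= 1 << len(pattern)
    return value


print(to_twos_complement(5))               # 00000101
print(to_twos_complement(-5))              # 11111011
print(from_twos_complement("11111011"))    # -5
```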

Representation of real numbers

IEEE 754 standard.
- The most widely used standard for floating-point computation.
- Defines formats for representing floating-point numbers, special values, and a set of floating-point operations that operate on these values.

Declaration of variables in programs

What happens when you declare variables in a program?
- You are telling the compiler to reserve the correct amount of memory space to hold the variable.
- You are also telling the compiler which encoding/decoding/representation scheme to use.

5. Operating systems

5.1 Operating systems: examples

OS/360 for IBM System/360, 1960s.
Unix, 1970s.
MS-DOS for IBM PC and Mac OS for Apple Macintosh, 1980s.
Windows 95, 98, NT, 1990s.
- NT served as the basis for Microsoft’s desktop operating system line starting in 2001.
Apple rebuilt their operating systems on top of a Unix core as Mac OS X, released in 2001.
Linux, BSD Unix…

5.2 Onion ring model

Core of operating system: dealing directly with the hardware.
- Kernel: device drivers, memory allocator…
- CLI (command-line interface): provides user access to the system

5.3 Interaction with operating system

CLI (command line interpreter):
- DOS: type a command at the command line.
- Unix/Linux: shell scripts (sequences of instructions).
GUI (graphical user interface):
- Windows/Mac OS X: click with the mouse on icons.

Computer Networks

Perhaps the most far-reaching changes ever produced to von Neumann’s original blueprint.
Operating system usually provides access to network facilities. (via networking API, e.g. socket interface)
Computer network is an interconnected collection of autonomous computers to facilitate fast information exchange.

5.4 Client-server computing

Client: The originator of a request.
Server: The supplier of the service.

Client-server interaction

Client starts the interaction by sending a request message to the server.
Server responds by sending replies back…
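
The request/reply exchange can be sketched with Python's socket interface. To keep the example self-contained it uses `socket.socketpair()` so that both ends live in one process; a real client and server would normally run on separate machines and connect over a network address. The "service" (upper-casing the request) and the name `handle_request` are hypothetical.

```python
import socket

# Minimal request/reply exchange over a connected pair of sockets.

def handle_request(request):
    """Hypothetical service: reply with the request in upper case."""
    return request.upper()


client, server = socket.socketpair()
try:
    client.sendall(b"hello server")          # client starts the interaction
    request = server.recv(1024)              # server receives the request
    server.sendall(handle_request(request))  # server sends a reply back
    reply = client.recv(1024)                # client reads the reply
finally:
    client.close()
    server.close()

print(reply)
```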

6. Principal components of a computer

This is the minimum set of components for a working digital computer.

6.1 Motherboard

Three principal subsystems:
- CPU
- main memory
- input-output units

6.2 Processor and Registers

Processor
- arithmetic/logic unit (ALU)
- control unit: part of a CPU responsible for performing the machine cycle (fetch, decode, execute, store)
Registers
- Program counter (PC): contains the address of the next instruction to execute
- Instruction register (IR): part of a CPU control unit that stores an instruction

Coprocessors: Assistants to the CPU

Coprocessors: microprocessors performing specialized functions that CPU cannot perform or cannot perform as well and as quickly
- math
- graphics

6.3 Buses

On the motherboard, all the components are interconnected by buses (“signal highways”).
A bus is a bundle of conductors, wires, or tracks.
Typically, there are address, data and control buses, each including several signal lines.
- Intel 8086: 20 shared address/data lines, and a further 17 lines for control.
- Intel Pentium: data bus 64 lines, and the address bus 32 lines.

Each hardware unit is connected to all these buses.
- A simple way of building up complex systems in which each unit can communicate with each other.
- Little disruption when plugging in new units and swapping out failed units.

6.4 Two parts of CPU

6.5 Registers

CPU registers: small block of fast memory.
- Temporary store for data and address variables.
Some CPU registers:
- Instruction Pointer (IP) or Program Counter (PC).
  - Stores the address of the next instruction.
- Accumulator (AX, EAX in Pentium).
  - General purpose data register.
- Instruction Register (IR).
  - Stores the instruction that is being executed.
- Memory address register (MAR).
  - Temporarily holds the address of the memory location during a bus transfer.
- Memory buffer register (MBR).

6.6 Instruction Set

The collection of machine language instructions that a particular processor understands
machine language instructions
- instructions for a specific CPU
- designed to be executed by a computer without being translated
- Also called machine code
- Operations like: ADD, SUB, INC, DEC, etc.

How are instructions executed?

The basic operation, known as the fetch-execute cycle or machine cycle.
- The sequence whereby each instruction of the program is executed:
  - Read from the memory.
  - Decoded.
  - Executed.

Machine Cycle

Fetch the instruction from memory. This step brings the instruction into the instruction register, a circuit that holds the instruction so that it can be decoded and executed
Decode the instruction
[Read the effective address from memory if the instruction has an indirect address ]
Execute the instruction
[Store the results]

The fetch phase of the cycle

The address in IP register is copied onto the address bus and further to MAR register.
IP is incremented ready for the next cycle. IP now points to the next location in the program memory.
Memory selects location and copies the content onto the data bus.
CPU copies the instruction code from the data bus into IR.
Decoding of the instruction starts.

A Pentium instruction: 10111000 00000000 00000001
Assembly code: MOV AX, 0x100

Note that the content of a memory cell is different from its address (not shown in the figure).

The execution phase of the cycle

Execute phase depends on the type of instruction.
Example: the execution of MOV AX,256 instruction includes:
- IP is copied to address bus and latched into memory.
- IP is incremented.
- The value selected in memory is copied onto the data bus.
- CPU copies the value from the data bus into AX.
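
The fetch and execute phases for this instruction can be traced with a small simulation. This is a rough sketch under several assumptions: memory is a Python list of bytes, only two opcodes are recognised (0xB8 for MOV AX with a 16-bit immediate stored low byte first, and 0xF4 for HLT), and the register dictionary and the function `run` are made up for the example.

```python
# Toy fetch-execute loop for the MOV AX, 0x100 example.

memory = [0] * 16
memory[0:3] = [0b10111000, 0x00, 0x01]   # MOV AX, 0x100 (low byte first)
memory[3] = 0xF4                         # HLT stops the simulation


def run(memory):
    """Execute instructions from address 0 until HLT; return the registers."""
    regs = {"IP": 0, "IR": 0, "MAR": 0, "AX": 0}

    def fetch_byte():
        regs["MAR"] = regs["IP"]         # IP is copied to the address register
        regs["IP"] += 1                  # IP is incremented for the next access
        return memory[regs["MAR"]]       # memory puts the content on the data bus

    while True:
        regs["IR"] = fetch_byte()        # fetch phase: opcode goes into IR
        if regs["IR"] == 0xB8:           # decode: MOV AX, immediate
            low = fetch_byte()           # execute phase: read the operand bytes
            high = fetch_byte()
            regs["AX"] = (high << 8) | low
        elif regs["IR"] == 0xF4:         # decode: HLT
            break
        else:
            raise ValueError(f"unknown opcode {regs['IR']:#04x}")
    return regs


regs = run(memory)
print(regs["AX"])   # 256
```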

6.7 CISC & RISC

CISC (“sisk”)
- complex instruction set
- most mainframes and PCs
RISC (“risk”)
- reduced instruction set
- cheaper and faster
- shift some work to software

CISC vs RISC

In RISC an instruction usually consists of a single word but in CISC an instruction may be several words long, requiring several fetches

RISC is faster because …

The vacated area of the chip can be used to accelerate the performance of more commonly used instructions, rather than compensating for those rarely used instructions
Easier to optimize the design
Simplifies translation from high-level languages into the smaller instruction set that the hardware understands, resulting in more efficient programs

7. Hardware

7.1 Output Hardware

Hardcopy output
- graphics
- letters
Softcopy output
- video
- audio

7.2 Screen Clarity

Standard screen resolutions
- 640 x 480
- 800 x 600
- 1024 x 768
- 1280 x 1024
- 1600 x 1200

7.3 Communications Hardware

Facilitate networks
- modems
- hubs and other components of a network

7.4 Ports

connecting peripherals to the computers

Parallel port (IEEE 1284)
- printers, some scanners
Serial port (RS-232)
- modems, scanners, mice

7.5 USB (Universal Serial Bus)

USB
- industry standard developed in the mid-1990s that defines the cables, connectors and protocols used for connection, communication and power supply between computers and electronic devices
- standardized the connection of computer peripherals, such as keyboards, pointing devices, digital cameras, printers, portable media players, disk drives and network adapters to PCs
- replaced earlier interfaces, such as serial and parallel ports, as well as separate power chargers for portable devices

7.6 Connectors

7.7 Power supply

Power supply
- protected by power surge protector or
- uninterruptible power supply (UPS) unit

8. Data codes – numeric and character

To store numbers we need an encoding scheme, which would allow us to encode:
- The algebraic sign of numbers (+/-).
- Decimal point that might be associated with a fractional number.

8.1 Unsigned integers: BCD

Each decimal digit is individually converted to binary.

This requires 4 bits per digit (not all 4-bit patterns are used).
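
As a quick illustration of BCD, the sketch below converts each decimal digit to its own 4-bit group. The function name `to_bcd` is hypothetical, and only non-negative integers are handled.

```python
# Binary-coded decimal: one 4-bit group per decimal digit.

def to_bcd(number):
    """Return the BCD encoding of a non-negative integer as a bit string."""
    if number < 0:
        raise ValueError("unsigned BCD cannot represent negative numbers")
    return " ".join(format(int(digit), "04b") for digit in str(number))


print(to_bcd(259))   # 0010 0101 1001
```

The patterns 1010 to 1111 never appear in the output, which is why not all 4-bit patterns are used.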

8.2 Sign-and-magnitude representation

It is the representation of signed integers by a plus or minus sign and a value.
Convention: the left-most bit represents the sign,
- e.g., 0 stands for + and 1 stands for -.
- 8 bits can represent the numbers from -127 to 127 (0 being represented twice).
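
The convention can be tried out with a short sketch. It assumes an 8-bit word by default, and the helper names `to_sign_magnitude` and `from_sign_magnitude` are invented for the example.

```python
# Sign-and-magnitude: one sign bit followed by the magnitude in binary.

def to_sign_magnitude(value, bits=8):
    """Encode value with a leading sign bit (0 for +, 1 for -)."""
    limit = (1 << (bits - 1)) - 1          # 127 for 8 bits
    if abs(value) > limit:
        raise ValueError(f"{value} does not fit in {bits} bits")
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), f"0{bits - 1}b")


def from_sign_magnitude(pattern):
    """Decode a sign-and-magnitude bit string back to an integer."""
    magnitude = int(pattern[1:], 2)
    return -magnitude if pattern[0] == "1" else magnitude


print(to_sign_magnitude(5))     # 00000101
print(to_sign_magnitude(-5))    # 10000101
# Both 00000000 and 10000000 decode to zero: zero is represented twice.
print(from_sign_magnitude("00000000"), from_sign_magnitude("10000000"))
```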

8.3 10’s complementary coding

8.4 Floating Point Numbers

Single-precision 32 bit IEEE 754

Double-precision 64 bit IEEE 754
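
The notes give no layout for the two formats, so the following sketch relies on the standard IEEE 754 field widths as an assumption stated here: 1 sign bit, 8 exponent bits and 23 fraction bits for single precision, and 1, 11 and 52 for double precision. It uses Python's `struct` module to obtain the raw bits; the function names are made up.

```python
import struct

# Split IEEE 754 bit patterns into sign, exponent and fraction fields.

def float32_fields(x):
    """Return (sign, exponent, fraction) bit strings for single precision."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]


def float64_fields(x):
    """Return (sign, exponent, fraction) bit strings for double precision."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))
    bits = format(raw, "064b")
    return bits[0], bits[1:12], bits[12:]


print(float32_fields(1.0))
print(float32_fields(-2.5))
print(float64_fields(1.0))
```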

9. Data storage

Storage is the capacity of a device to hold and retain data.
Two main types of storage in a computer:
- Main memory.
- Mass storage.

9.1 Main memory

It refers to physical memory that is internal to the computer.
The computer can manipulate only data that is inside the main memory.

9.2 RAM

Random access memory (RAM)

The memory can be seen as a set of numbered storage elements, called words, each of which contains some information.
Each word is numbered with its address.
Any word of memory can be accessed “without touching” the preceding words (Random Access).
Access time is the same for all the stored items.

Dynamic RAM (DRAM)

Cheaper, but slower.
Implemented via capacitors.
DRAM needs to be refreshed.

Static RAM (SRAM)

Faster, but more expensive.
Implemented via flip-flops.
No need for refreshing.

Both types of RAM are volatile

They lose their contents when the power is turned off.

9.3 ROM

Read-only memory (ROM)
Software stored inside also known as firmware
Helps boot up the system
BIOS – Basic Input Output System

9.4 Other Forms of Memory

Cache memory

quick access memory, internal or external to the processor
bridge between the processor and RAM
including simultaneous read/write

Video memory

VRAM

Mass storage

It refers to various techniques and devices for storing a large amount of data.
Unlike main memory, mass storage devices retain data even when the computer is turned off.

Types of mass storage

Hard disks.
Optical disks: CD-ROM, CD-RW, DVD, etc.
USB disks, Floppy disks.

Hard Disk Drives (HDD)


Hard disk drives are the most important types of permanent storage used in computers (esp. PCs).
Hard disks differ from the other mass storage devices in three ways:
- Size (usually larger).
- Speed (usually faster).
- Permanence (usually fixed in computer and not removable).

10. Memory

Any memory location in main memory has its own address.
It follows that the more memory there is, the wider the addresses must be.
The maximum memory size depends on the address width.

10.1 Address width

Address width is determined by:
- The number of bits in the CPU address registers such as IP, MAR.
- The number of lines in the address bus.
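
The relationship between address width and maximum memory size is a power of two: n address bits can select 2^n distinct locations. A minimal sketch, assuming one addressable location per address (the function name `addressable_locations` is hypothetical):

```python
# Number of distinct locations that an n-bit address can select.

def addressable_locations(address_bits):
    """Return 2**address_bits, the size of the address space."""
    return 2 ** address_bits


for bits in (16, 20, 32):
    print(bits, "address bits ->", addressable_locations(bits), "locations")
```

With byte-sized locations, 20 address lines (as on the Intel 8086) reach 1,048,576 bytes, and 32 address lines (as on the Pentium) reach 4,294,967,296 bytes.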

10.2 Memory modules

10.3 Memory mapping

When the CPU sends out an address:
- A part of the address locates the correct chip.
- Another part specifies an address within the correct chip.
How addresses are actually mapped to memory locations is defined by the memory map.
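
Splitting an address into a chip-select part and an offset part can be sketched as below. The layout is purely hypothetical: a 16-bit address, with the top 2 bits selecting one of four chips and the low 14 bits addressing a location inside the selected chip.

```python
# Hypothetical memory map: 16-bit addresses, four chips of 16 K locations.
CHIP_BITS = 2      # upper bits: which chip
OFFSET_BITS = 14   # lower bits: location inside the chip


def decode(address):
    """Split an address into (chip number, offset within that chip)."""
    if not 0 <= address < 1 << (CHIP_BITS + OFFSET_BITS):
        raise ValueError(f"address {address:#x} is out of range")
    chip = address >> OFFSET_BITS                   # upper part locates the chip
    offset = address & ((1 << OFFSET_BITS) - 1)     # lower part is the offset
    return chip, offset


for address in (0x0000, 0x4000, 0x8123, 0xFFFF):
    chip, offset = decode(address)
    print(f"{address:#06x} -> chip {chip}, offset {offset:#06x}")
```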

Memory map for a small system

10.4 Memory address decoding

Memory chips are not normally matched to the width of the address bus. For example:
- CPU may send 32-bit address.
- RAM may receive directly 24-bit address.
Special Memory Address Decoding circuit implements necessary decoding

10.5 Registers

Registers are memory cells that are a core part of the processor itself.
They have very fast access (a few nanoseconds).
There is not much of this memory: typically tens of 32-, 64- or 80-bit registers.

10.6 Cache memory

A memory (faster but more expensive SRAM) placed between the CPU and main memory.
Contains a copy of a portion of main memory.
The aim is to keep the currently active sections of code and data in the fast cache.
When the processor needs some information, it first checks the cache.
If it is not found in the cache, the block of memory containing the needed information is moved into the cache.
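
The check-then-load behaviour can be sketched with a small simulation. The notes do not say how blocks are placed in the cache, so this example assumes a direct-mapped cache with 4 lines of 4 words each; the class name `Cache` and all sizes are hypothetical.

```python
# Toy direct-mapped cache in front of a list that stands in for main memory.

class Cache:
    def __init__(self, memory, num_lines=4, block_size=4):
        self.memory = memory
        self.num_lines = num_lines
        self.block_size = block_size
        self.lines = {}        # line index -> (tag, copy of the block)
        self.hits = 0
        self.misses = 0

    def read(self, address):
        """Return the word at address, loading its block on a miss."""
        block_number = address // self.block_size
        index = block_number % self.num_lines     # which cache line to check
        tag = block_number // self.num_lines      # identifies the block held
        line = self.lines.get(index)
        if line is not None and line[0] == tag:
            self.hits += 1                        # found in the cache
        else:
            self.misses += 1                      # move the whole block in
            start = block_number * self.block_size
            line = (tag, self.memory[start:start + self.block_size])
            self.lines[index] = line
        return line[1][address % self.block_size]


memory = list(range(100, 164))                    # 64 words of main memory
cache = Cache(memory)
values = [cache.read(address) for address in range(8)]
print(values, cache.hits, cache.misses)
```

Reading eight consecutive addresses costs only two misses, because each miss brings in a block of four neighbouring words.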

Levels of cache

Typically Level 1 cache has the size 8-64 KB.
Typically Level 2 cache has the size 128-512KB

10.7 Localisation of access

The idea of cache memory exploits Localisation of Memory Access principle:
- Computers tend to spend periods of time accessing the same locality of memory.
- A portion of code or data which require access needs to be loaded into the fastest memory nearest to CPU.
- Other sections of the program and data can be held in readiness lower down the memory hierarchy.

Why does localisation of access work?

Partly due to the programmer clustering related data items together in arrays or records.
Partly due to the repeating patterns in a program (i.e. loops)
Partly due to the compiler attempting to organise the code in an efficient manner.

10.8 Cache memory and cache control unit

10.9 Memory hierarchy

Going down the hierarchy:
- Increased capacity.
- Increased access time.
- Decreased frequency of access of the memory by the processor.
- Decreased cost per bit.

11. Hard disk drives

Hard disk drives are the most important type of permanent storage used in computers (esp. PCs).

Schematic diagram of hard disk

Storage Technology

Retrieving files into RAM is called reading
- loading an application
- opening a file
- files can be programs or documents
Copying data from RAM onto a secondary storage device is called writing

11.1 Virtual memory

Virtual memory is a technique, in a sense, opposite to caching:
- It is the use of low-level memory (i.e. hard disk) to ‘expand’ high-level (main) memory.
- It provides a convenient expansion of main memory by ‘overflowing’ data and program code onto magnetic disk.
The area on disk reserved for this purpose is known as the swap area.
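
The overflow of pages onto disk can be sketched with a toy model. Several assumptions are made that the notes do not state: memory is divided into pages, RAM holds a fixed number of them, and the least recently used page is the one moved to the swap area. The class name `VirtualMemory` is hypothetical.

```python
from collections import OrderedDict

# Toy virtual memory: a few RAM frames backed by a swap area on "disk".

class VirtualMemory:
    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.ram = OrderedDict()    # page -> data, least recently used first
        self.swap = {}              # pages that overflowed onto disk
        self.page_faults = 0

    def access(self, page):
        """Return the data of a page, swapping it in if necessary."""
        if page in self.ram:
            self.ram.move_to_end(page)             # mark as recently used
            return self.ram[page]
        self.page_faults += 1
        if len(self.ram) >= self.num_frames:
            victim, data = self.ram.popitem(last=False)   # evict LRU page
            self.swap[victim] = data               # write it to the swap area
        # Bring the page back from swap, or create it on first use.
        data = self.swap.pop(page, f"contents of page {page}")
        self.ram[page] = data
        return data


vm = VirtualMemory(num_frames=2)
for page in (0, 1, 0, 2, 1):
    vm.access(page)
print(list(vm.ram), sorted(vm.swap), vm.page_faults)
```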

11.2 Memory Management

Virtual memory
- hard disk space
- when processor needs more RAM space, swaps unused data onto designated hard disk space
- improves flexibility but is slower than RAM to which the processor has direct access