System Design: A Personal Summary (CSDN Blog)

Article link: https://blog.csdn.net/roufoo/article/details/134658378

Numbers worth remembering

Time
1 day => 86,400 seconds
0.1% downtime (99.9% availability) amounts to 8.77 hours of downtime per year.
0.01% downtime (99.99% availability) amounts to 52.6 minutes of downtime per year.
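
The two downtime figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch; the function name is illustrative, and it assumes a year of 365.25 days (8,766 hours), which is what makes the numbers come out to 8.77 hours and 52.6 minutes.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours, averaging in leap years (assumption)


def yearly_downtime_hours(availability_percent: float) -> float:
    """Hours of downtime per year allowed at a given availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


three_nines = yearly_downtime_hours(99.9)
four_nines = yearly_downtime_hours(99.99)
print(f"99.9%  -> {three_nines:.2f} hours/year")
print(f"99.99% -> {four_nines * 60:.1f} minutes/year")
```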

Storage
For storage, K means 1024; everywhere else it means 1000. The figures below ignore the difference between 1000 and 1024.
1T => 10^12
1G => 10^9
1B (billion) => 10^9
1M => 10^6
1K => 10^3

A hard disk has an MTTF (mean time to failure) of 10 to 50 years. In a storage cluster of 10,000 disks, expect roughly one disk failure per day on average.
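
The "one disk per day" estimate follows from dividing the number of disks by the MTTF expressed in days. The sketch below assumes disks fail independently at a constant rate, which is a simplification; the function name is illustrative.

```python
def expected_failures_per_day(num_disks: int, mttf_years: float) -> float:
    """Expected disk failures per day, assuming independent, constant-rate failures."""
    mttf_days = mttf_years * 365
    return num_disks / mttf_days


# Sweep the MTTF range quoted above for a 10,000-disk cluster.
for mttf_years in (10, 30, 50):
    rate = expected_failures_per_day(10_000, mttf_years)
    print(f"MTTF {mttf_years} years -> {rate:.2f} failures/day")
```

Across the 10 to 50 year range the result stays between roughly 0.5 and 3 failures per day, so "about one per day" is a reasonable order-of-magnitude figure.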

Web pages
Estimate 10 KB per web page and 2 s to download one.
A URL can be estimated at 100 bytes.

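Databases
MySQL: a single machine handles roughly 1K QPS of reads/writes.
MongoDB: roughly 10K QPS of reads/writes per machine.
Cassandra: roughly 10K QPS of reads/writes per machine.
Redis: in-memory key-value store, roughly 100K QPS of reads/writes per machine.
Memcached: in-memory store, roughly 1M QPS per machine (?)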

Suppose a database uses a B+ tree with 4 KB pages, 4 levels, and a branching factor of 500; it can store up to 256 TB.
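
The 256 TB figure can be checked with a simple model: each level multiplies the number of reachable pages by the branching factor, and each page holds 4 KB. This is a rough sketch under that assumption (it treats 4 KB as 4096 bytes and 1 TB as 10^12 bytes); the function name is illustrative.

```python
def btree_capacity_bytes(page_size: int, levels: int, branching_factor: int) -> int:
    """Upper bound on data reachable through `levels` levels of page references."""
    reachable_pages = branching_factor ** levels
    return reachable_pages * page_size


capacity = btree_capacity_bytes(page_size=4096, levels=4, branching_factor=500)
print(capacity / 10**12, "TB")
```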

Video
See this link: http://www.xnbuy.com/news/01/106.html

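Bitrate × number of camera channels = minimum network bandwidth.
Bandwidth required for CIF video: 512 Kbps (bitrate of the format) × 50 (total number of camera channels across all monitoring points) = 25,600 Kbps = 25 Mbps (downlink bandwidth)
That is, with CIF video the monitoring center needs at least 25 Mbps of downlink bandwidth.
Bandwidth required for D1 video: 1.5 Mbps (bitrate of the format) × 50 (total number of camera channels across all monitoring points) = 75 Mbps (downlink bandwidth)
That is, with D1 video the monitoring center needs at least 75 Mbps of downlink bandwidth.
Bandwidth required for 720P (1 megapixel) video: 2 Mbps (bitrate of the format) × 50 (total number of camera channels across all monitoring points) = 100 Mbps (downlink bandwidth)
That is, with 720P video the monitoring center needs at least 100 Mbps of downlink bandwidth.
Bandwidth required for 1080P (2 megapixel) video: 4 Mbps (bitrate of the format) × 50 (total number of camera channels across all monitoring points) = 200 Mbps (downlink bandwidth)
That is, with 1080P video the monitoring center needs at least 200 Mbps of downlink bandwidth.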
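
The bandwidth figures above all come from the same multiplication. A minimal sketch, assuming 1 Mbps = 1024 Kbps (which is how 25,600 Kbps becomes 25 Mbps); the dictionary and function names are illustrative.

```python
# Per-camera bitrates in Kbps, taken from the figures above (1.5 Mbps = 1536 Kbps).
BITRATE_KBPS = {"CIF": 512, "D1": 1536, "720P": 2048, "1080P": 4096}


def required_downlink_mbps(bitrate_kbps: float, cameras: int) -> float:
    """Minimum downlink bandwidth in Mbps for `cameras` streams at `bitrate_kbps` each."""
    return bitrate_kbps * cameras / 1024


for video_format, bitrate in BITRATE_KBPS.items():
    print(f"{video_format}: {required_downlink_mbps(bitrate, 50):.0f} Mbps")
```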

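Storage space calculation:
Stream size (in KB/s, i.e. bitrate ÷ 8) × 3600 (seconds in an hour) × 24 (hours in a day) × 30 (days of retention) × 50 (number of cameras whose recordings are kept) ÷ 0.9 (10% of the disk is lost to formatting) = required storage space (note the storage unit conversions: 1 TB = 1024 GB; 1 GB = 1024 MB; 1 MB = 1024 KB)
Storing 30 days of CIF recordings for 50 channels requires: 64 × 3600 × 24 × 30 × 50 ÷ 0.9 KB = 8789.1 GB ≈ 9 TB
Storing 30 days of D1 recordings for 50 channels requires: 192 × 3600 × 24 × 30 × 50 ÷ 0.9 KB = 26367.2 GB ≈ 26 TB
Storing 30 days of 720P (1 megapixel) recordings for 50 channels requires: 256 × 3600 × 24 × 30 × 50 ÷ 0.9 KB = 35156.3 GB ≈ 35 TB
Storing 30 days of 1080P (2 megapixel) recordings for 50 channels requires: 512 × 3600 × 24 × 30 × 50 ÷ 0.9 KB = 70312.5 GB ≈ 69 TB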
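
The storage formula above can be sketched as a small function. It assumes the stream size is given in KB/s, uses the 1024-based unit conversions from the note, and rounds the TB figure up, which matches the rounded values quoted above. The names are illustrative.

```python
import math

KB_PER_GB = 1024 * 1024


def recording_storage_gb(stream_kb_per_s: float, days: int = 30,
                         cameras: int = 50, usable_fraction: float = 0.9) -> float:
    """Storage in GB for `cameras` streams kept for `days`, with formatting overhead."""
    seconds = 3600 * 24 * days
    raw_kb = stream_kb_per_s * seconds * cameras
    return raw_kb / usable_fraction / KB_PER_GB


for video_format, rate in (("CIF", 64), ("D1", 192), ("720P", 256), ("1080P", 512)):
    gb = recording_storage_gb(rate)
    print(f"{video_format}: {gb:,.0f} GB, about {math.ceil(gb / 1024)} TB")
```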

Mapping algorithms to system design

Where an algorithm uses a queue, a system design uses a message queue (Redis, Kafka, RabbitMQ, etc.), because the data must be persisted.
Where an algorithm uses a hash set, a system design uses a key-value NoSQL database.

A system design usually puts efficiency, meaning low latency, first; accuracy comes second.
A batch processing system such as Hadoop cares about throughput: the number of records it can process per second, or the total time it takes to run a job on a dataset of a certain size.
An online system cares more about response time: the time between a client sending a request and receiving the response.

B-Tree vs. LSM-Tree

B-trees are faster for reads; LSM-trees are faster for writes.
B-trees compress poorly; LSM-trees compress well.

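Front end

HTML is like the body, CSS is like the clothes that decorate it, and JavaScript is like the actions it performs.

RESTful API:
The following is taken from https://javaguide.cn/system-design/basis/RESTfulAPI.html
GET: retrieve a specific resource from the server. Example: GET /classes (get all classes)
POST: create a new resource on the server. Example: POST /classes (create a class)
PUT: update a resource on the server (the client supplies the entire updated resource). Example: PUT /classes/12 (update class 12)
DELETE: remove a specific resource from the server. Example: DELETE /classes/12 (delete class 12)
PATCH: update a resource on the server (the client supplies only the changed attributes, so it can be seen as a partial update). It is used less often, so no example is given here.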
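
To make the difference between the verbs concrete, here is a small in-memory sketch of a `/classes` resource. It is not a real web server: `ClassStore` and `handle` are hypothetical names, and the status codes are the conventional ones rather than anything stated above. The point to notice is that PUT replaces the whole record while PATCH only merges the supplied fields.

```python
from typing import Optional


class ClassStore:
    """In-memory stand-in for a server exposing /classes (illustrative only)."""

    def __init__(self) -> None:
        self._classes = {}   # class id -> record
        self._next_id = 1

    def handle(self, method: str, path: str, body: Optional[dict] = None):
        parts = [p for p in path.split("/") if p]  # "/classes/12" -> ["classes", "12"]
        if not parts or parts[0] != "classes":
            return 404, None
        if len(parts) == 1:                        # collection: /classes
            if method == "GET":
                return 200, list(self._classes.values())
            if method == "POST":
                class_id = self._next_id
                self._next_id += 1
                self._classes[class_id] = {"id": class_id, **(body or {})}
                return 201, self._classes[class_id]
            return 405, None
        if not parts[1].isdigit() or int(parts[1]) not in self._classes:
            return 404, None
        class_id = int(parts[1])                   # single item: /classes/<id>
        if method == "GET":
            return 200, self._classes[class_id]
        if method == "PUT":                        # full replacement
            self._classes[class_id] = {"id": class_id, **(body or {})}
            return 200, self._classes[class_id]
        if method == "PATCH":                      # partial update
            self._classes[class_id].update(body or {})
            return 200, self._classes[class_id]
        if method == "DELETE":
            del self._classes[class_id]
            return 204, None
        return 405, None
```

For example, after `POST /classes` with `{"name": "Math", "room": "A1"}`, a `PATCH /classes/1` with `{"room": "B2"}` keeps the name, whereas a `PUT /classes/1` with `{"name": "Physics"}` drops the room field entirely.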

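Differences between a web app and a mobile app
Web app: uses Java/Python to handle business logic and storage, and HTML and JavaScript for presentation. Requires an active Internet connection.
Mobile app: uses a JSON API and talks to the web server over HTTP. Does not require an active Internet connection (can work offline).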

Cloud
Amazon EC2 provides servers; an instance can act as a web server or a database server.
Amazon S3 provides storage only and is typically used for large binary files.

Protocols and companies

Relational Database:
MySQL:
PostgreSQL:
SQL Server: Microsoft
Oracle: Oracle
NoSQL Database:
MongoDB: MongoDB, Inc.
Cassandra: Facebook
Redis
Memcached
HBase: Apache (modeled on Google's BigTable)
BigTable: Google
TAO: social graph database (graph database) developed by Facebook
Espresso: document database developed by LinkedIn
File System:
S3: Amazon
Message Queue (Middleware):
TIBCO:
WebSphere: IBM
webMethods:
RabbitMQ:
ActiveMQ
HornetQ:
(Apache) Kafka
Binary encoding libraries (serialization):
Protocol Buffers (protobuf): Google
(Apache) Thrift: Facebook, open source now.
(Apache) Avro
Web Service:
Ajax:
Microservices
Dubbo: Alibaba
Spring Cloud:
Coordination Service
Zookeeper

Acronyms

2PC: two-phase commit
2PL: two-phase locking
3PC: three-phase commit
ACID: atomicity, consistency, isolation and durability.
AJAX: Asynchronous JavaScript and XML
BASE: Basically Available, Soft state and Eventual consistency.
CAP: Consistency, Availability, Partition Tolerance
CORBA: Common Object Request Broker Architecture
COW: copy-on-write
CRUD: CREATE, READ, UPDATE and DELETE.
CSS: Cascading Style Sheets (CSS) is a stylesheet language used to describe the presentation of a document written in HTML or XML.
DCOM: Distributed Component Object Model
EJB: Enterprise JavaBeans
ETL: Extract-Transform-Load (used in data warehouse)
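gRPC: an open-source RPC framework originally developed by Google
JDBC: Java Database Connectivity (a Java API for accessing databases)
LSM: Log-Structured Merge Tree
MESI: Modified, Exclusive, Shared, Invalid (a cache coherence protocol between multiple cores)
MVCC: Multi-Version Concurrency Control
ODBC: Open Database Connectivity (an API for accessing databases)
OLAP: online analytical processing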
OLTP: online transaction processing
OTA: (firmware) over-the-air update
RDF: Resource Description Framework (a mechanism for different websites to publish data in a consistent format).
RDS: (Amazon) Relational Database Service
REST: Representational State Transfer. Based on HTTP.
RESTful: APIs based on REST
RMI: Java’s Remote Method Invocation
RPC: remote procedure call
SIMD: Single Instruction, Multiple Data
SOA: Service-Oriented Architecture. Microservices are a more recent refinement of it.
SOAP: Simple Object Access Protocol. XML-based.
SSI: Serializable Snapshot Isolation
SSL: Secure Sockets Layer
SST: Sorted String Table (SSTable)
TLS: Transport Layer Security
WAL: write-ahead log
WSDL: Web Services Description Language

Some Chinese-English term pairs:
复合索引 composite index
倒排索引 inverted index
列式存储 columnar or column-based
反向代理 reverse proxy

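CAP:
CP - stricter; widely used by banks.
AP - more practical. Availability can be strengthened through sharding and replication.
CA - impossible in a distributed system, because network partitions cannot be avoided.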

Availability - approaches include replication, sharding, and a gossip protocol (to detect failures).

High-availability reads: data replication, multi-datacenter setup
High-availability writes: versioning and conflict resolution with vector clocks, i.e., [server, version] pairs
Note: replication improves availability, but it can cause inconsistency!
My understanding is that availability and consistency subsume scalability: sharding, replication, etc. (consistent hashing).
Scaling web servers has nothing to do with sharding: web servers are stateless and all run the same code, so they can be added and removed freely.
Scaling databases can be done through replication and sharding.
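
The vector-clock idea mentioned above for write conflicts can be sketched as follows. Each version of a value carries a set of [server, version] pairs, represented here as a dict from server name to counter (a representation chosen for this example). One version supersedes another only if it is at least as new on every server; otherwise the two were written concurrently and need conflict resolution. The function name and return labels are illustrative.

```python
def compare_vector_clocks(a: dict, b: dict) -> str:
    """Compare two vector clocks that map server name -> version counter."""
    servers = set(a) | set(b)
    a_ahead = any(a.get(s, 0) > b.get(s, 0) for s in servers)
    b_ahead = any(b.get(s, 0) > a.get(s, 0) for s in servers)
    if a_ahead and b_ahead:
        return "conflict"   # concurrent writes: neither version descends from the other
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


# Two replicas accepted different writes on top of the same ancestor {"s1": 1}.
print(compare_vector_clocks({"s1": 1, "s2": 1}, {"s1": 1, "s3": 1}))
```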

Consistency - approaches include quorum consensus, versioning, and vector clocks.
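
For quorum consensus, the usual rule of thumb is that with N replicas, W write acknowledgements, and R read acknowledgements, reads see the latest write when W + R > N. The names N, W, and R are conventional notation rather than something defined above. The sketch below brute-forces the reason: under that condition, every possible write set shares at least one replica with every possible read set.

```python
from itertools import combinations


def quorums_always_overlap(n: int, w: int, r: int) -> bool:
    """True if every w-replica write set intersects every r-replica read set."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


print(quorums_always_overlap(3, 2, 2))  # W + R > N
print(quorums_always_overlap(3, 1, 1))  # W + R <= N
```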

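Solutions to some issues:

Failure detection - in a distributed system, failure detection can be handled by a gossip protocol (based on heartbeats).
Temporary failures - handled with a sloppy quorum (hinted handoff).
Permanent failures - handled with a Merkle tree (hash tree), which is used to detect and repair inconsistencies between replicas.
Data center failures - handled by replicating data across multiple data centers.
Clock synchronization - can be handled with the Network Time Protocol (NTP).
Race conditions - the fix is to use atomic operations: either a lock, or a Lua script with the sorted set data structure in Redis.
Fast lookup - 1) use a cache; 2) use a Bloom filter, which is useful in both KV databases and URL shorteners.
KV database - use a Bloom filter to check whether a key is in a given SSTable. Note that every SSTable has its own Bloom filter.
URL shortener - use a Bloom filter to check whether a short URL already exists.
Guarding against floods of requests - rate limiter
Non-real-time tasks can be run periodically with a cron job.
Duplicate content can be detected with hashes and checksums.
Spider traps can be avoided by setting a maximum URL length.
Preventing duplicate delivery in a message queue? Add an event ID (or sequence ID) to each message.
Long-lived connections can be implemented with HTTP/1.1 or WebSocket. HTTP/1.1 connections are persistent by default (the Connection header is keep-alive), but only the client can initiate requests. With WebSocket, both the client and the server can initiate messages.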
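
For the Bloom filter lookups mentioned above (checking whether a key may be in an SSTable, or whether a short URL may already exist), here is a minimal sketch. The class name, the bit-array size, the number of hash functions, and the use of seeded SHA-256 as the hash family are all choices made for this example, not details from a particular database. A Bloom filter can report false positives but never false negatives, so a "no" answer lets the caller skip the expensive lookup.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: fixed-size bit array plus several seeded hash functions."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive one bit position per seed from a SHA-256 digest of "seed:item".
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


short_urls = BloomFilter()
short_urls.add("abc123")
print(short_urls.might_contain("abc123"))
```

In a URL shortener, a `False` from `might_contain` means the candidate short URL is definitely unused; a `True` still has to be confirmed against the database.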