System Design Framework 1: Design a Code-Deployment System

Latest recommended article published on 2024-09-27 09:28:28

StellaLiu萤窗小语



Article link: https://blog.csdn.net/anqi3776/article/details/114562743


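This article discusses the requirements for designing a large-scale code-deployment system that builds binaries from code merged into the main branch and deploys them globally in an efficient, scalable way. The system must complete a build and deployment within 30 minutes and must be able to distribute a successfully built binary to machines worldwide within 30 minutes. The design has two main parts, a build system and a deployment system, and takes concurrency, availability, and scalability into account.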


Design a global and fast code-deployment system.


Many systems design questions are intentionally left very vague and are literally given in the form of Design Foobar. It’s your job to ask clarifying questions to better understand the system that you have to build.

Questions

Question 1

Q: What exactly do we mean by a code-deployment system? Are we talking about building, testing, and shipping code?

A: We want to design a system that takes code, builds it into a binary (an opaque blob of data—the compiled code), and deploys the result globally in an efficient and scalable way. We don’t need to worry about testing code; let’s assume that’s already covered.

Question 2

Q: What part of the software-development lifecycle, so to speak, are we designing this for? Is this process of building and deploying code happening when code is being submitted for code review, when code is being merged into a codebase, or when code is being shipped?

A: Once code is merged into the trunk or master branch of a central code repository, engineers should be able to trigger a build and deploy that build (through a UI, which we’re not designing). At that point, the code has already been reviewed and is ready to ship. So to clarify, we’re not designing the system that handles code being submitted for review or being merged into a master branch—just the system that takes merged code, builds it, and deploys it.

Question 3

Q: Are we essentially trying to ship code to production by sending it to, presumably, all of our application servers around the world?

A: Yes, exactly.

Question 4

Q: How many machines are we deploying to? Are they located all over the world?

A: We want this system to scale massively to hundreds of thousands of machines spread across 5-10 regions throughout the world.

Question 5

Q: This sounds like an internal system. Is there any sense of urgency in deploying this code? Can we afford failures in the deployment process? How long should a single deployment take?

A: This is an internal system, but we’ll want to have decent availability, because many outages are resolved by rolling forward or rolling back buggy code, so this part of the infrastructure may be necessary to avoid certain terrible situations. In terms of failure tolerance, any build should eventually reach a SUCCESS or FAILURE state. Once a binary has been successfully built, it should be shippable to all machines globally within 30 minutes.
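The requirement that every build eventually reaches SUCCESS or FAILURE can be read as a small state machine with two terminal states. Here is a minimal sketch of that idea. Only SUCCESS and FAILURE come from the answer above; the QUEUED and RUNNING states, the transition table, and the function names are assumptions made for illustration.

```python
from enum import Enum


class BuildState(Enum):
    QUEUED = "QUEUED"      # assumed intermediate state (not named in the Q&A)
    RUNNING = "RUNNING"    # assumed intermediate state (not named in the Q&A)
    SUCCESS = "SUCCESS"    # terminal state named in the answer
    FAILURE = "FAILURE"    # terminal state named in the answer


TERMINAL_STATES = {BuildState.SUCCESS, BuildState.FAILURE}

# Allowed transitions (an assumption); terminal states have no outgoing edges.
TRANSITIONS = {
    BuildState.QUEUED: {BuildState.RUNNING, BuildState.FAILURE},
    BuildState.RUNNING: {BuildState.SUCCESS, BuildState.FAILURE},
    BuildState.SUCCESS: set(),
    BuildState.FAILURE: set(),
}


def advance(current: BuildState, new: BuildState) -> BuildState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if new not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new


def is_terminal(state: BuildState) -> bool:
    """True once a build has reached a clear end state."""
    return state in TERMINAL_STATES
```

Because the terminal states have no outgoing transitions, a build that has finished cannot be silently moved back into a running state, which is one way to keep the end state unambiguous.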

Question 6

Q: So it sounds like we want our system to be available, but not necessarily highly available; we want a clear end state for builds; and we want the entire process of building and deploying code to take roughly 30 minutes. Is that correct?

A: Yes, that’s correct.
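With the requirements pinned down, a quick back-of-the-envelope calculation gives a feel for the scale. The sketch below is only illustrative: the 500,000 machine count is an assumed figure within "hundreds of thousands", the region counts are the two ends of the stated 5-10 range, and the function names are made up for this example.

```python
def machines_per_region(total_machines: int, regions: int) -> float:
    """Average number of machines per region, assuming an even spread."""
    return total_machines / regions


def required_rate_per_second(machines: int, window_minutes: int) -> float:
    """Average number of machines that must receive the binary each second."""
    return machines / (window_minutes * 60)


TOTAL_MACHINES = 500_000   # assumption: one point within "hundreds of thousands"
WINDOW_MINUTES = 30        # from the requirements: ship globally within 30 minutes

for regions in (5, 10):    # the two ends of the stated 5-10 region range
    per_region = machines_per_region(TOTAL_MACHINES, regions)
    rate = required_rate_per_second(int(per_region), WINDOW_MINUTES)
    print(f"{regions} regions: {per_region:.0f} machines each, "
          f"about {rate:.1f} machines/second per region")
```

Under these assumptions each region holds tens of thousands of machines, so a single source serving every machine directly is unlikely to be enough, which motivates thinking about per-region distribution later in the design.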