Design an Instagram


title: Notes of System Design No.07 - Design an Instagram
description: ‘Design an Instagram’
date: 2022-05-13 18:01:58
tags: System Design
categories:

  • System Design

01. Functional Requirement

  • Users need to be able to upload images from a mobile client, such as an iOS or Android phone.

  • Users need to be able to follow other users, so they can see content from other people.

  • We'd also like to generate a news feed of images and display it to users when they visit the app. We probably won't get into a super complicated feed, but we should at least think about the API to request that feed and how we would ensure it is generated reliably.

  • The last thing is scale: let's imagine we want to design a system that scales up to support 10 million users.

So these are the requirements. This is something I would flesh out and discuss with the interviewer, to make sure I understand what each of these things means and what the requirements would look like. You could dive into each of them and get more specific:

  • What is a mobile client? Is that a native app or mobile web?

  • Do we need to support other platforms?

  • Are we concerned about certain constraints, like network or storage space on the device?

02. Non-Functional Requirement

03. Assumptions


Something I like to do at the beginning of thinking about a design for a system is ask: ultimately, what sort of scale do we need our system to work at? Let's assume we're talking about 10 million users who use our service on a monthly basis, so 10 million monthly active users. Let's imagine that each of them uploads two photos per month; that's how we define an active user, and that's the average case we're shooting for. Let's say each of those photos is around five megabytes, and let's allow that to include some metadata, like the caption, the location, and maybe some other photo metadata we want to include. We'll ignore for now things like comments and other data we might need to store, because that wasn't explicitly in our requirements, but that's something we could think about later.

So let's crunch these numbers a little bit. Ten million is 10^7 users, times 2 photos, times 5 megabytes, which is 10^8 megabytes. That boils down to 100 million megabytes, which is about 100 terabytes per month. Over a year, we're looking at about 1.2 petabytes of data. That's a lot of data, and it's only going to grow over time.
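
The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the constant and function names are made up for illustration, and it assumes decimal units (1 TB = 10^6 MB, 1 PB = 10^9 MB), which is what the round numbers in the text imply.

```python
# Back-of-the-envelope storage estimate using the figures from the text.
MONTHLY_ACTIVE_USERS = 10_000_000   # 10^7 users
PHOTOS_PER_USER_PER_MONTH = 2
PHOTO_SIZE_MB = 5                   # assumed to include caption/location metadata

# Decimal units keep the mental math simple (an assumption of this sketch).
MB_PER_TB = 10**6
MB_PER_PB = 10**9


def monthly_storage_mb() -> int:
    """Megabytes of new photo data uploaded per month."""
    return MONTHLY_ACTIVE_USERS * PHOTOS_PER_USER_PER_MONTH * PHOTO_SIZE_MB


def yearly_storage_mb() -> int:
    """Megabytes of new photo data uploaded over 12 months."""
    return 12 * monthly_storage_mb()


if __name__ == "__main__":
    print(f"per month: {monthly_storage_mb() / MB_PER_TB:.0f} TB")
    print(f"per year:  {yearly_storage_mb() / MB_PER_PB:.1f} PB")
```

Changing any one of the three inputs (users, photos per user, photo size) scales the result linearly, which makes it easy to re-run the estimate if the interviewer adjusts the assumptions.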

Doing this exercise informs how much traffic we're going to have, what our storage requirements are going to look like, and what sort of systems we should choose to support them. Since we're talking about a lot of photo data here, as well as metadata, we know, looking forward into our system, that we'll have to think about different ways to store those types of data, and how to do it in a way that's reliable and efficient. So let's look at that.

04. Define API && data model

  • user table

  • photo table

  • relational table

The next thing I would do, before we get into specific components of the system, is take a step back and think about the API or the data model of our system. In this case let's start with the data model, because I think we clearly know which things we need to store, and then we can talk about the API a little bit later.

We have basically three different data types that we need to model: the users, the photos they're posting, and this user-following model, where we need to store the relationship between different users. So I'm going to sketch out those three database tables, and I'll also talk about the kinds of database I would choose for these different types of data. Let's make the first one our user table; we'll make one for photos, and we'll also make one for followers.

The first thing I want to talk about is our choice of database. We have a lot of choices here, and we have multiple kinds of data, so we can store things in different ways.

I view this as a fundamentally relational data problem, because we have clear types of data (users and photos) as well as relationships between them. Users are related to other users in a many-to-many way through the following mechanism: one user can follow many users, and one user can be followed by many users. Photos are related to users in a many-to-one way: a photo can only have one owner, but a user can have many photos.

So there's an inherent relationship between the kinds of data we're trying to model, and to me that lends itself to a relational database. We could also store things in a non-relational database. What I would suggest is that you think about whether your data is inherently relational and whether you would benefit from being able to run relational queries. In this case, quickly getting all of the photos for a particular user is something you would have to do on a very regular basis, and that's an inherently relational query pattern.

So I think a SQL database is a good choice, and I'm going to go ahead and start modeling these tables based on the requirements we know we have for our system.

In the user table we're going to have an ID, which is going to be our primary key: an integer that auto-increments. Some other things we'll probably need for a user are the name, which will be a string, and probably an email address, which will also be a string. We might have a location, and perhaps some other ID or other attributes about the person, maybe a time zone, things like that. But these are the basic things we need to get started.

The photo table is also going to have an ID, which is again our primary key. In this case we're also going to have a user ID, a foreign key referencing the ID in the user table; that is the relationship between these two tables. We're probably going to have a caption or description for the image, and we might have additional metadata like the location, which might be a string or a coordinate type.

Lastly, because of the large size of these images, we're not going to store them in the database itself. Instead we'll store some sort of path or URL, which is a reference into our distributed file system; that system is what will actually store, replicate, and handle basically everything related to storing these image files. So those are the basic building blocks of that table.

Lastly, I'm going to talk about the model for followers. This one is actually pretty simple: we could just have two columns, user one and user two, both foreign keys referencing the user table. This is just a relational table, and each row models one direction of someone following: user one can follow user two without user two following user one. So now we have our basic data model in place.
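
As a rough illustration, the three tables could be sketched with Python's built-in `sqlite3` module. SQLite here is only a stand-in for whatever SQL database is actually chosen. The column names (`image_path`, and `follower_id` / `followee_id` in place of "user one" / "user two"), the `NOT NULL` constraints, and the helper functions are assumptions made for this example rather than part of the original design.

```python
import sqlite3

# Hypothetical schema for the user, photo, and follower tables described above.
SCHEMA = """
CREATE TABLE users (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    name     TEXT NOT NULL,
    email    TEXT NOT NULL,
    location TEXT
);
CREATE TABLE photos (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id    INTEGER NOT NULL REFERENCES users(id),  -- owner of the photo
    caption    TEXT,
    location   TEXT,
    image_path TEXT NOT NULL  -- path/URL into object storage, not the image bytes
);
CREATE TABLE followers (
    follower_id INTEGER NOT NULL REFERENCES users(id),  -- "user one"
    followee_id INTEGER NOT NULL REFERENCES users(id),  -- "user two"
    PRIMARY KEY (follower_id, followee_id)
);
"""


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    return conn


def photos_for_user(conn: sqlite3.Connection, user_id: int) -> list:
    """Return the image paths of every photo owned by one user."""
    rows = conn.execute(
        "SELECT image_path FROM photos WHERE user_id = ? ORDER BY id", (user_id,)
    )
    return [row[0] for row in rows]


def follows(conn: sqlite3.Connection, follower_id: int, followee_id: int) -> bool:
    """True if follower_id follows followee_id (one direction only)."""
    row = conn.execute(
        "SELECT 1 FROM followers WHERE follower_id = ? AND followee_id = ?",
        (follower_id, followee_id),
    ).fetchone()
    return row is not None
```

`photos_for_user` is the "all photos for a particular user" query mentioned earlier, and `follows` shows that a row in the follower table only describes one direction of the relationship.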

05. High-Level Design


Let's talk a little bit about the overall system and the high-level components we need to bring all this together, keeping in mind the scalability requirements we laid out at the beginning as well as the features we need to support: namely, having users upload photos, letting them follow other users, and letting them see a news feed of images posted by the users they follow.


Let's talk about each of these components. I'm going to start with the database, since that's what we were just talking about; this is going to be our metadata database. One of the important pieces here is the image paths that we're storing in the photo table, and where those images are actually going to be stored. So we're also going to have a distributed object storage mechanism, something like S3, that stores and replicates our data in a reliable way, and we'll store the reference paths to the files kept there in our metadata database. That way we have a separate, reliable place to upload images to and reference them from, one that will be accessible and fast. So there's a connection between these two components.


Next we'll talk about our application service layer. We haven't gotten into this yet, but it's the actual core of our system: these are the servers responsible for the so-called CRUD operations (create, read, update, and delete). They respond to requests from clients, whether mobile, web, et cetera, handle those requests, and then perform operations on the database or fetch information and return it to the user. This is really the core part of any web backend system.


Let's think a little bit about the access patterns we're likely to see in an Instagram-like application. When we reach the scale of 10 million users, we expect a lot more people to be viewing and reading their feed than uploading. We need to be able to do both operations efficiently, and support different access patterns that may happen at different peak times of the day. There are all these different usage patterns we need to be aware of that might affect the way we build our system.

Knowing that this is a read-heavy system, we're going to want read replicas of this database. They allow us to read data efficiently without slowing down our ability to also write data and upload new images. This lets us achieve higher scalability and a larger overall request throughput, since the database and the connections to it are often one of the major bottlenecks of any system.
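
One way to picture the read-replica setup is a small router that sends writes to a single primary and spreads reads across the replicas. The sketch below is purely illustrative: `Database` and `ReplicatedDatabase` are hypothetical stand-ins rather than a real driver API, and round-robin is just one possible way to pick a replica.

```python
import itertools


class Database:
    """Stand-in for a database connection; it only carries a name."""

    def __init__(self, name: str) -> None:
        self.name = name


class ReplicatedDatabase:
    """Routes writes to the primary and reads to replicas in round-robin order."""

    def __init__(self, primary: Database, replicas: list) -> None:
        if not replicas:
            raise ValueError("at least one read replica is required")
        self.primary = primary
        self._replica_cycle = itertools.cycle(replicas)

    def for_write(self) -> Database:
        # All writes (uploads, follows) go to the single primary.
        return self.primary

    def for_read(self) -> Database:
        # Reads (feeds, profile pages) rotate through the replicas.
        return next(self._replica_cycle)
```

With two replicas, three consecutive `for_read()` calls return the first, second, and then the first replica again, while `for_write()` always returns the primary.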


So let's go ahead and build our application server. What I'd like to do is split it into two different services: let's think about the read service and the write (upload) service separately, since, as I mentioned, they might have different patterns and different requirements, and they might require different types of caching and other things. For example, for reads we might implement caching between the database and the app server, using an in-memory store like Redis. This would allow us to return frequently accessed data much faster than making a request to the database every single time. So let's sketch a cache in here.

There are several different caching policies, and I won't get into all the different ways they work, but essentially we want to use something like a write-through (or write-back) policy: when a write or an update happens on some content that's in our database, we update our cache at the same time, so that the read service downstream, which uses this cache to return data to users, is updated in a reasonable time. I'll just note that this would probably be some sort of distributed cache system like Redis, which would be really performant and would sit separately from our main database.
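
A minimal sketch of the write-through idea, with plain dictionaries standing in for the metadata database and for Redis. The class and attribute names are hypothetical, and a real cache would also need eviction and expiry, which are left out here.

```python
class WriteThroughStore:
    """Write-through caching: every write updates the database and the cache together."""

    def __init__(self) -> None:
        self.database = {}   # stand-in for the metadata database
        self.cache = {}      # stand-in for an in-memory cache such as Redis
        self.db_reads = 0    # counts how often a read had to fall back to the database

    def write(self, key: str, value: dict) -> None:
        # Update the source of truth first, then the cache, within the same call.
        self.database[key] = value
        self.cache[key] = value

    def read(self, key: str):
        # Serve from the cache when possible.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read from the database and populate the cache.
        self.db_reads += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value
```

Because `write` touches both stores, a read that follows a write is served from the cache without hitting the database; only keys that were never written through this path cause a database read.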


All right, so now we have two different services, each responsible for a different part of our feature set. One is responsible for fetching images and returning data to the user. The other is responsible for the upload process, which not only sends data into our metadata database but also handles actually uploading the image from the client to the file storage system.

We're also going to need a load balancer. As any system scales, it becomes very important that the fundamental pieces of the system can support the load and volume generated by the number of users attempting to use it; otherwise you'll begin dropping requests, or taking a really long time to serve users as the requests queue up. One of the most common patterns for solving some of these problems is to scale horizontally. So we would not have just one app server for reading and one for writing; we would have many servers, perhaps located at different access points throughout the world, depending on how large the application is.


We'll probably have a variety of mobile clients reaching our servers through the internet and sending requests, which could be getting images or uploading images. Starting from the client, they'll call our API, which will have a different route for each of the features we want to support. The requests come to our load balancer, which is the top-level part of our system. It gets the request, figures out which service it belongs to, and routes it to that service, or rather to one of the instances of that service. In the read case, the service then does all the business logic that works out which images the user is looking for.
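
The routing step can be sketched as a tiny load balancer that maps a route prefix to a pool of service instances and rotates through each pool. The route names (`/feed`, `/upload`) and instance labels are invented for illustration, since the API routes have not been defined yet, and round-robin is an assumed balancing strategy.

```python
import itertools


class LoadBalancer:
    """Maps a request path to a service pool and picks instances round-robin."""

    def __init__(self, routes: dict) -> None:
        # routes: path prefix -> list of instance identifiers for that service
        for prefix, instances in routes.items():
            if not instances:
                raise ValueError(f"no instances configured for {prefix}")
        self._pools = {
            prefix: itertools.cycle(instances) for prefix, instances in routes.items()
        }

    def route(self, path: str) -> str:
        """Return the instance that should handle the request for this path."""
        for prefix, pool in self._pools.items():
            if path.startswith(prefix):
                return next(pool)
        raise LookupError(f"no service registered for {path}")
```

For example, with two read instances behind `/feed` and one upload instance behind `/upload`, consecutive feed requests alternate between the two read instances while every upload goes to the single upload instance.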


In the upload case, when a user is creating a new image, something similar would happen: the client would open a connection to the server and upload all the metadata related to the user and the photo being uploaded, and then the server would handle receiving the image and passing it on to the object storage. As we mentioned, that process would also cause the cache to be updated, so that the new content becomes available to users.
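
The upload path described here might look roughly like the following. Everything in it is a stand-in: `ObjectStore` is an in-memory dictionary rather than a real S3 client, the metadata database is a plain list, the cache is a dictionary keyed by user ID, and the path format is made up for the example.

```python
import uuid


class ObjectStore:
    """In-memory stand-in for a distributed object store such as S3."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, key: str, data: bytes) -> str:
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class UploadService:
    """Stores image bytes in object storage and only the path in the metadata DB."""

    def __init__(self, object_store: ObjectStore, metadata_db: list, cache: dict) -> None:
        self.object_store = object_store
        self.metadata_db = metadata_db   # stand-in for the photo table
        self.cache = cache               # stand-in for the read-side cache

    def upload(self, user_id: int, caption: str, image_bytes: bytes) -> dict:
        # 1. Put the large binary in object storage under a unique (hypothetical) path.
        path = f"photos/{user_id}/{uuid.uuid4().hex}.jpg"
        self.object_store.put(path, image_bytes)
        # 2. Record the metadata, including the reference path, in the database.
        record = {"user_id": user_id, "caption": caption, "image_path": path}
        self.metadata_db.append(record)
        # 3. Update the cache so the read service can see the new photo.
        self.cache.setdefault(user_id, []).append(record)
        return record
```

The point of the sketch is the ordering: the image bytes never enter the metadata database, which only ever holds the `image_path` reference.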



The one thing we haven't talked about is the process by which we would generate a feed of images for each user. The way I would think about it is that this could be a process whose output also gets stored, either in this cache or in another feed cache, and that process would be managed by the read server or by a separate service. Let's call it a feed generation service. This service would access our cache as well as our database, and, depending on the requirements of the feed we want, it would perhaps operate on a schedule.
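
A very simple version of such a feed generation job is sketched below. It makes several assumptions that the notes do not specify: photos carry a hypothetical `created_at` timestamp, the feed is just the newest photos from followed users, and the job recomputes every user's feed in one pass (a scheduler would call `refresh_feed_cache` periodically).

```python
def generate_feed(user_id: int, follows: dict, photos: list, limit: int = 20) -> list:
    """Return the newest photos posted by the users that user_id follows."""
    followees = follows.get(user_id, set())
    candidates = [photo for photo in photos if photo["user_id"] in followees]
    # Newest first; "created_at" is an assumed field, not part of the schema above.
    candidates.sort(key=lambda photo: photo["created_at"], reverse=True)
    return candidates[:limit]


def refresh_feed_cache(feed_cache: dict, follows: dict, photos: list, limit: int = 20) -> None:
    """Recompute and store the feed for every known user (run on a schedule)."""
    for user_id in follows:
        feed_cache[user_id] = generate_feed(user_id, follows, photos, limit)
```

The read service can then answer a feed request by looking up `feed_cache[user_id]` instead of querying the database on every request.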

06. Low-Level Design

07. Dive Deep
