CHAPTER 8: DESIGN A URL SHORTENER

Step 1 - Understand the problem and establish design scope

Candidate: Can you give an example of how a URL shortener work?
Interviewer: Assume URL
https://www.systeminterview.com/q=chatsystem&c=loggedin&v=v3&l=long is the original
URL. Your service creates an alias with shorter length: https://tinyurl.com/ y7keocwj. If you
click the alias, it redirects you to the original URL.
Candidate: What is the traffic volume?
Interviewer: 100 million URLs are generated per day.
Candidate: How long is the shortened URL?
Interviewer: As short as possible.
Candidate: What characters are allowed in the shortened URL?
Interviewer: Shortened URL can be a combination of numbers (0-9) and characters (a-z, AZ).
Candidate: Can shortened URLs be deleted or updated?
Interviewer: For simplicity, let us assume shortened URLs cannot be deleted or updated.
Here are the basic use cases:
1.URL shortening: given a long URL => return a much shorter URL
2.URL redirecting: given a shorter URL => redirect to the original URL
3.High availability, scalability, and fault tolerance considerations

Back of the envelope estimation

• Write operation: 100 million URLs are generated per day.
• Write operation per second: 100 million / 24 /3600 = 1160
• Read operation: Assuming ratio of read operation to write operation is 10:1, read
operation per second: 1160 * 10 = 11,600
• Assuming the URL shortener service will run for 10 years, this means we must support
100 million * 365 * 10 = 365 billion records.
• Assume average URL length is 100.
• Storage requirement over 10 years: 365 billion * 100 bytes * 10 years = 365 TB
It is important for you to walk through the assumptions and calculations with your
interviewer so that both of you are on the same page.

Step 2 - Propose high-level design and get buy-in

API Endpoints

1.URL shortening. To create a new short URL, a client sends a POST request, which contains
one parameter: the original long URL. The API looks like this:
POST api/v1/data/shorten
• request parameter: {longUrl: longURLString}
• return shortURL
2.URL redirecting. To redirect a short URL to the corresponding long URL, a client sends a
GET request. The API looks like this:
GET api/v1/shortUrl
• Return longURL for HTTP redirection

URL redirecting

在这里插入图片描述
Figure 8-1 shows what happens when you enter a tinyurl onto the browser. Once the server receives a tinyurl request, it changes the short URL to the long URL with 301 redirect.
在这里插入图片描述

301 redirect. A 301 redirect shows that the requested URL is “permanently” moved to the long URL. Since it is permanently redirected, the browser caches the response, and subsequent requests for the same URL will not be sent to the URL shortening service. Instead, requests are redirected to the long URL server directly.
302 redirect. A 302 redirect means that the URL is “temporarily” moved to the long URL, meaning that subsequent requests for the same URL will be sent to the URL shortening service first. Then, they are redirected to the long URL server.

The most intuitive way to implement URL redirecting is to use hash tables.

URL shortening

在这里插入图片描述
• Each longURL must be hashed to one hashValue.
• Each hashValue can be mapped back to the longURL.

Step 3 - Design deep dive

Data model

In the high-level design, everything is stored in a hash table.
A better option is to store <shortURL, longURL> mapping in a relational
database.
在这里插入图片描述

Hash function

The hashValue consists of characters from [0-9, a-z, A-Z], containing 10 + 26 + 26 = 62 possible characters. To figure out the length of hashValue, find the smallest n such that 62^n ≥ 365 billion.
在这里插入图片描述
在这里插入图片描述
even the shortest hash value (from CRC32) is too long (more than 7
characters). How can we make it shorter?
在这里插入图片描述
still too expensive, use bloom filter(method to test if something is in a set)

Base 62 conversion

在这里插入图片描述

在这里插入图片描述

URL shortening deep dive

在这里插入图片描述

The distributed unique ID generator is worth mentioning.

URL redirecting deep dive

As there are more reads than writes, <shortURL, longURL> mapping is stored in a cache to improve performance.
在这里插入图片描述

Step 4 - Wrap up

If there is extra time at the end of the interview, here are a few additional talking points.
• Rate limiter: A potential security problem we could face is that malicious users send an overwhelmingly large number of URL shortening requests. Rate limiter helps to filter out requests based on IP address or other filtering rules. If you want to refresh your memory
about rate limiting, refer to “Chapter 4: Design a rate limiter”.
• Web server scaling: Since the web tier is stateless, it is easy to scale the web tier by adding or removing web servers.
• Database scaling: Database replication and sharding are common techniques.
• Analytics: Data is increasingly important for business success. Integrating an analytics solution to the URL shortener could help to answer important questions like how many people click on a link? When do they click the link? etc.
• Availability, consistency, and reliability. These concepts are at the core of any large
system’s success. We discussed them in detail in Chapter 1, please refresh your memory on these topics.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值