- Encode and Decode TinyURL
TinyURL is a URL shortening service where you enter a URL such as https://lintcode.com/problems/design-tinyurl and it returns a short URL such as http://tinyurl.com/4e9iAk.
Design the encode and decode methods for the TinyURL service. There is no restriction on how your encode/decode algorithm should work. You just need to ensure that a URL can be encoded to a tiny URL and the tiny URL can be decoded to the original URL.
Example
Example 1:
Input:“https://lintcode.com/problems/design-tinyurl”
Output:http://tinyurl.com/4e9iAk
Explanation:encode and decode by your own algorithm.
Example 2:
Input:“https://lintcode.com/problems/solution”
Output:http://tinyurl.com/5d7fiu
Explanation:encode and decode by your own algorithm.
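Solution 1: Base62
Why base62 rather than base256 (one digit per possible byte value)? With base256 the string-to-id direction is easy, but the digits produced by id / 256 and id % 256 can land on characters that are neither digits nor letters (control characters, punctuation, or bytes outside ASCII), which are awkward to put in a URL.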
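To make the point above concrete, here is a minimal sketch (not part of the original solution) that writes the same id once in base256, using each digit directly as a byte, and once in base62 with a letter/digit alphabet. The helper names `toBase256Bytes`, `toBase62`, and `isAlnumOnly` are made up for this illustration.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: write id in base 256, using each digit directly as a byte.
std::string toBase256Bytes(unsigned long long id) {
    std::string out;
    do {
        out.insert(out.begin(), static_cast<char>(id % 256));
        id /= 256;
    } while (id > 0);
    return out;
}

// Hypothetical helper: write id in base 62, mapping each digit to a letter or number.
// The alphabet order matches the charSet used in Solution 1.
std::string toBase62(unsigned long long id) {
    static const std::string charSet =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    std::string out;
    do {
        out.insert(out.begin(), charSet[id % 62]);
        id /= 62;
    } while (id > 0);
    return out;
}

// True if every character is a letter or a digit.
bool isAlnumOnly(const std::string& s) {
    for (unsigned char c : s) {
        if (!std::isalnum(c)) return false;
    }
    return true;
}
```

For id 300, the base256 digits are 1 and 44, i.e. a control character followed by ',', while the base62 form is "e0" (300 = 4 * 62 + 52).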
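- Use a global auto-increment id.
- Use two maps, longUrl2id and id2longUrl, to store the mapping between id and longUrl in both directions.
- shortKey refers only to the body of the tiny URL (the part after the host prefix). Note that shortKey2Id() and id2ShortKey() must be exact inverses of each other, otherwise keys and ids will not match up.
- In encode(longUrl), first check whether longUrl already has an id in longUrl2id. If it does, call id2ShortKey(id) directly. Otherwise increment the id, store the pair in both longUrl2id and id2longUrl, and then call id2ShortKey(id) to compute the shortKey.
- In decode(shortUrl), first extract the body of shortUrl (the shortKey), compute the id with shortKey2Id(shortKey), and then look up the longUrl in id2longUrl.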
Note: a drawback of this solution is that the id overflows once gId grows large enough. Using long long helps, but a sufficiently large gId would still overflow.
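As a rough check of the note above, the following sketch counts how many distinct base62 keys of a given length exist and how long a key can get before its decoded value no longer fits in a signed 64-bit integer. The function names `keySpace` and `maxSafeKeyLength` are assumptions made for this example.

```cpp
#include <cstdint>
#include <limits>

// Number of distinct keys of the given length over a 62-character alphabet (62^length).
// Assumes the result fits in 64 bits, which holds for length <= 10.
std::uint64_t keySpace(int length) {
    std::uint64_t n = 1;
    for (int i = 0; i < length; ++i) n *= 62;
    return n;
}

// Largest key length L such that every L-character base62 key decodes to a value
// that fits in a signed 64-bit integer, i.e. 62^L - 1 <= INT64_MAX.
int maxSafeKeyLength() {
    const std::uint64_t limit =
        static_cast<std::uint64_t>(std::numeric_limits<std::int64_t>::max());
    std::uint64_t n = 1;
    int length = 0;
    while (n <= limit / 62) {  // multiplying by 62 stays within the limit
        n *= 62;
        ++length;
    }
    return length;
}
```

Here `keySpace(6)` is 56,800,235,584 and `maxSafeKeyLength()` is 10, so a counter that only hands out 6-character keys runs out of keys long before a 64-bit id overflows.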
The code is also kept up to date at
https://github.com/luqian2017/Algorithm
#include <algorithm>
#include <map>
#include <string>
using namespace std;

class Solution {
public:
    // Encodes a long URL to a short URL.
    string encode(string &longUrl) {
        auto it = longUrl2id.find(longUrl);
        if (it != longUrl2id.end()) {
            // Already seen: reuse the existing id.
            return tinyUrlHeader + id2ShortKey(it->second);
        }
        gId++;
        longUrl2id[longUrl] = gId;
        id2longUrl[gId] = longUrl;
        return tinyUrlHeader + id2ShortKey(gId);
    }

    // Decodes a short URL to the original URL (empty string if unknown).
    string decode(string shortUrl) {
        if (shortUrl.size() < tinyUrlHeader.size()) return "";
        long long id = shortKey2Id(shortUrl.substr(tinyUrlHeader.size()));
        auto it = id2longUrl.find(id);
        return it == id2longUrl.end() ? "" : it->second;
    }

private:
    const string tinyUrlHeader = "http://tiny.url/";
    map<string, long long> longUrl2id;
    map<long long, string> id2longUrl;
    long long gId = 0;

    // Inverse of id2ShortKey: 'a'-'z' -> 0-25, 'A'-'Z' -> 26-51, '0'-'9' -> 52-61.
    long long shortKey2Id(const string &shortKey) {
        long long id = 0;
        for (char c : shortKey) {
            if ('a' <= c && c <= 'z') {
                id = id * 62 + (c - 'a');
            } else if ('A' <= c && c <= 'Z') {
                id = id * 62 + (c - 'A') + 26;
            } else if ('0' <= c && c <= '9') {
                id = id * 62 + (c - '0') + 52;
            }
        }
        return id;
    }

    // Writes id in base 62, most significant digit first, padded to 6 characters.
    string id2ShortKey(long long id) {
        const string charSet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
        string shortKey;
        while (id) {
            shortKey += charSet[id % 62];
            id = id / 62;
        }
        // 'a' is the digit 0, so this adds leading zeros after the reverse.
        while (shortKey.size() < 6) {
            shortKey += 'a';
        }
        reverse(shortKey.begin(), shortKey.end());
        return shortKey;
    }
};
// Your Codec object will be instantiated and called as such:
// Codec codec = new Codec();
// codec.decode(codec.encode(url));
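Solution 2:
Based on https://www.cnblogs.com/grandyang/p/6562209.html
I think this solution is the better one. It uses a random 6-character key instead of an auto-increment id, so there is no counter that can overflow.
The code is as follows: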
#include <cstdlib>
#include <ctime>
#include <string>
#include <unordered_map>
using namespace std;

class Solution {
public:
    Solution() {
        dict = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
        srand(time(NULL));
    }

    // Encodes a URL to a shortened URL.
    string encode(string longUrl) {
        if (long2short.count(longUrl)) {
            return "http://tinyurl.com/" + long2short[longUrl];
        }
        int idx = 0;
        string randStr;
        for (int i = 0; i < 6; ++i) randStr.push_back(dict[rand() % 62]);
        // On a collision, re-roll one position at a time until the key is unused.
        while (short2long.count(randStr)) {
            randStr[idx] = dict[rand() % 62];
            idx = (idx + 1) % 6;  // cycle through all 6 positions
        }
        short2long[randStr] = longUrl;
        long2short[longUrl] = randStr;
        return "http://tinyurl.com/" + randStr;
    }

    // Decodes a shortened URL to its original URL.
    // An unknown short URL is returned unchanged.
    string decode(string shortUrl) {
        string randStr = shortUrl.substr(shortUrl.find_last_of("/") + 1);
        return short2long.count(randStr) ? short2long[randStr] : shortUrl;
    }

private:
    unordered_map<string, string> short2long, long2short;
    string dict;
};
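`rand() % 62` is generally slightly biased toward the low indices, and `srand(time(NULL))` reseeds the global generator with the same value for objects created within the same second. As an alternative sketch (a swapped-in technique, not what the referenced post uses), the key could be drawn with the C++11 `<random>` facilities. The class name `KeyGenerator` and its `next` method are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper that produces random base62 keys of a fixed length.
class KeyGenerator {
public:
    // The seed is a parameter so that tests can make the output reproducible;
    // by default it comes from std::random_device.
    explicit KeyGenerator(unsigned int seed = std::random_device{}())
        : engine_(seed), pick_(0, 61) {}

    // Returns a key of the requested length; each character is drawn
    // uniformly from the 62-character alphabet.
    std::string next(std::size_t length = 6) {
        static const char dict[] =
            "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
        std::string key;
        key.reserve(length);
        for (std::size_t i = 0; i < length; ++i) {
            key.push_back(dict[pick_(engine_)]);
        }
        return key;
    }

private:
    std::mt19937 engine_;
    std::uniform_int_distribution<int> pick_;
};
```

With such a helper, encode could keep calling `next()` until it gets a key that is not yet in short2long, which plays the same role as the re-roll loop in the code above.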