Flickr's solution for generating globally unique IDs in a distributed system
Original article: http://code.flickr.com/blog/2010/02/08/ticket-servers-distributed-unique-primary-keys-on-the-cheap/
Ticket Servers: Distributed Unique Primary Keys on the Cheap
This is the first post in the
Ticket servers aren’t inherently interesting, but they’re an important building block at Flickr. They are core to topics we’ll be talking about later, like sharding and master-master. Ticket servers give us globally (Flickr-wide) unique integers to serve as primary keys in our distributed setup.
Why?
Sharding (aka
GUIDs?
Given the need for globally unique IDs, the obvious question is, why not use GUIDs? Mostly because GUIDs are big, and they index badly in MySQL. One of the ways we keep MySQL fast is we index everything we want to query on, and we only query on indexes. So index size is a key consideration. If you can’t keep your indexes in memory, you can’t keep your database fast. Additionally, ticket servers give us sequentiality, which has some really nice properties, including making reporting and debugging more straightforward, and enabling some caching hacks.
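To put a rough number on "GUIDs are big", here is a small Python comparison of how many bytes a key takes in each form. It is only an illustration of raw key width; real MySQL index overhead depends on the storage engine and column type, which this sketch does not model.

```python
import struct
import uuid

guid = uuid.uuid4()
ticket_id = 72157623227190423  # the sample 64-bit ID shown in the Tickets64 table in this post

guid_as_text = str(guid)                          # hyphenated hex string, e.g. for a CHAR(36) column
guid_as_binary = guid.bytes                       # packed form, e.g. for a BINARY(16) column
ticket_as_bigint = struct.pack(">Q", ticket_id)   # unsigned 64-bit integer, like a BIGINT column

print(len(guid_as_text), len(guid_as_binary), len(ticket_as_bigint))  # 36 16 8
```

Even in its most compact binary form, a GUID key is twice as wide as a 64-bit ticket, and it carries no ordering.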
Consistent Hashing?
Some projects like
Centralizing Auto-Increments
If we can’t make MySQL auto-increments work across multiple databases, what if we just used one database? If we inserted a new row into this one database every time someone uploaded a photo we could then just use the auto-incrementing ID from that table as the primary key for all of our databases.
Of course at 60+ photos a second that table is going to get pretty big. We can get rid of all the extra data about the photo, and just have the ID in the centralized database. Even then the table gets unmanageably big quickly. And there are comments, and favorites, and group postings, and tags, and so on, and those all need IDs too.
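As a back-of-the-envelope check on "pretty big", the arithmetic below takes the 60 photos per second figure as a flat, constant rate (an assumption; real traffic is bursty) and counts photo rows only.

```python
PHOTOS_PER_SECOND = 60            # lower bound quoted above
SECONDS_PER_DAY = 24 * 60 * 60

rows_per_day = PHOTOS_PER_SECOND * SECONDS_PER_DAY
rows_per_year = rows_per_day * 365

print(f"{rows_per_day:,} rows/day, {rows_per_year:,} rows/year")
# 5,184,000 rows/day, 1,892,160,000 rows/year
```

That is before counting comments, favorites, group postings, and tags.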
REPLACE INTO
A little over a decade ago MySQL shipped with a non-standard extension to the ANSI SQL spec,
REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted.
This allows us to atomically update in place a single row in a database, and get a new auto-incremented primary ID.
Putting It All Together
A Flickr ticket server is a dedicated database server, with a single database on it, and in that database there are tables like Tickets32 and Tickets64.
The Tickets64 schema looks like:
CREATE TABLE `Tickets64` (
  `id` bigint(20) unsigned NOT NULL auto_increment,
  `stub` char(1) NOT NULL default '',
  PRIMARY KEY (`id`),
  UNIQUE KEY `stub` (`stub`)
) ENGINE=MyISAM;
SELECT * from Tickets64
+-------------------+------+
| id                | stub |
+-------------------+------+
| 72157623227190423 | a    |
+-------------------+------+
When I need a new globally unique 64-bit ID I issue the following SQL:
REPLACE INTO Tickets64 (stub) VALUES ('a');
SELECT LAST_INSERT_ID();
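The REPLACE INTO trick can be tried locally without a MySQL server. The sketch below is not Flickr's code: it uses Python's built-in sqlite3 module as a stand-in, on the assumption that SQLite's REPLACE and AUTOINCREMENT behave closely enough to illustrate the idea. The helper names make_ticket_db and next_ticket are made up for this example, and last_insert_rowid() plays the role of MySQL's LAST_INSERT_ID().

```python
import sqlite3

def make_ticket_db():
    """Create an in-memory stand-in for the Tickets64 table (SQLite, not MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Tickets64 ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"   # AUTOINCREMENT: IDs are never reused
        " stub TEXT NOT NULL DEFAULT '' UNIQUE)"   # UNIQUE stub forces REPLACE to evict the old row
    )
    return conn

def next_ticket(conn):
    """Replace the single stub row and return the freshly assigned ID."""
    with conn:  # commits on success, rolls back on error
        conn.execute("REPLACE INTO Tickets64 (stub) VALUES ('a')")
        # SQLite's counterpart to MySQL's SELECT LAST_INSERT_ID()
        return conn.execute("SELECT last_insert_rowid()").fetchone()[0]

conn = make_ticket_db()
ids = [next_ticket(conn) for _ in range(5)]
row_count = conn.execute("SELECT COUNT(*) FROM Tickets64").fetchone()[0]
print(ids, row_count)
```

Each call hands out a larger ID than the one before, while the table itself never grows past one row, which is the whole point of the stub column.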
SPOFs
You really, really don’t want provisioning your IDs to be a single point of failure. We achieve “high availability” by running two ticket servers. At this write/update volume replicating between the boxes would be problematic, and locking would kill the performance of the site. We divide responsibility between the two boxes by dividing the ID space down the middle, evens and odds, using:
TicketServer1:
auto-increment-increment = 2
auto-increment-offset = 1
TicketServer2:
auto-increment-increment = 2
auto-increment-offset = 2
We round robin between the two servers to load balance and deal with down time. The sides do drift a bit out of sync, I think we have a few hundred thousand more odd-numbered objects than even-numbered objects at the moment, but this hurts no one.
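To see why the odd/even split stays collision-free even when one box is down, here is a small in-memory simulation. It is a hypothetical sketch, not Flickr's client code: FakeTicketServer and TicketClient are invented names, and each fake server simply mimics what auto-increment-increment and auto-increment-offset do to the sequence.

```python
import itertools

class FakeTicketServer:
    """In-memory stand-in for one ticket server (hypothetical, for illustration)."""
    def __init__(self, increment, offset):
        self.increment = increment   # mirrors auto-increment-increment
        self.next_id = offset        # mirrors auto-increment-offset
        self.up = True

    def new_id(self):
        if not self.up:
            raise ConnectionError("ticket server is down")
        value = self.next_id
        self.next_id += self.increment
        return value

class TicketClient:
    """Round-robins across servers and skips any that are down."""
    def __init__(self, servers):
        self.servers = servers
        self._turn = itertools.cycle(range(len(servers)))

    def new_id(self):
        for _ in range(len(self.servers)):       # try each server at most once
            server = self.servers[next(self._turn)]
            try:
                return server.new_id()
            except ConnectionError:
                continue                         # fall through to the other box
        raise RuntimeError("no ticket server available")

servers = [FakeTicketServer(2, 1), FakeTicketServer(2, 2)]  # odds, evens
client = TicketClient(servers)

healthy = [client.new_id() for _ in range(4)]    # [1, 2, 3, 4]
servers[1].up = False                            # the "evens" box goes down
degraded = [client.new_id() for _ in range(3)]   # [5, 7, 9]
servers[1].up = True
recovered = [client.new_id() for _ in range(2)]
print(healthy, degraded, recovered)
```

While the evens box is down, only odd IDs are issued, which is exactly the drift described above. When it comes back it resumes from where it left off, so the two sequences never overlap.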
More Sequences
We actually have more tables than just Tickets32 and Tickets64.
So There’s That
It’s not particularly elegant, but it works shockingly well for us, having been in production since Friday the 13th, January 2006, and is a great example of the Flickr engineering “dumbest possible thing that will work” approach.
More soon.