I have a very large table that I want to add a timestamp column to. It is a table that gets a lot more updates than inserts. I am trying to figure out a way to do this without taking the table out of production for a significant amount of time and it has me in knots. I can do:
alter table stuff add column mod_time timestamp;
I could do this, but then the table is locked for 3-5 hours. Not a happy time for users.
For adding a varchar column, for example, I could create a new table, add the column, copy the data over and then replace the old table with the new. This last switch can be done in just a second or two during a slow-ish time. No problems from that. Of course, I may need to do the copy in stages. I can do one copy, and then do a copy of everything that changed after the first copy started. I can repeat this until nothing has changed and do the swap. Of course, it is much easier to determine what changed if I have a "mod_time" column.
What I would really rather do is: (1) create the new table, (2) copy the data, (3) add the timestamp column, (4) swap the tables.
But step (3) takes, again, hours.
I want rows updated before this column was added to have a mod_time value set to NULL.
If I switch (2) and (3) in the steps above, the swap is doable, but every row gets a mod_time equal to the time I ran the copy. I want NULL.
I can try to set the values to NULL just before I do the swap, but of course if you update the row to set it to NULL, you are updating the row and the mod_time column gets set to the current time. :-)
I wish I could do: (1) create the new table, (2) add "mod_time" as a datetime column, (3) copy the data, (4) change mod_time to a timestamp column, (5) swap the tables.
Theoretically (4) could be done very quickly (assuming the storage for datetime and timestamp is compatible), because all I am really doing in changing the column to timestamp is changing its future behavior, not its current storage. So this should take no time at all, yes? Probably not.
Just to clarify, if I do:
alter table stuff add column mod_time timestamp;
I get:
+----------+--------+----------+---------------------+
| col1     | col2   | col3     | mod_time            |
+----------+--------+----------+---------------------+
| 5001     | 50     | 2463     | 0000-00-00 00:00:00 |
| 5002     | 50     | 2467     | 0000-00-00 00:00:00 |
| 5003     | 50     | 2459     | 0000-00-00 00:00:00 |
+----------+--------+----------+---------------------+
This is what I want. It is just that the alter statement will take too long. Everything else I am trying gives me:
+----------+--------+----------+---------------------+
| col1     | col2   | col3     | mod_time            |
+----------+--------+----------+---------------------+
| 5001     | 50     | 2463     | 2013-12-05 18:11:21 |
| 5002     | 50     | 2467     | 2013-12-05 18:11:21 |
| 5003     | 50     | 2459     | 2013-12-05 18:11:21 |
+----------+--------+----------+---------------------+
Solution
You should use pt-online-schema-change, a tool for automating ALTER TABLE operations for MySQL without locking the table. It's part of the Percona Toolkit, a free collection of indispensable helper tools for MySQL developers and DBAs.
$ pt-online-schema-change h=localhost,D=mydatabase,t=mytable \
--alter="add column mod_time timestamp" \
--execute
You can continue reading and writing the original table while pt-online-schema-change is working. When it's done, it automatically swaps the tables' names, and drops the original.
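To make the idea concrete, here is a toy sketch of the general shadow-table technique the tool is built around: create an altered copy, mirror writes into it with triggers, copy the existing rows in small chunks, then swap the names. It is not the tool's actual code. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL so that it runs anywhere, and the names add_column_online, stuff_new, on_chunk and the trigger names are made up for illustration. The mod_time column here is a plain TEXT column, so copied rows simply stay NULL; a MySQL TIMESTAMP column has its own default and auto-update rules and may behave differently.

```python
import sqlite3


def add_column_online(conn, chunk_size=1000, on_chunk=None):
    """Toy shadow-table migration that adds mod_time to the table `stuff`.

    Assumes `stuff` has columns (col1 INTEGER PRIMARY KEY, col2, col3).
    `on_chunk` is a hypothetical hook called after each chunk; it exists
    only so a test can simulate writes arriving during the copy.
    """
    # 1. Empty shadow table that already has the new column.
    conn.execute(
        "CREATE TABLE stuff_new ("
        "col1 INTEGER PRIMARY KEY, col2 INTEGER, col3 INTEGER, mod_time TEXT)"
    )

    # 2. Triggers mirror every write on the live table into the shadow table.
    conn.execute(
        "CREATE TRIGGER stuff_ins AFTER INSERT ON stuff BEGIN "
        "INSERT OR REPLACE INTO stuff_new (col1, col2, col3) "
        "VALUES (NEW.col1, NEW.col2, NEW.col3); END"
    )
    conn.execute(
        "CREATE TRIGGER stuff_upd AFTER UPDATE ON stuff BEGIN "
        "INSERT OR REPLACE INTO stuff_new (col1, col2, col3) "
        "VALUES (NEW.col1, NEW.col2, NEW.col3); END"
    )
    conn.execute(
        "CREATE TRIGGER stuff_del AFTER DELETE ON stuff BEGIN "
        "DELETE FROM stuff_new WHERE col1 = OLD.col1; END"
    )

    # 3. Copy existing rows in primary-key order, one small chunk at a time.
    #    OR IGNORE keeps any newer version a trigger has already written.
    last = -1  # assumes positive keys
    while True:
        keys = conn.execute(
            "SELECT col1 FROM stuff WHERE col1 > ? ORDER BY col1 LIMIT ?",
            (last, chunk_size),
        ).fetchall()
        if not keys:
            break
        low, high = keys[0][0], keys[-1][0]
        conn.execute(
            "INSERT OR IGNORE INTO stuff_new (col1, col2, col3) "
            "SELECT col1, col2, col3 FROM stuff WHERE col1 BETWEEN ? AND ?",
            (low, high),
        )
        conn.commit()  # short transactions: the live table is never held long
        last = high
        if on_chunk is not None:
            on_chunk(conn)

    # 4. Swap. In MySQL this step would be a single RENAME TABLE statement;
    #    here the triggers are dropped first, then the names are exchanged.
    for name in ("stuff_ins", "stuff_upd", "stuff_del"):
        conn.execute("DROP TRIGGER " + name)
    conn.execute("ALTER TABLE stuff RENAME TO stuff_old")
    conn.execute("ALTER TABLE stuff_new RENAME TO stuff")
    conn.execute("DROP TABLE stuff_old")
    conn.commit()
```

The point of the design is that the only moment the live table is unavailable is the final rename, which is the "second or two" swap described in the question; everything slow happens against the shadow copy.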
You can view a recording of a webinar about this tool by its designer. Viewing is free, but requires registration: http://www.percona.com/webinars/2012-05-02-zero-downtime-schema-changes-in-mysql