google和javaeye都很牛叉哈!
NAME
mk-table-sync - Synchronize MySQL tables efficiently. 同步mysql表
SYNOPSIS
This tool changes data, so for maximum safety, you should back up your data before you use it. 为安全最好先备份数据
To sync db.tbl1 from host1 to host2: 机器1 to 2
mk-table-sync --execute u=user,p=pass,h=host1,D=db,t=tbl host2
Sync all tables in host1 to host2 and host3:
mk-table-sync --execute host1 host2 host3
Resolve differences mk-table-checksum found on this master's slaves:
mk-table-sync --execute --replicate test.checksum master1
Make slave1 have the same data as its replication master:
mk-table-sync --execute --sync-to-master slave1
Ditto, resolving differences mk-table-checksum found:
mk-table-sync --execute --sync-to-master \ --replicate test.checksum slave1
Sync server2 in a master-master replication configuration, where server2's copy of db1.tbl1 is known or suspected to be incorrect:
mk-table-sync --execute --sync-to-master h=server2,D=db1,t=tbl1
Note that in the master-master configuration, the following will NOT do what you want, because it will make changes directly on server2, which will then flow through replication and change server1's data:
# Don't do this in a master-master setup! mk-table-sync --execute h=server1,D=db1,t=tbl1 h=server2
RISKS
The following section is included to inform users about the potential risks, whether known or unknown, of using this tool. The two main categories of risks are those created by the nature of the tool (e.g. read-only tools vs. read-write tools) and those created by bugs.
With great power comes great responsibility! This tool changes data, so it is a good idea to back up your data. It is also very powerful, which means it is very complex, so you should run it with the --dry-run option to see what it will do, until you're familiar with its operation. If you want to see which rows are different, without changing any data, use --print instead of --execute .
Be careful when using mk-table-sync in any master-master setup . Master-master replication is inherently tricky, and it's easy to make mistakes. Yyou need to be sure you're using the tool correctly for master-master replication. See the SYNOPSIS for the overview of the correct usage.
In general, this tool is best suited when your tables have a primary key or unique index . Although it can synchronize data in tables lacking a primary key or unique index, it might be best to sychronize that data by another means.
At the time of this release, there is a potential bug using --lock-and-rename with MySQL 5.1, a bug detecting certain differences, a bug using ROUND()
across different platforms, a bug mixing collations, and a bug chunking inside mk-table-checksum boundaries.
The authoritative source for updated information is always the online issue tracking system. Issues that affect this tool will be marked as such. You can see a list of such issues at the following URL: http://www.maatkit.org/bugs/mk-table-sync .
See also BUGS for more information on filing bugs and getting help.
DESCRIPTION
WARNING this tool is unfinished and could perform slowly. The Chunk algorithm is great when it can be used, and so is Nibble, but otherwise GroupBy is the default choice and it may not perform very well. Please run with --dry-run before subjecting your servers to this tool, and make backups of your data!
This tool is designed to do one-way and bidirectional synchronization of data. It finds differences efficiently with one of several algorithms (see ALGORITHMS ). It makes changes on the destination table(s) so it matches the source.
It does not synchronize table structures, indexes, or any other schema changes . It synchronizes only data.
It can operate through replication by comparing a slave with its master and making changes on the master . These changes will flow through replication and correct any differences found on the slave.
It accepts a list of DSNs (see the --help output) to tell it where and how to connect.
There are many ways to invoke it. The following is the abbreviated logic:
if DSN has a t part, sync only that table:
if 1 DSN:
if --sync-to-master:
The DSN is a slave. Connect to its master and sync.
if more than 1 DSN:
The first DSN is the source. Sync each DSN in turn.
else if --replicate:
if --sync-to-master:
The DSN is a slave. Connect to its master, find records
of differences, and fix.
else:
The DSN is the master. Find slaves and connect to each,
find records of differences, and fix.
else:
if only 1 DSN and --sync-to-master:
The DSN is a slave. Connect to its master, find tables and
filter with --databases etc, and sync each table to the master.
else:
find tables, filtering with --databases etc, and sync each
DSN to the first.
If you're confused about how it the DSNs are interpreted, use the --explain-hosts option and it will tell you.
OUTPUT
If you specify the --verbose option, you'll see information about the differences between the tables. There is one row per table. Each server is printed separately. For example,
# Syncing h=host1,D=test,t=test1 # DELETE REPLACE INSERT UPDATE ALGORITHM EXIT DATABASE.TABLE # 0 0 3 0 Chunk 2 test.test1
Table test.test1 on host1 required 3 INSERT
statements to synchronize and it used the Chunk algorithm (see ALGORITHMS ). Because differences were found, its EXIT STATUS was 2.
If you specify the --print option, you'll see the actual SQL statements that the script uses to synchronize the table if --execute is also specified.
If you want to see t he SQL statements that mk-table-sync is using to select chunks, nibbles, rows, etc., then specify --print once and --verbose twice . Be careful though: this can print a lot of SQL statements.
There are cases where no combination of INSERT
, UPDATE
or DELETE
statements can resolve differences without violating some unique key. For example, suppose there's a primary key on column a and a unique key on column b. Then there is no way to sync these two tables with straightforward UPDATE statements:
+---+---+ +---+---+ | a | b | | a | b | +---+---+ +---+---+ | 1 | 2 | | 1 | 1 | | 2 | 1 | | 2 | 2 | +---+---+ +---+---+
The tool rewrites queries to DELETE
and REPLACE
in this case. This is automatically handled after the first index violation, so you don't have to worry about it.
REPLICATION SAFETY
Synchronizing a replication master and slave safely is a non-trivial problem, in general. There are all sorts of issues to think about, such as other processes changing data, trying to change data on the slave, whether the destination and source are a master-master pair, and much more.
In general, the safe way to do it is to change the data on the master , and let the changes flow through replication to the slave like any other changes. However, this works only if it's possible to REPLACE into the table on the master. REPLACE works only if there's a unique index on the table (otherwise it just acts like an ordinary INSERT).
If your table has unique keys, you should use the --sync-to-master and/or --replicate options to sync a slave to its master. This will generally do the right thing. When there is no unique key on the table, there is no choice but to change the data on the slave, and mk-table-sync will detect that you're trying to do so. It will complain and die unless you specify --no-check-slave
(see --[no]check-slave ).
If you're syncing a table without a primary or unique key on a master-master p