pt-table-checksum Usage Guide (annotated)

                               pt-table-checksum

2.27.1 NAME
pt-table-checksum - Verify MySQL replication integrity.  >>pt-table-checksum checks whether the data on the master and its replicas is consistent.



2.27.2 SYNOPSIS
Usage
pt-table-checksum [OPTIONS] [DSN]
pt-table-checksum performs an online replication consistency check by executing checksum queries on the master, which produces different results on replicas that are inconsistent with the master. The optional DSN specifies the master host. The tool’s “EXIT STATUS” is non-zero if any differences are found, or if any warnings or errors occur. The following command will connect to the replication master on localhost, checksum every table, and report the results on every detected replica:

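>>pt-table-checksum runs checksum queries on the master. Those queries replicate to the replicas and re-execute against the same tables there, which is why the checksum statements must be logged in statement format (binlog_format=STATEMENT). The tool then compares the master and replica checksums for each table; a mismatch means the data differs. If the tool finds a difference, or if any warning or error occurs, its exit status is non-zero. Conversely, an exit status of 0 means no problems were found.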

pt-table-checksum
This tool is focused on finding data differences efficiently. If any data is different, you can resolve the problem with pt-table-sync.



2.27.3 RISKS
Percona Toolkit is mature, proven in the real world, and well tested, but all database tools can pose a risk to the system and the database server. Before using this tool, please:
• Read the tool’s documentation
• Review the tool’s known “BUGS”
• Test the tool on a non-production server
• Backup your production server and verify the backups
See also “LIMITATIONS”.


2.27.4 DESCRIPTION
pt-table-checksum is designed to do the right thing by default in almost every case. When in doubt, use --explain to see how the tool will checksum a table. The following is a high-level overview of how the tool functions. In contrast to older versions of pt-table-checksum, this tool is focused on a single purpose, and does not have a lot of complexity or support many different checksumming techniques. It executes checksum queries on only one server, and these flow through replication to re-execute on replicas. If you need the older behavior, you can use Percona Toolkit version 1.0.  >>Because the checksum queries reach the replicas through the binary log, they must be logged in statement format (see “LIMITATIONS”).


pt-table-checksum connects to the server you specify, and finds databases and tables that match the filters you specify (if any). It works one table at a time, so it does not accumulate large amounts of memory or do a lot of work before beginning to checksum. This makes it usable on very large servers. We have used it on servers with hundreds of thousands of databases and tables, and trillions of rows. No matter how large the server is, pt-table-checksum works equally well.
One reason it can work on very large tables is that it divides each table into chunks of rows, and checksums each chunk with a single REPLACE..SELECT query. It varies the chunk size to make the checksum queries run in the desired amount of time. The goal of chunking the tables, instead of doing each table with a single big query, is to ensure that checksums are unintrusive and don’t cause too much replication lag or load on the server. That’s why the target time for each chunk is 0.5 seconds by default.  >>The target time per chunk is set with --chunk-time. Chunking avoids the large transactions that would otherwise cause replication lag and extra server load.
The tool keeps track of how quickly the server is able to execute the queries, and adjusts the chunks as it learns more about the server’s performance. It uses an exponentially decaying weighted average to keep the chunk size stable, yet remain responsive if the server’s performance changes during checksumming for any reason. This means that the tool will quickly throttle itself if your server becomes heavily loaded during a traffic spike or a background task, for example.
Chunking is accomplished by a technique that we used to call “nibbling” in other tools in Percona Toolkit. It is the same technique used for pt-archiver, for example. The legacy chunking algorithms used in older versions of pt-table-checksum are removed, because they did not result in predictably sized chunks, and didn’t work well on many tables.
All that is required to divide a table into chunks is an index of some sort (preferably a primary key or unique index). If there is no index, and the table contains a suitably small number of rows, the tool will checksum the table in a single chunk.


pt-table-checksum has many other safeguards to ensure that it does not interfere with any server’s operation, including replicas. To accomplish this, pt-table-checksum detects replicas and connects to them automatically. (If this fails, you can give it a hint with the --recursion-method option.)


The tool monitors replicas continually. If any replica falls too far behind in replication, pt-table-checksum pauses to allow it to catch up. If any replica has an error, or replication stops, pt-table-checksum pauses and waits. In addition, pt-table-checksum looks for common causes of problems, such as replication filters, and refuses to operate unless you force it to. Replication filters are dangerous, because the queries that pt-table-checksum executes could potentially conflict with them and cause replication to fail.  >>To force the tool to run despite replication filters, specify --no-check-replication-filters.


pt-table-checksum verifies that chunks are not too large to checksum safely. It performs an EXPLAIN query on each chunk, and skips chunks that might be larger than the desired number of rows. You can configure the sensitivity of this safeguard with the --chunk-size-limit option. If a table will be checksummed in a single chunk because it has a small number of rows, then pt-table-checksum additionally verifies that the table isn’t oversized on replicas. This avoids the following scenario: a table is empty on the master but is very large on a replica, and is checksummed in a single large query, which causes a very long delay in replication.
There are several other safeguards. For example, pt-table-checksum sets its session-level innodb_lock_wait_timeout to 1 second, so that if there is a lock wait, it will be the victim instead of causing other queries to time out. Another safeguard checks the load on the database server, and pauses if the load is too high. There is no single right answer for how to do this, but by default pt-table-checksum will pause if there are more than 25 concurrently executing queries. You should probably set a sane value for your server with the --max-load option.  >>The default --max-load value is Threads_running=25.
Checksumming usually is a low-priority task that should yield to other work on the server. However, a tool that must be restarted constantly is difficult to use. Thus, pt-table-checksum is very resilient to errors. For example, if the database administrator needs to kill pt-table-checksum’s queries for any reason, that is not a fatal error. Users often run pt-kill to kill any long-running checksum queries. The tool will retry a killed query once, and if it fails again, it will move on to the next chunk of that table. The same behavior applies if there is a lock wait timeout. The tool will print a warning if such an error happens, but only once per table. If the connection to any server fails, pt-table-checksum will attempt to reconnect and continue working.
If pt-table-checksum encounters a condition that causes it to stop completely, it is easy to resume it with the --resume option. It will begin from the last chunk of the last table that it processed. You can also safely stop the tool with CTRL-C. It will finish the chunk it is currently processing, and then exit. You can resume it as usual afterwards.
After pt-table-checksum finishes checksumming all of the chunks in a table, it pauses and waits for all detected replicas to finish executing the checksum queries. Once that is finished, it checks all of the replicas to see if they have the same data as the master, and then prints a line of output with the results. You can see a sample of its output later in this documentation.
The tool prints progress indicators during time-consuming operations. It prints a progress indicator as each table is checksummed. The progress is computed by the estimated number of rows in the table. It will also print a progress report when it pauses to wait for replication to catch up, and when it is waiting to check replicas for differences from the master. You can make the output less verbose with the --quiet option.
If you wish, you can query the checksum tables manually to get a report of which tables and chunks have differences from the master. The following query will report every database and table with differences, along with a summary of the number of chunks and rows possibly affected:
SELECT db, tbl, SUM(this_cnt) AS total_rows, COUNT(*) AS chunks
FROM percona.checksums
WHERE (
 master_cnt <> this_cnt
 OR master_crc <> this_crc
 OR ISNULL(master_crc) <> ISNULL(this_crc))
GROUP BY db, tbl;
The table referenced in that query is the checksum table, where the checksums are stored. Each row in the table contains the checksum of one chunk of data from some table in the server.
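The WHERE clause of the query above can be read as a per-row predicate. The following Python sketch mirrors that predicate and the GROUP BY summary on rows already fetched from the checksum table. The function names and the dictionary row layout are hypothetical and are not part of the tool; only the column names come from the query.

```python
from collections import defaultdict


def chunk_differs(master_cnt, this_cnt, master_crc, this_crc):
    """Return True when one checksum row matches the WHERE clause above."""
    # master_cnt <> this_cnt (a NULL on either side never compares as true in SQL)
    if master_cnt is not None and this_cnt is not None and master_cnt != this_cnt:
        return True
    # ISNULL(master_crc) <> ISNULL(this_crc): exactly one side is NULL
    if (master_crc is None) != (this_crc is None):
        return True
    # master_crc <> this_crc, only meaningful when both are non-NULL
    if master_crc is None:
        return False
    return master_crc != this_crc


def summarize(rows):
    """Group differing chunks by (db, tbl) -> (total_rows, chunks)."""
    totals = defaultdict(lambda: [0, 0])
    for r in rows:
        if chunk_differs(r["master_cnt"], r["this_cnt"],
                         r["master_crc"], r["this_crc"]):
            entry = totals[(r["db"], r["tbl"])]
            entry[0] += r["this_cnt"] or 0  # SUM() ignores NULL values
            entry[1] += 1                   # COUNT(*)
    return {key: tuple(value) for key, value in totals.items()}
```

This is only an illustration of the comparison logic; in practice the SQL query does the same work on the server.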
Version 2.0 of pt-table-checksum is not backwards compatible with pt-table-sync version 1.0. In some cases this is not a serious problem. Adding a “boundaries” column to the table, and then updating it with a manually generated WHERE clause, may suffice to let pt-table-sync version 1.0 interoperate with pt-table-checksum version 2.0. Assuming an integer primary key named ‘id’, you can try something like the following:
ALTER TABLE checksums ADD boundaries VARCHAR(500);
UPDATE checksums
SET boundaries = COALESCE(CONCAT('id BETWEEN ', lower_boundary,
' AND ', upper_boundary),'1=1');



2.27.5 LIMITATIONS


Replicas using row-based replication 
pt-table-checksum requires statement-based replication, and it sets binlog_format=STATEMENT on the master, but due to a MySQL limitation replicas do not honor this change. Therefore, checksums will not replicate past any replicas using row-based replication that are masters for further replicas. The tool automatically checks the binlog_format on all servers. See --[no]check-binlog-format.
(Bug 899415)


Schema and table differences
The tool presumes that schemas and tables are identical on the master and all replicas. Replication will break if, for example, a replica does not have a schema that exists on the master (and that schema is checksummed), or if the structure of a table on a replica is different than on the master.



2.27.6 Percona XtraDB Cluster
pt-table-checksum works with Percona XtraDB Cluster (PXC) 5.5.28-23.7 and newer. The number of possible Percona XtraDB Cluster setups is large given that it can be used with regular replication as well. Therefore, only the setups listed below are supported and known to work. Other setups, like cluster to cluster, are not supported and probably don’t work.
Except where noted, all of the following supported setups require that you use the dsn method for --recursion-method to specify cluster nodes. Also, the lag check (see “REPLICA CHECKS”) is not performed for cluster nodes.
Single cluster
The simplest PXC setup is a single cluster: all servers are cluster nodes, and there are no regular replicas. If all nodes are specified in the DSN table (see --recursion-method), then you can run the tool on any node and any diffs on any other nodes will be detected.
All nodes must be in the same cluster (have the same wsrep_cluster_name value), else the tool exits with an error. Although it’s possible to have different clusters with the same name, this should not be done and is not supported. This applies to all supported setups.
Single cluster with replicas
Cluster nodes can also be regular masters and replicate to regular replicas. However, the tool can only detect diffs on a replica if run on the replica’s “master node”. For example, if the cluster setup is,
node1 <-> node2 <-> node3
            |         |
            |         +-> replica3
            +-> replica2
you can detect diffs on replica3 by running the tool on node3, but to detect diffs on replica2 you must run the tool again on node2. If you run the tool on node1, it will not detect diffs on either replica.
Currently, the tool does not detect this setup or warn about replicas that cannot be checked (e.g. replica2 when running on node3).
Replicas in this setup are still subject to --[no]check-binlog-format.
Master to single cluster
It is possible for a regular master to replicate to a cluster, as if the cluster were one logical slave, like:
master -> node1 <-> node2 <-> node3
The tool supports this setup but only if run on the master and if all nodes in the cluster are consistent with the “direct replica” (node1 in this example) of the master. For example, if all nodes have value “foo” for row 1 but the master has value “bar” for the same row, this diff will be detected. Or if only node1 has this diff, it will also be detected. But if only node2 or node3 has this diff, it will not be detected. Therefore, this setup is used to check that the master and the cluster as a whole are consistent.
In this setup, the tool can automatically detect the “direct replica” (node1) when run on the master, so you do not have to use the dsn method for --recursion-method because node1 will represent the entire cluster, which is why all other nodes must be consistent with it.
The tool warns when it detects this setup to remind you that it only works when used as described above. These warnings do not affect the exit status of the tool; they’re only reminders to help avoid false-positive results.


2.27.7 OUTPUT

The tool prints tabular results, one line per table:
            TS ERRORS  DIFFS     ROWS  CHUNKS SKIPPED    TIME TABLE
10-20T08:36:50      0      0      200       1       0   0.005 db1.tbl1
10-20T08:36:50      0      0      603       7       0   0.035 db1.tbl2
10-20T08:36:50      0      0       16       1       0   0.003 db2.tbl3
10-20T08:36:50      0      0      600       6       0   0.024 db2.tbl4
Errors, warnings, and progress reports are printed to standard error. See also --quiet.
Each table’s results are printed when the tool finishes checksumming the table. The columns are as follows:
TS The timestamp (without the year) when the tool finished checksumming the table.
ERRORS The number of errors and warnings that occurred while checksumming the table. Errors and warnings are printed to standard error while the table is in progress.
DIFFS The number of chunks that differ from the master on one or more replicas. If --no-replicate-check is specified, this column will always have zeros. If --replicate-check-only is specified, then only tables with differences are printed.
ROWS The number of rows selected and checksummed from the table. It might be different from the number of rows in the table if you use the --where option.
CHUNKS The number of chunks into which the table was divided.
SKIPPED The number of chunks that were skipped due to one or more of these problems:
* MySQL not using the --chunk-index
* MySQL not using the full chunk index (--[no]check-plan)
* Chunk size is greater than --chunk-size * --chunk-size-limit
* Lock wait timeout exceeded (--retries)
* Checksum query killed (--retries)
As of pt-table-checksum 2.2.5, skipped chunks cause a non-zero “EXIT STATUS”.
TIME The time elapsed while checksumming the table.
TABLE The database and table that was checksummed.  >>The format is database.table.


If --replicate-check-only is specified, only checksum differences on detected replicas are printed.  >>In this mode the tool runs no new checksums; it only examines the results stored by a previous run and reports the tables that differ.

The output is different: one paragraph per replica, one checksum difference per line, and values are separated by spaces:
Differences on h=127.0.0.1,P=12346
TABLE CHUNK CNT_DIFF CRC_DIFF CHUNK_INDEX LOWER_BOUNDARY UPPER_BOUNDARY
db1.tbl1 1 0 1 PRIMARY 1 100
db1.tbl1 6 0 1 PRIMARY 501 600
Differences on h=127.0.0.1,P=12347
TABLE CHUNK CNT_DIFF CRC_DIFF CHUNK_INDEX LOWER_BOUNDARY UPPER_BOUNDARY
db1.tbl1 1 0 1 PRIMARY 1 100
db2.tbl2 9 5 0 PRIMARY 101 200
The first line of a paragraph indicates the replica with differences. In this example there are two: h=127.0.0.1,P=12346 and h=127.0.0.1,P=12347. The columns are as follows:
TABLE The database and table that differs from the master.
CHUNK The chunk number of the table that differs from the master.
CNT_DIFF The number of chunk rows on the replica minus the number of chunk rows on the master.
CRC_DIFF 1 if the CRC of the chunk on the replica is different than the CRC of the chunk on the master, else 0.
CHUNK_INDEX The index used to chunk the table.
LOWER_BOUNDARY The index values that define the lower boundary of the chunk.
UPPER_BOUNDARY The index values that define the upper boundary of the chunk.
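Because the --replicate-check-only output is space-separated with one paragraph per replica, it is straightforward to parse in a script. The sketch below is a minimal illustration based only on the sample output shown above. The function name parse_differences is hypothetical, and the code assumes that boundary values contain no spaces, which may not hold for every index.

```python
def parse_differences(text):
    """Parse --replicate-check-only style output into {replica: [row dicts]}."""
    result = {}
    replica = None
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Differences on "):
            # A new paragraph starts: remember which replica it describes
            replica = line[len("Differences on "):]
            result[replica] = []
            header = None
        elif replica is not None and header is None:
            # The first line after the paragraph title is the column header
            header = line.split()
        elif replica is not None:
            # Assumption: no value contains a space, so a plain split is enough
            result[replica].append(dict(zip(header, line.split())))
    return result
```

All values are kept as strings; convert CHUNK and CNT_DIFF to integers if you need arithmetic on them.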


2.27.8 EXIT STATUS
pt-table-checksum has three possible exit statuses: zero, 255, and any other value is a bitmask with flags for different problems.
A zero exit status indicates no errors, warnings, or checksum differences, or skipped chunks or tables.
A 255 exit status indicates a fatal error. In other words: the tool died or crashed. The error is printed to STDERR.
If the exit status is not zero or 255, then its value functions as a bitmask with these flags:
FLAG              BIT VALUE  MEANING
================  =========  ==========================================
ERROR                     1  A non-fatal error occurred
ALREADY_RUNNING           2  --pidfile exists and the PID is running
CAUGHT_SIGNAL             4  Caught SIGHUP, SIGINT, SIGPIPE, or SIGTERM
NO_SLAVES_FOUND           8  No replicas or cluster nodes were found
TABLE_DIFF               16  At least one diff was found
SKIP_CHUNK               32  At least one chunk was skipped
SKIP_TABLE               64  At least one table was skipped
If any flag is set, the exit status will be non-zero. Use the bitwise AND operation to check for a particular flag. For example, if $exit_status & 16 is true, then at least one diff was found.
As of pt-table-checksum 2.2.5, skipped chunks cause a non-zero exit status. An exit status of zero or 32 is equivalent to a zero exit status with skipped chunks in previous versions of the tool.
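A wrapper script might decode the exit status with the bitwise AND described above. The following Python sketch uses the flag names and bit values from the table; the function name decode_exit_status and the "FATAL" label for status 255 are illustrative choices, not something the tool defines.

```python
# Bit values and flag names as listed in the EXIT STATUS table
FLAGS = {
    1: "ERROR",
    2: "ALREADY_RUNNING",
    4: "CAUGHT_SIGNAL",
    8: "NO_SLAVES_FOUND",
    16: "TABLE_DIFF",
    32: "SKIP_CHUNK",
    64: "SKIP_TABLE",
}


def decode_exit_status(status):
    """Return the list of flag names set in a pt-table-checksum exit status."""
    if status == 0:
        return []            # no errors, warnings, diffs, or skipped chunks/tables
    if status == 255:
        return ["FATAL"]     # hypothetical label: the tool died or crashed
    # Any other value is a bitmask: test each flag with bitwise AND
    return [name for bit, name in FLAGS.items() if status & bit]
```

For example, an exit status of 48 (16 + 32) decodes to TABLE_DIFF and SKIP_CHUNK.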



2.27.9 OPTIONS  
This tool accepts additional command-line arguments. Refer to the “SYNOPSIS” and usage information for details.
--ask-pass  >>Prompts for the password interactively; --password, by contrast, takes the password directly on the command line.
group: Connection
Prompt for a password when connecting to MySQL.


--[no]check-binlog-format
default: yes
Check that the binlog_format is the same on all servers.
See “Replicas using row-based replication” under “LIMITATIONS”.


--binary-index
This option modifies the behavior of --create-replicate-table such that the replicate table’s upper and lower boundary columns are created with the BLOB data type. This is useful in cases where you have trouble checksumming tables with keys that include a binary data type or that have non-standard character sets.
See --replicate.


--check-interval
type: time; default: 1; group: Throttle
Sleep time between checks for --max-lag.

--[no]check-plan
default: yes
Check query execution plans for safety. By default, this option causes pt-table-checksum to run EXPLAIN before running queries that are meant to access a small amount of data, but which could access many rows if MySQL chooses a bad execution plan. These include the queries to determine chunk boundaries and the chunk queries themselves. If it appears that MySQL will use a bad query execution plan, the tool will skip the chunk of the table.
The tool uses several heuristics to determine whether an execution plan is bad. The first is whether EXPLAIN reports that MySQL intends to use the desired index to access the rows. If MySQL chooses a different index, the tool considers the query unsafe.
The tool also checks how much of the index MySQL reports that it will use for the query. The EXPLAIN output shows this in the key_len column. The tool remembers the largest key_len seen, and skips chunks where MySQL reports that it will use a smaller prefix of the index. This heuristic can be understood as skipping chunks that have a worse execution plan than other chunks.  >>This second heuristic matters for multi-column indexes, where MySQL may use only some of the leading columns.
The tool prints a warning the first time a chunk is skipped due to a bad execution plan in each table. Subsequent chunks are skipped silently, although you can see the count of skipped chunks in the SKIPPED column in the tool’s output.
This option adds some setup work to each table and chunk. Although the work is not intrusive for MySQL, it results in more round-trips to the server, which consumes time. Making chunks too small will cause the overhead to become relatively larger. It is therefore recommended that you not make chunks too small, because the tool may take a very long time to complete if you do.


--[no]check-replication-filters
default: yes; group: Safety
Do not checksum if any replication filters are set on any replicas. The tool looks for server options that filter replication, such as binlog_ignore_db and replicate_do_db. If it finds any such filters, it aborts with an error. If the replicas are configured with any filtering options, you should be careful not to checksum any databases or tables that exist on the master and not the replicas. Changes to such tables might normally be skipped on the replicas because of the filtering options, but the checksum queries modify the contents of the table that stores the checksums, not the tables whose data you are checksumming. Therefore, these queries will be executed on the replica, and if the table or database you’re checksumming does not exist, the queries will cause replication to fail. For more information on replication rules, see http://dev.mysql.com/doc/en/replication-rules.html. Replication filtering makes it impossible to be sure that the checksum queries won’t break replication (or simply fail to replicate). If you are sure that it’s OK to run the checksum queries, you can negate this option to disable the checks. See also --replicate-database.
See also “REPLICA CHECKS”.  >>To disable the check, specify --no-check-replication-filters. In that case, consider restricting the run to specific databases and tables rather than checksumming everything.


--check-slave-lag
type: string; group: Throttle
Pause checksumming until this replica’s lag is less than --max-lag. The value is a DSN that inherits properties from the master host and the connection options (--port, --user, etc.). By default, pt-table-checksum monitors lag on all connected replicas, but this option limits lag monitoring to the specified replica. This is useful if certain replicas are intentionally lagged (with pt-slave-delay for example), in which case you can specify a normal replica to monitor.
See also “REPLICA CHECKS”.


--[no]check-slave-tables
default: yes; group: Safety
Checks that tables on slaves exist and have all the checksum --columns. Tables missing on slaves or not having all the checksum --columns can cause the tool to break replication when it tries to check for differences. Only disable this check if you are aware of the risks and are sure that all tables on all slaves exist and are identical to the master.


--chunk-index
type: string
Prefer this index for chunking tables. By default, pt-table-checksum chooses the most appropriate index for chunking. This option lets you specify the index that you prefer. If the index doesn’t exist, then pt-table-checksum will fall back to its default behavior of choosing an index. pt-table-checksum adds the index to the checksum SQL statements in a FORCE INDEX clause. Be careful when using this option; a poor choice of index could cause bad performance. This is probably best to use when you are checksumming only a single table, not an entire server.


--chunk-index-columns
type: int
Use only this many left-most columns of a --chunk-index. This works only for compound indexes, and is useful in cases where a bug in the MySQL query optimizer (planner) causes it to scan a large range of rows instead of using the index to locate starting and ending points precisely. This problem sometimes occurs on indexes with many columns, such as 4 or more. If this happens, the tool might print a warning related to the --[no]check-plan option. Instructing the tool to use only the first N columns of the index is a workaround for the bug in some cases.


--chunk-size
type: size; default: 1000
Number of rows to select for each checksum query. Allowable suffixes are k, M, G. You should not use this option in most cases; prefer --chunk-time instead.
This option can override the default behavior, which is to adjust chunk size dynamically to try to make chunks run in exactly --chunk-time seconds. When this option isn’t set explicitly, its default value is used as a starting point, but after that, the tool ignores this option’s value. If you set this option explicitly, however, then it disables the dynamic adjustment behavior and tries to make all chunks exactly the specified number of rows. There is a subtlety: if the chunk index is not unique, then it’s possible that chunks will be larger than desired. For example, if a table is chunked by an index that contains 10,000 of a given value, there is no way to write a WHERE clause that matches only 1,000 of the values, and that chunk will be at least 10,000 rows large. Such a chunk will probably be skipped because of --chunk-size-limit. Selecting a small chunk size will cause the tool to become much slower, in part because of the setup work required for --[no]check-plan.


--chunk-size-limit
type: float; default: 2.0; group: Safety
Do not checksum chunks this much larger than the desired chunk size. When a table has no unique indexes, chunk sizes can be inaccurate. This option specifies a maximum tolerable limit to the inaccuracy. The tool uses EXPLAIN to estimate how many rows are in the chunk. If that estimate exceeds the desired chunk size times the limit (twice as large, by default), then the tool skips the chunk. The minimum value for this option is 1, which means that no chunk can be larger than --chunk-size. You probably don’t want to specify 1, because rows reported by EXPLAIN are estimates, which can be different from the real number of rows in the chunk. If the tool skips too many chunks because they are oversized, you might want to specify a value larger than the default of 2. You can disable oversized chunk checking by specifying a value of 0.  >>A chunk whose size exceeds chunk-size * chunk-size-limit is not checksummed (by default the chunk size is adjusted automatically from --chunk-time). When a table has no unique index (primary key or unique key), chunks may be sized inaccurately, so --chunk-size-limit together with --chunk-size sets the upper bound. pt-table-checksum estimates the chunk size with EXPLAIN and skips the chunk if the estimate is larger than chunk-size * chunk-size-limit. The minimum value is 1 (any chunk larger than --chunk-size is skipped), which is rarely what you want. The default is 2; a value of 0 disables the limit.
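>>The oversize rule can be sketched in Python. This is only an illustration under stated assumptions, not the tool's code: the function name should_skip_chunk and its parameters are hypothetical, and estimated_rows stands for the row estimate that EXPLAIN reports.

```python
def should_skip_chunk(estimated_rows, chunk_size=1000, chunk_size_limit=2.0):
    """Return True if a chunk counts as oversized and should be skipped.

    estimated_rows   -- row estimate for the chunk (from EXPLAIN in the real tool)
    chunk_size       -- desired chunk size in rows
    chunk_size_limit -- tolerated multiple of chunk_size; 0 disables the check
    """
    if chunk_size_limit == 0:
        return False  # oversized-chunk checking is disabled
    if chunk_size_limit < 1:
        raise ValueError("chunk_size_limit must be 0 or at least 1")
    return estimated_rows > chunk_size * chunk_size_limit


# The non-unique index example from --chunk-size: 10,000 rows share one value.
print(should_skip_chunk(10_000))                      # oversized with the defaults
print(should_skip_chunk(1_500))                       # within 2x of 1000 rows
print(should_skip_chunk(10_000, chunk_size_limit=0))  # check disabled
```

With the defaults, a chunk is skipped only when the estimate is above 2000 rows, which is why a limit of exactly 1 tends to skip chunks on estimate noise alone.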

--chunk-time

type: float; default: 0.5
Adjust the chunk size dynamically so each checksum query takes this long to execute. The tool tracks the checksum rate (rows per second) for all tables and each table individually. It uses these rates to adjust the chunk size after each checksum query, so that the next checksum query takes this amount of time (in seconds) to execute.
The algorithm is as follows: at the beginning of each table, the chunk size is initialized from the overall average rows per second since the tool began working, or the value of --chunk-size if the tool hasn’t started working yet. For each subsequent chunk of a table, the tool adjusts the chunk size to try to make queries run in the desired amount of time. It keeps an exponentially decaying moving average of queries per second, so that if the server’s performance changes due to changes in server load, the tool adapts quickly. This allows the tool to achieve predictably timed queries for each table, and for the server overall. If this option is set to zero, the chunk size doesn’t auto-adjust, so query checksum times will vary, but query checksum sizes will not. Another way to do the same thing is to specify a value for --chunk-size explicitly, instead of leaving it at the default.  >>The default is 0.5 seconds. pt-table-checksum adjusts the chunk size dynamically so that the checksum of each chunk completes within --chunk-time. While checksumming a table it tracks the checksum rate (rows/second) and sizes the next chunk from the rate of the previous chunk of that table. The chunk size is chosen as follows: the first chunk of the first table after startup uses the default --chunk-size; the first chunk of every later table is sized from the average checksum rate since pt-table-checksum started; every subsequent chunk of a table is sized from the rate of the previous chunk of that table. This lets pt-table-checksum adapt quickly to changes in server load. If --chunk-size is set explicitly, or --chunk-time is 0, the chunk size is not adjusted and every chunk has the same size, determined by --chunk-size.
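>>A minimal Python sketch of this kind of adjustment, assuming a decayed average of rows and seconds. The class name ChunkSizer, the weight of 0.75, and the exact update formula are assumptions made for illustration; the text above only says that an exponentially decaying moving average is kept.

```python
class ChunkSizer:
    """Adapts chunk size so each checksum query takes about target_time seconds."""

    def __init__(self, target_time=0.5, initial_chunk_size=1000, weight=0.75):
        self.target_time = target_time
        self.chunk_size = initial_chunk_size
        self.weight = weight   # assumed decay factor: older chunks count less
        self.avg_rows = 0.0    # decayed sum of rows checksummed
        self.avg_secs = 0.0    # decayed sum of seconds spent

    def update(self, rows, seconds):
        """Record one finished chunk and return the size for the next chunk."""
        if self.target_time == 0:
            return self.chunk_size  # a chunk time of 0 means no auto-adjustment
        self.avg_rows = self.weight * self.avg_rows + rows
        self.avg_secs = self.weight * self.avg_secs + seconds
        if self.avg_secs <= 0:
            return self.chunk_size  # nothing measurable yet
        rate = self.avg_rows / self.avg_secs  # rows per second
        self.chunk_size = max(1, int(rate * self.target_time))
        return self.chunk_size


sizer = ChunkSizer()
first = sizer.update(rows=1000, seconds=0.25)  # fast chunk: next chunk grows
second = sizer.update(rows=2000, seconds=2.0)  # slow chunk: next chunk shrinks
print(first, second)
```

Because old samples decay, a sudden slowdown shrinks the next chunk after a single observation instead of being averaged away by a long history.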


--columns
short form: -c; type: array; group: Filter
Checksum only this comma-separated list of columns. If a table doesn’t have any of the specified columns it will be skipped. This option applies to all tables, so it really only makes sense when checksumming one table unless the tables have a common set of columns.  >>Several columns can be given, separated by commas. A table that contains none of the columns listed in --columns is skipped. The option applies to every table being checked, so it is normally used when checking a single table or tables that share the same structure.


--config
type: Array; group: Config
Read this comma-separated list of config files; if specified, this must be the first option on the command line. See the --help output for a list of default config files.  >>pt-table-checksum options can be kept in a file and loaded with --config (when used, --config must come first on the command line).


--[no]create-replicate-table
default: yes
Create the --replicate database and table if they do not exist. The structure of the replicate table is the same as the suggested table mentioned in --replicate.  >>If the database and table that store the checksum results (percona.checksums by default) do not exist, they are created automatically. Specify --no-create-replicate-table to disable automatic creation.


--databases
short form: -d; type: hash; group: Filter
Only checksum this comma-separated list of databases.  >>Check only the databases named by this option; separate several databases with commas.


--databases-regex
type: string; group: Filter
Only checksum databases whose names match this Perl regex.  >>Same as --databases, except that database names are matched with a regular expression.


--defaults-file
short form: -F; type: string; group: Connection
Only read mysql options from the given file. You must give an absolute pathname.  >>Specifies the MySQL configuration file.


--[no]empty-replicate-table
default: yes
Delete previous checksums for each table before checksumming the table. This option does not truncate the entire table, it only deletes rows (checksums) for each table just before checksumming the table. Therefore, if checksumming stops prematurely and there was preexisting data, there will still be rows for tables that were not checksummed before the tool was stopped. If you’re resuming from a previous checksum run, then the checksum records for the table from which the tool resumes won’t be emptied. To empty the entire replicate table, you must manually execute TRUNCATE TABLE before running the tool.  >>By default, the results of the previous check of a table (the rows for that table in the --replicate table) are deleted just before the table is checksummed. Only the old rows for that table, if any, are deleted; rows for other tables are left alone. To empty the whole replicate table before running pt-table-checksum again, run TRUNCATE TABLE manually.


--engines
short form: -e; type: hash; group: Filter
Only checksum tables which use these storage engines.  >>Check only tables that use the given storage engines.


--explain
cumulative: yes; default: 0; group: Output
Show, but do not execute, checksum queries (disables --[no]empty-replicate-table). If specified twice, the tool actually iterates through the chunking algorithm, printing the upper and lower boundary values for each chunk, but not executing the checksum queries.  >>Shows what would be run, without actually executing the checksum.


--float-precision
type: int
Precision for FLOAT and DOUBLE number-to-string conversion. Causes FLOAT and DOUBLE values to be rounded to the specified number of digits after the decimal point, with the ROUND() function in MySQL. This can help avoid checksum mismatches due to different floating-point representations of the same values on different MySQL versions and hardware. The default is no rounding; the values are converted to strings by the CONCAT() function, and MySQL chooses the string representation. If you specify a value of 2, for example, then the values 1.008 and 1.009 will be rounded to 1.01, and will checksum as equal.  >>Sets the precision used when FLOAT and DOUBLE values are converted from number to string. When given, each number is passed through ROUND() with --float-precision digits before conversion. By default no rounding is done and CONCAT() converts the number to a string directly. In some cases, for example with different hardware or MySQL versions, the same value has different floating-point representations, which can make pt-table-checksum report differences that do not really exist.
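>>The 1.008 / 1.009 example can be reproduced with a short Python sketch. It uses decimal strings and half-up rounding to stand in for the ROUND() function in MySQL; the helper name round_for_checksum is hypothetical, and ROUND() applied to approximate FLOAT or DOUBLE values may behave differently at exact halfway cases.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_for_checksum(value, precision):
    """Round a decimal string to `precision` digits after the decimal point."""
    quantum = Decimal(10) ** -precision  # precision=2 gives Decimal('0.01')
    return str(Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP))


a = round_for_checksum("1.008", 2)
b = round_for_checksum("1.009", 2)
print(a, b, a == b)  # both values collapse to the same string before hashing
```

Since the checksum is computed over the string form of each value, two values that round to the same string also checksum as equal.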


--function
type: string
Hash function for checksums (FNV1A_64, MURMUR_HASH, SHA1, MD5, CRC32, etc). The default is to use CRC32(), but MD5() and SHA1() also work, and you can use your own function, such as a compiled UDF, if you wish. The function you specify is run in SQL, not in Perl, so it must be available to MySQL. MySQL doesn’t have good built-in hash functions that are fast. CRC32() is too prone to hash collisions, and MD5() and SHA1() are very CPU-intensive. The FNV1A_64() UDF that is distributed with Percona Server is a faster alternative. It is very simple to compile and install; look at the header in the source code for instructions. If it is installed, it is preferred over MD5(). You can also use the MURMUR_HASH() function if you compile and install that as a UDF; the source is also distributed with Percona Server, and it might be better than FNV1A_64().  >>The hash function used for checksums. CRC32() is the default, but MD5() and SHA1() can also be used, and so can a function of your own (a UDF; all of these are MySQL functions, not Perl functions). MySQL does not ship a really good built-in hash function: CRC32() is prone to collisions, and MD5() and SHA1() are CPU-intensive. The FNV1A_64() UDF provided by Percona is a good alternative that is easy to compile and install. MURMUR_HASH(), also provided by Percona, might be better still than FNV1A_64().


--help
group: Help
Show help and exit.  >>Displays the pt-table-checksum help text.


--host
short form: -h; type: string; default: localhost; group: Connection
Host to connect to.  >>Host name or IP address of the database instance.


--ignore-columns
type: Hash; group: Filter
Ignore this comma-separated list of columns when calculating the checksum. If a table has all of its columns filtered by --ignore-columns, it will be skipped.  >>Leave the given columns out of the checksum calculation. If every column of a table is listed in --ignore-columns, the table is skipped.


--ignore-databases
type: Hash; group: Filter
Ignore this comma-separated list of databases.  >>Skip tables in the given databases; separate several databases with commas.

--ignore-databases-regex

type: string; group: Filter
Ignore databases whose names match this Perl regex.  >>Same as --ignore-databases, except that database names are matched with a regular expression.


--ignore-engines
type: Hash; default: FEDERATED,MRG_MyISAM; group: Filter
Ignore this comma-separated list of storage engines.  >>Skip tables that use the given storage engines.


--ignore-tables
type: Hash; group: Filter
Ignore this comma-separated list of tables. Table names may be qualified with the database name. The --replicate table is always automatically ignored.  >>Skip the given tables. The table that stores the checksum results (the --replicate table, percona.checksums by default) is always ignored.


--ignore-tables-regex
type: string; group: Filter
Ignore tables whose names match the Perl regex.  >>Same as --ignore-tables, except that table names are matched with a regular expression.


--max-lag
type: time; default: 1s; group: Throttle
Pause checksumming until all replicas’ lag is less than this value. After each checksum query (each chunk), pt-table-checksum looks at the replication lag of all replicas to which it connects, using Seconds_Behind_Master. If any replica is lagging more than the value of this option, then pt-table-checksum will sleep for --check-interval seconds, then check all replicas again. If you specify --check-slave-lag, then the tool only examines that server for lag, not all servers. The tool waits forever for replicas to stop lagging. If any replica is stopped, the tool waits forever until the replica is started. Checksumming continues once all replicas are running and not lagging too much. The tool prints progress reports while waiting. If a replica is stopped, it prints a progress report immediately, then again at every progress report interval.
See also “REPLICA CHECKS”.  >>Checksumming pauses while any replica lags by more than --max-lag. After each chunk, pt-table-checksum checks the lag (Seconds_Behind_Master) of every replica. If any replica exceeds --max-lag, the tool sleeps for --check-interval seconds and then checks all replicas again. --check-slave-lag restricts the check to one replica instead of all of them. While a replica stays above --max-lag the tool keeps waiting, and if replication is stopped on a replica it waits until replication resumes. Checksumming continues once replication is running on all replicas and their lag is within --max-lag. The tool prints progress reports whenever it is waiting on lagging or stopped replicas.
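>>The waiting behavior can be sketched as a small polling loop in Python. Everything named here is hypothetical: get_lags stands for whatever reads Seconds_Behind_Master from each replica (None meaning the replica is stopped), and the 1-second check_interval default is an assumption of this sketch.

```python
import time


def wait_for_replicas(get_lags, max_lag=1.0, check_interval=1.0, sleep=time.sleep):
    """Block until every replica is running and lags at most max_lag seconds.

    get_lags -- callable returning one lag value per replica, None if stopped
    Returns how many times the loop had to sleep.
    """
    sleeps = 0
    while True:
        lags = get_lags()
        if all(lag is not None and lag <= max_lag for lag in lags):
            return sleeps
        sleep(check_interval)  # loops forever if a replica never recovers
        sleeps += 1


# Simulated readings: one replica lags, then is stopped, then catches up.
samples = iter([[5, 0], [None, 0], [0, 0]])
print(wait_for_replicas(lambda: next(samples), sleep=lambda seconds: None))
```

There is deliberately no timeout in the loop, which mirrors the statement that the tool waits forever for replicas to stop lagging.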


--max-load
type: Array; default: Threads_running=25; group: Throttle
Examine SHOW GLOBAL STATUS after every chunk, and pause if any status variables are higher than the threshold. The option accepts a comma-separated list of MySQL status variables to check for a threshold. An optional =MAX_VALUE (or :MAX_VALUE) can follow each variable. If not given, the tool determines a threshold by examining the current value and increasing it by 20%. For example, if you want the tool to pause when Threads_connected gets too high, you can specify “Threads_connected”, and the tool will check the current value when it starts working and add 20% to that value. If the current value is 100, then the tool will pause when Threads_connected exceeds 120, and resume working when it is below 120 again. If you want to specify an explicit threshold, such as 110, you can use either “Threads_connected:110” or “Threads_connected=110”. The purpose of this option is to prevent the tool from adding too much load to the server. If the checksum queries are intrusive, or if they cause lock waits, then other queries on the server will tend to block and queue. This will typically cause Threads_running to increase, and the tool can detect that by running SHOW GLOBAL STATUS immediately after each checksum query finishes. If you specify a threshold for this variable, then you can instruct the tool to wait until queries are running normally again. This will not prevent queueing, however; it will only give the server a chance to recover from the queueing. If you notice queueing, it is best to decrease the chunk time.  >>After each chunk, pt-table-checksum checks the MySQL status (SHOW GLOBAL STATUS) and pauses if any watched value exceeds its --max-load setting. Any variable from SHOW GLOBAL STATUS can be used, and several can be listed separated by commas. A variable can be written as Threads_running=25, Threads_running:25, or just Threads_running; without an explicit threshold, 120% of the current value is used. For example, with --max-load=Threads_connected and a current value of 100, pt-table-checksum pauses when Threads_connected exceeds 120 and resumes once it drops below 120 again. An explicit threshold looks like Threads_connected=110 or Threads_connected:110. The main purpose of --max-load is to keep pt-table-checksum from overloading the database server.
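>>The threshold rules can be sketched in Python as follows. The function names parse_max_load and should_pause are hypothetical, and current_status is a plain dict standing in for the output of SHOW GLOBAL STATUS.

```python
def parse_max_load(spec, current_status):
    """Turn a --max-load style string into {variable: threshold}.

    spec           -- e.g. "Threads_running=25,Threads_connected"
    current_status -- current values, as SHOW GLOBAL STATUS would report them
    """
    thresholds = {}
    for item in spec.split(","):
        item = item.strip()
        for sep in ("=", ":"):
            if sep in item:
                name, value = item.split(sep, 1)
                thresholds[name] = int(value)
                break
        else:
            # No explicit maximum given: current value plus 20%.
            current = current_status[item]
            thresholds[item] = current + current // 5
    return thresholds


def should_pause(thresholds, status):
    """True if any watched variable is above its threshold."""
    return any(status[name] > limit for name, limit in thresholds.items())


limits = parse_max_load("Threads_running=25,Threads_connected",
                        {"Threads_connected": 100})
print(limits)
print(should_pause(limits, {"Threads_running": 3, "Threads_connected": 121}))
```

The implicit threshold is computed once from the value seen at startup, so a server that is already busy when the tool starts gets a correspondingly higher limit.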


--password
short form: -p; type: string; group: Connection
Password to use when connecting. If password contains commas they must be escaped with a backslash: “exam\,ple”   >>Password for the database login (commas in the password must be escaped with a backslash).


--pid
type: string
Create the given PID file. The tool won’t start if the PID file already exists and the PID it contains is different than the current PID. However, if the PID file exists and the PID it contains is no longer running, the tool will overwrite the PID file with the current PID. The PID file is removed automatically when the tool exits.  >>pt-table-checksum writes a PID file recording its process ID when it starts; --pid sets the file name. If the file already exists and contains a PID different from the current one, the tool does not start. If the file exists but the recorded process is no longer running, the tool overwrites the file with its own PID and starts.


--plugin
type: string
Perl module file that defines a pt_table_checksum_plugin class. A plugin allows you to write a Perl module that can hook into many parts of pt-table-checksum. This requires a good knowledge of Perl and Percona Toolkit conventions, which are beyond the scope of this documentation. Please contact Percona if you have questions or need help.
See “PLUGIN” for more information.  >>This option is rarely needed.


--port
short form: -P; type: int; group: Connection
Port number to use for connection.  >>Port of the database instance to connect to.


--progress
type: array; default: time,30
Print progress reports to STDERR. The value is a comma-separated list with two parts. The first part can be percentage, time, or iterations; the second part specifies how often an update should be printed, in percentage, seconds, or number of iterations. The tool prints progress reports for a variety of time-consuming operations, including waiting for replicas to catch up if they become lagged.  >>During time-consuming operations (such as pausing for lagging replicas) pt-table-checksum prints progress reports to standard error. The value has two comma-separated parts: the first is percentage, time, or iterations; the second sets how often a report is printed.


--quiet
short form: -q; cumulative: yes; default: 0
Print only the most important information (disables --progress). Specifying this option once causes the tool to print only errors, warnings, and tables that have checksum differences. Specifying this option twice causes the tool to print only errors. In this case, you can use the tool’s exit status to determine if there were any warnings or checksum differences.  >>Defaults to 0 (off). Specifying it disables --progress. Given once, pt-table-checksum prints only errors, warnings, and tables whose data differs. Given twice, it prints only errors, and the exit status tells you whether the master and replicas are consistent.


--recurse
type: int
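Number of levels to recurse in the hierarchy when discovering replicas. Default is infinite. See also --recursion-method and “REPLICA CHECKS”.  >>How many levels of the replication hierarchy to search when discovering replicas. The default is unlimited, so cascaded replicas are discovered as well.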


--recursion-method
type: array; default: processlist,hosts
Preferred recursion method for discovering replicas. pt-table-checksum performs several “REPLICA CHECKS” before and while running. Although replicas are not required to run pt-table-checksum, the tool cannot detect diffs on slaves that it cannot discover. Therefore, a warning is printed and the “EXIT STATUS” is non-zero if no replicas are found and the method is not none. If this happens, try a different recursion method, or use the dsn method to specify the replicas to check.
Possible methods are:  >>The method pt-table-checksum uses to discover replicas. Before it starts checksumming tables, the tool runs several checks against the replicas it has found (see “REPLICA CHECKS”). pt-table-checksum can run without any replicas, but it cannot detect differences on a replica that it has not discovered. So if --recursion-method is not none and no replica is found, the tool prints a warning and exits with a non-zero status. In that case, try a different --recursion-method value, or list the replicas directly with dsn. The possible values are:

METHOD       USES
===========  ==============================================
processlist  SHOW PROCESSLIST                                >>find replicas with SHOW PROCESSLIST
hosts        SHOW SLAVE HOSTS                                >>find replicas with SHOW SLAVE HOSTS
cluster      SHOW STATUS LIKE 'wsrep\_incoming\_addresses'   >>find cluster nodes with SHOW STATUS LIKE
dsn=DSN      DSNs from a table                               >>replicas are listed in a DSN table
none         Do not find slaves                              >>do not look for replicas
The processlist method is the default, because SHOW SLAVE HOSTS is not reliable. However, if the server uses a non-standard port (not 3306), then the hosts method becomes the default because it works better in this case. The hosts method requires replicas to be configured with report_host, report_port, etc. The cluster method requires a cluster based on Galera 23.7.3 or newer, such as Percona XtraDB Cluster versions 5.5.29 and above. This will auto-discover nodes in a cluster using SHOW STATUS LIKE 'wsrep\_incoming\_addresses'. You can combine cluster with processlist and hosts to auto-discover cluster nodes and replicas, but this functionality is experimental. The dsn method is special: rather than automatically discovering replicas, this method specifies a table with replica DSNs. The tool will only connect to these replicas. This method works best when replicas do not use the same MySQL username or password as the master, or when you want to prevent the tool from connecting to certain replicas. The dsn method is specified like: --recursion-method dsn=h=host,D=percona,t=dsns. The specified DSN must have D and t parts, or just a database-qualified t part, which specify the DSN table. The DSN table must have the following structure:  >>SHOW SLAVE HOSTS is not reliable, so replicas are discovered through the processlist by default. If the MySQL server uses a non-default port (not 3306), hosts becomes the default because it is more reliable in that case. With --recursion-method=hosts, the replicas must set report_host and report_port; otherwise SHOW SLAVE HOSTS does not show their addresses (also keep in mind the problems report_host and report_port can cause). The cluster method is not covered in this note. The dsn method names the replicas directly instead of letting pt-table-checksum discover them: you point the tool at a table holding the replica DSNs, and it connects only to those replicas. Use dsn when the replicas need a different user name or password than the master, or to keep the tool from connecting to certain replicas. It is specified like this: --recursion-method dsn=h=host,D=percona,t=dsns. The DSN must contain both D and t, or a database-qualified t such as t=percona.dsns. The DSN table has this structure:
CREATE TABLE `dsns` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `parent_id` int(11) DEFAULT NULL,
  `dsn` varchar(255) NOT NULL,
  PRIMARY KEY (`id`)
);
DSNs are ordered by id, but id and parent_id are otherwise ignored. The dsn column contains a replica DSN like it would be given on the command line, for example:
"h=replica_host,u=repl_user,p=repl_pass". The none method makes the tool ignore all slaves and cluster nodes. This method is not recommended because it effectively disables the “REPLICA CHECKS” and no differences can be found. It is useful, however, if you only need to write checksums on the master or a single cluster node. The safer alternative is --no-replicate-check: the tool finds replicas and cluster nodes, performs the “REPLICA CHECKS”, but does not check for differences. See --[no]replicate-check.  >>The id and parent_id columns can be ignored; the dsn column holds the connection information for a replica, for example "h=replica_host,u=repl_user,p=repl_pass,P=3307". With none, all replicas and cluster nodes are ignored, which effectively disables the “REPLICA CHECKS” and means no differences can be found, so none is not recommended.


--replicate
type: string; default: percona.checksums
Write checksum results to this table. The replicate table must have this structure (MAGIC_create_replicate):  >>The table in which checksum results are recorded; the default is percona.checksums. Its structure is:
CREATE TABLE checksums (
  db             CHAR(64)     NOT NULL,
  tbl            CHAR(64)     NOT NULL,
  chunk          INT          NOT NULL,
  chunk_time     FLOAT            NULL,
  chunk_index    VARCHAR(200)     NULL,
  lower_boundary TEXT             NULL,
  upper_boundary TEXT             NULL,
  this_crc       CHAR(40)     NOT NULL,
  this_cnt       INT          NOT NULL,
  master_crc     CHAR(40)         NULL,
  master_cnt     INT              NULL,
  ts             TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (db, tbl, chunk),
  INDEX ts_db_tbl (ts, db, tbl)
) ENGINE=InnoDB;
Note: lower_boundary and upper_boundary data type can be BLOB. See --binary-index.  >>Note that when --binary-index is specified, the data type of lower_boundary and upper_boundary becomes BLOB.
By default, --[no]create-replicate-table is true, so the database and the table specified by this option are created automatically if they do not exist. Be sure to choose an appropriate storage engine for the replicate table. If you are checksumming InnoDB tables, and you use MyISAM for this table, a deadlock will break replication, because the mixture of transactional and non-transactional tables in the checksum statements will cause it to be written to the binlog even though it had an error. It will then replay without a deadlock on the replicas, and break replication with “different error on master and slave.” This is not a problem with pt-table-checksum; it’s a problem with MySQL replication, and you can read more about it in the MySQL manual. The replicate table is never checksummed (the tool automatically adds this table to --ignore-tables).  >>--[no]create-replicate-table defaults to true, so the --replicate database and table (percona.checksums by default) are created automatically if they do not exist. Choose a suitable storage engine for the checksums table. If you checksum InnoDB tables but store the checksums in a MyISAM table, a deadlock on the master will break replication: because the checksum statement mixes transactional and non-transactional tables, it is written to the binlog even when it fails (here, with a deadlock error). The replicas then replay the statement without a deadlock, and replication stops with “different error on master and slave.”


--[no]replicate-check
default: yes
Check replicas for data differences after finishing each table. The tool finds differences by executing a simple SELECT statement on all detected replicas. The query compares the replica’s checksum results to the master’s checksum results. It reports differences in the DIFFS column of the output.

>>Defaults to yes: after the checksum of each table finishes, the tool checks whether the table is consistent between master and replicas. pt-table-checksum queries each replica and compares the checksum values recorded there for the table with the master's values; if they differ, the table data differs.
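>>The comparison can be sketched in Python over rows read from the checksums table on a replica. This is an illustration only: the function name find_diffs is hypothetical, the rows are plain dicts using the column names from the --replicate table structure, and the actual SELECT statement the tool runs is not shown in this section.

```python
def find_diffs(rows):
    """Return (db, tbl, chunk) for every chunk whose replica result differs.

    rows -- dicts with the checksums-table columns this_crc, this_cnt,
            master_crc, master_cnt, as stored on a replica
    """
    diffs = []
    for r in rows:
        if (r["master_cnt"] != r["this_cnt"]
                or r["master_crc"] != r["this_crc"]):
            diffs.append((r["db"], r["tbl"], r["chunk"]))
    return diffs


rows = [
    {"db": "shop", "tbl": "orders", "chunk": 1,
     "this_crc": "a1b2", "this_cnt": 1000, "master_crc": "a1b2", "master_cnt": 1000},
    {"db": "shop", "tbl": "orders", "chunk": 2,
     "this_crc": "ffff", "this_cnt": 998, "master_crc": "c3d4", "master_cnt": 1000},
]
print(find_diffs(rows))  # only chunk 2 differs
```

On a replica, this_crc and this_cnt hold the values the replica computed itself, while master_crc and master_cnt hold the values replicated from the master, so a mismatch within one row identifies a differing chunk.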


--replicate-check-only
Check replicas for consistency without executing checksum queries. This option is used only with --[no]replicate-check. If specified, pt-table-checksum doesn’t checksum any tables. It checks replicas for differences found by previous checksumming, and then exits. It might be useful if you run pt-table-checksum quietly in a cron job, for example, and later want a report on the results of the cron job, perhaps to implement a Nagios check.

>>With this option the tool only inspects the data from the previous checksum run and does not checksum any tables. If pt-table-checksum runs from a scheduled job, you can run it again afterwards with --replicate-check-only to report on the data that job produced.


--replicate-check-retries
type: int; default: 1
Retry checksum comparison this many times when a difference is encountered. Only when a difference persists after this number of checks is it considered valid. Using this option with a value of 2 or more alleviates spurious differences that arise when using the --resume option.

>>When a chunk or table appears to differ between master and replica, the comparison is repeated (the number of times is set by --replicate-check-retries). Only if the difference persists after the repeated checks is the chunk or table considered inconsistent. When continuing an interrupted run with --resume, set --replicate-check-retries to 2 or more to avoid inaccurate results.


--replicate-database
type: string
USE only this database. By default, pt-table-checksum executes USE to select the database that contains the table it’s currently working on. This is a best effort to avoid problems with replication filters such as binlog_ignore_db and replicate_ignore_db. However, replication filters can create a situation where there simply is no one right way to do things. Some statements might not be replicated, and others might cause replication to fail. In such cases, you can use this option to specify a default database that pt-table-checksum selects with USE, and never changes. See also --[no]check-replication-filters.

>>By default pt-table-checksum issues USE to switch to the database containing the table before each EXPLAIN or checksum query. Replication filters such as *_do_db and *_ignore_db are known to cause problems with this. With --replicate-database, pt-table-checksum always selects the given database with USE for the EXPLAIN and checksum queries, which avoids those filter problems.


--resume
Resume checksumming from the last completed chunk (disables --[no]empty-replicate-table). If the tool stops before it checksums all tables, this option makes checksumming resume from the last chunk of the last table that it finished.

>>Continues after the last chunk whose checksum completed, processing the remaining chunks and tables (this option disables --empty-replicate-table). If pt-table-checksum stopped before checking every table, this option lets it pick up where it left off.


--retries
type: int; default: 2
Retry a chunk this many times when there is a nonfatal error. Nonfatal errors are problems such as a lock wait timeout or the query being killed.

>>Number of times pt-table-checksum retries a chunk after a nonfatal error; the default is 2. A lock wait timeout or a killed query are examples of nonfatal errors.

--run-time

type: time
How long to run. Default is to run until all tables have been checksummed. These time value suffixes are allowed: s (seconds), m (minutes), h (hours), and d (days). Combine this option with --resume to checksum as many tables within an allotted time, resuming from where the tool left off next time it is run.

>>How long pt-table-checksum runs; the unit can be s, m, h, or d. By default the tool runs until all tables have been checked. When there is a very large amount of data to check, combine --run-time with --resume to check some of the tables in one session and the rest in the next.
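>>A small Python sketch of converting such time values to seconds. The helper parse_time is hypothetical, and this sketch requires an explicit suffix; how the tool treats a bare number is not covered here.

```python
import re

SECONDS_PER_UNIT = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_time(value):
    """Convert a value such as '30m' or '2h' to a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", value)
    if not match:
        raise ValueError(f"invalid time value: {value!r}")
    number, unit = match.groups()
    return int(number) * SECONDS_PER_UNIT[unit]


for value in ("45s", "30m", "2h", "1d"):
    print(value, "=", parse_time(value), "seconds")
```

A nightly job limited to two hours would pass the value 2h, and the next run with --resume would continue from the last completed chunk.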


--separator
type: string; default: #
The separator character used for CONCAT_WS(). This character is used to join the values of columns when checksumming.

>>The separator for CONCAT_WS(). pt-table-checksum joins all column values of a row with CONCAT_WS() and then computes the checksum.
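>>Why a separator is needed can be seen in a short Python sketch. It imitates CONCAT_WS() by joining values and skipping NULLs, then hashes the result with CRC32. The helper names are hypothetical, and the real checksum query is more involved than this (its exact form is not shown in this section).

```python
import zlib


def concat_ws(separator, values):
    """Imitate CONCAT_WS(): join values with a separator, skipping NULLs (None)."""
    return separator.join(str(v) for v in values if v is not None)


def row_checksum(values, separator="#"):
    """CRC32 of the joined column values, as an unsigned 32-bit integer."""
    return zlib.crc32(concat_ws(separator, values).encode("utf-8"))


# Without a separator, ('ab', 'c') and ('a', 'bc') would both join to 'abc'.
print(concat_ws("#", ["ab", "c"]), concat_ws("#", ["a", "bc"]))
print(row_checksum(["ab", "c"]) != row_checksum(["a", "bc"]))
```

If the data itself frequently contains the separator character, choosing a different --separator reduces the chance that two different rows join to the same string.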


--set-vars
type: Array; group: Connection
Set the MySQL variables in this comma-separated list of variable=value pairs. By default, the tool sets:
wait_timeout=10000
innodb_lock_wait_timeout=1
Variables specified on the command line override these defaults. For example, specifying --set-vars wait_timeout=500 overrides the default value of 10000. The tool prints a warning and continues if a variable cannot be set.

>>Sets session-level variables for pt-table-checksum. By default the tool sets wait_timeout=10000 and innodb_lock_wait_timeout=1. If a variable cannot be set, pt-table-checksum prints a warning and continues.
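>>How command-line values override the defaults can be sketched in Python. The helper names are hypothetical, and the SET SESSION statement format is an assumption of this sketch rather than the exact statement the tool issues.

```python
DEFAULT_VARS = {"wait_timeout": "10000", "innodb_lock_wait_timeout": "1"}


def effective_vars(set_vars=""):
    """Merge a --set-vars style string over the documented defaults."""
    merged = dict(DEFAULT_VARS)
    for pair in filter(None, (p.strip() for p in set_vars.split(","))):
        name, value = pair.split("=", 1)
        merged[name.strip()] = value.strip()
    return merged


def set_statements(variables):
    """Build one SET SESSION statement per variable (assumed syntax)."""
    return [f"SET SESSION {name}={value}" for name, value in variables.items()]


for statement in set_statements(effective_vars("wait_timeout=500")):
    print(statement)
```

Only the variables named on the command line change; innodb_lock_wait_timeout keeps its default of 1 in this example.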


--socket
short form: -S; type: string; group: Connection
Socket file to use for connection.

>>Socket file of the MySQL instance to connect to.


--tables
short form: -t; type: hash; group: Filter
Checksum only this comma-separated list of tables. Table names may be qualified with the database name.

>>Check only the tables given by --tables; separate several tables with commas. A table name can be qualified with its database name, for example database1.table1.


--tables-regex
type: string; group: Filter
Checksum only tables whose names match this Perl regex.

>>Same as --tables, except that table names are matched with a regular expression.


--trim
Add TRIM() to VARCHAR columns (helps when comparing 4.1 to >= 5.0). This is useful when you don’t care about the trailing space differences between MySQL versions that vary in their handling of trailing spaces. MySQL 5.0 and later all retain trailing spaces in VARCHAR, while previous versions would remove them. These differences will cause false checksum differences.

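>>With this option, VARCHAR values are passed through TRIM() before comparison (useful when comparing 4.1 with 5.0 or later). MySQL 5.0 and later keep trailing spaces in VARCHAR, while earlier versions remove them. When comparing a 4.1 server with a 5.0 or later server, specify --trim if you do not regard trailing spaces as a real difference.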


--user
short form: -u; type: string; group: Connection
User for login if not current user.

>>User name for connecting to the database instance.


--version
group: Help
Show version and exit.

>>Prints the tool version and exits.


--[no]version-check
default: yes
Check for the latest version of Percona Toolkit, MySQL, and other programs. This is a standard “check for updates automatically” feature, with two additional features. First, the tool checks the version of other programs on the local system in addition to its own version. For example, it checks the version of every MySQL server it connects to, Perl, and the Perl module DBD::mysql. Second, it checks for and warns about versions with known problems. For example, MySQL 5.5.25 had a critical bug and was re-released as 5.5.25a. Any updates or known problems are printed to STDOUT before the tool’s normal output. This feature should never interfere with the normal operation of the tool. For more information, visit https://www.percona.com/version-check.



--where
type: string
Do only rows matching this WHERE clause. You can use this option to limit the checksum to only part of the table. This is particularly useful if you have append-only tables and don’t want to constantly re-check all rows; you could run a daily job to just check yesterday’s rows, for instance. This option is much like the -w option to mysqldump. Do not specify the WHERE keyword. You might need to quote the value. Here is an example:
pt-table-checksum --where "ts > CURRENT_DATE - INTERVAL 1 DAY"

>>Check only the rows that match the given WHERE condition. The condition may need to be quoted on the command line.


2.27.10 REPLICA CHECKS
By default, pt-table-checksum attempts to find and connect to all replicas connected to the master host. This automated process is called “slave recursion” and is controlled by the --recursion-method and --recurse options. The tool performs these checks on all replicas:


1. --[no]check-replication-filters
pt-table-checksum checks for replication filters on all replicas because they can complicate or break the checksum process. By default, the tool will exit if any replication filters are found, but this check can be disabled by specifying --no-check-replication-filters.  >>Before checksumming, pt-table-checksum checks every replica for replication filters; by default it exits if any replica has one. Specify --no-check-replication-filters to skip the check.
2. --replicate table
pt-table-checksum checks that the --replicate table exists on all replicas, else checksumming can break replication when updates to the table on the master replicate to a replica that doesn’t have the table. This check cannot be disabled, and the tool waits forever until the table exists on all replicas, printing --progress messages while it waits.  >>Before checksumming, pt-table-checksum checks that the checksums table exists on every replica (it is created automatically on the master if missing). If the master writes checksum rows to a table that a replica lacks, replication on that replica breaks. This check cannot be disabled; pt-table-checksum waits until the table exists on all replicas, printing progress reports while it waits.
3. Single chunk size
If a table can be checksummed in a single chunk on the master, pt-table-checksum will check that the table size on all replicas is approximately the same. This prevents a rare problem where the table on the master is empty or small, but on a replica it is much larger. In this case, the single chunk checksum on the master would overload the replica. This check cannot be disabled.  >>If a table (a small one) is checksummed as a single chunk on the master, pt-table-checksum checks that the table is about the same size on every replica. A table can be small or empty on the master but very large on a replica; the master would checksum it as one chunk, the replica would replay that statement from the binlog as one chunk over its much larger table, and the replica could be overloaded. This check cannot be disabled either.
4. Lag
After each chunk, pt-table-checksum checks the lag on all replicas, or only the replica specified by --check-slave-lag. This helps the tool not to overload the replicas with checksum data. There is no way to disable this check, but you can specify a single replica to check with --check-slave-lag, and if that replica is the fastest, it will help prevent the tool from waiting too long for replica lag to abate.  >>After each chunk, pt-table-checksum checks the lag of all replicas and pauses if any of them lags by more than --max-lag. With --check-slave-lag, only the lag of the given replica is checked. This check cannot be disabled.
5. Checksum chunks
When pt-table-checksum finishes checksumming a table, it waits for the last checksum chunk to replicate to all replicas so it can perform the --[no]replicate-check. Disabling that option by specifying --no-replicate-check disables this check, but it also disables immediate reporting of checksum differences, thereby requiring a second run of the tool with --replicate-check-only to find and print checksum differences.  >>After finishing the checksum of a table, pt-table-checksum waits for the checksum of that table to complete on all replicas and then compares the master and replica results. With --no-replicate-check the tables are only checksummed and not compared immediately; run the tool again with --replicate-check-only afterwards to see the results.


2.27.11 PLUGIN
The file specified by --plugin must define a class (i.e. a package) called pt_table_checksum_plugin with a new() subroutine. The tool will create an instance of this class and call any hooks that it defines. No hooks are required, but a plugin isn’t very useful without them.
These hooks, in this order, are called if defined:
init
before_replicate_check
after_replicate_check
get_slave_lag
before_checksum_table
after_checksum_table
Each hook is passed different arguments. To see which arguments are passed to a hook, search for the hook’s name in the tool’s source code, like:
# --plugin hook
if ( $plugin && $plugin->can('init') ) {
   $plugin->init(
      slaves         => $slaves,
      slave_lag_cxns => $slave_lag_cxns,
      repl_table     => $repl_table,
   );
}
The comment # --plugin hook precedes every hook call.
Please contact Percona if you have questions or need help.


2.27.12 DSN OPTIONS
These DSN options are used to create a DSN. Each option is given like option=value. The options are case-sensitive, so P and p are not the same option. There cannot be whitespace before or after the = and if the value
contains whitespace it must be quoted. DSN options are comma-separated. See the percona-toolkit manpage for full
details.
• A
dsn: charset; copy: yes
Default character set.
• D
copy: no
DSN table database.
• F
dsn: mysql_read_default_file; copy: yes
Defaults file for connection values.
• h
dsn: host; copy: yes
Connect to host.
• p
dsn: password; copy: yes
Password to use when connecting. If password contains commas they must be escaped with a backslash:
“exam\,ple”
• P
dsn: port; copy: yes
Port number to use for connection.
• S
dsn: mysql_socket; copy: no
Socket file to use for connection.
• t
copy: no
DSN table table.
• u
dsn: user; copy: yes
User for login if not current user.
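
The DSN rules above (option=value pairs, comma-separated, commas inside a password escaped with a backslash) can be sketched with two small shell helpers. The helper names dsn_escape and make_dsn, the host address, and the user name are hypothetical and exist only for this illustration; the sketch does not handle values that contain whitespace, which would additionally need quoting.

```shell
#!/bin/sh
# Escape every comma in a DSN value with a backslash.
dsn_escape() {
    printf '%s' "$1" | sed 's/,/\\,/g'
}

# Join option=value pairs into one comma-separated DSN string.
make_dsn() {
    dsn=""
    for pair in "$@"; do
        if [ -z "$dsn" ]; then
            dsn="$pair"
        else
            dsn="$dsn,$pair"
        fi
    done
    printf '%s\n' "$dsn"
}

# Example password taken from the text above; it contains a comma.
password='exam,ple'
make_dsn h=192.0.2.10 P=3306 u=checksum_user "p=$(dsn_escape "$password")"
# prints: h=192.0.2.10,P=3306,u=checksum_user,p=exam\,ple
```

Note that P (port) and p (password) appear side by side here, which is exactly the case-sensitivity the section warns about.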


2.27.13 ENVIRONMENT
The environment variable PTDEBUG enables verbose debugging output to STDERR. To enable debugging and capture
all output to a file, run the tool like:
PTDEBUG=1 pt-table-checksum ... > FILE 2>&1
Be careful: debugging output is voluminous and can generate several megabytes of output.


2.27.14 SYSTEM REQUIREMENTS
You need Perl, DBI, DBD::mysql, and some core packages that ought to be installed in any reasonably new version of
Perl.


2.27.15 BUGS
For a list of known bugs, see http://www.percona.com/bugs/pt-table-checksum.
Please report bugs at https://bugs.launchpad.net/percona-toolkit. Include the following information in your bug report:
• Complete command-line used to run the tool
• Tool --version
• MySQL version of all servers involved
• Output from the tool including STDERR
• Input files (log/dump/config files, etc.)
If possible, include debugging output by running the tool with PTDEBUG; see “ENVIRONMENT”.


2.27.16 DOWNLOADING
Visit http://www.percona.com/software/percona-toolkit/ to download the latest release of Percona Toolkit. Or, get the
latest release from the command line:
wget percona.com/get/percona-toolkit.tar.gz
wget percona.com/get/percona-toolkit.rpm
wget percona.com/get/percona-toolkit.deb
You can also get individual tools from the latest release:
wget percona.com/get/TOOL
Replace TOOL with the name of any tool.


2.27.17 AUTHORS
Baron Schwartz and Daniel Nichter


2.27.18 ACKNOWLEDGMENTS
Claus Jeppesen, Francois Saint-Jacques, Giuseppe Maxia, Heikki Tuuri, James Briggs, Martin Friebe, and Sergey
Zhuravlev


2.27.19 ABOUT PERCONA TOOLKIT
This tool is part of Percona Toolkit, a collection of advanced command-line tools for MySQL developed by Percona.
Percona Toolkit was forked from two projects in June, 2011: Maatkit and Aspersa. Those projects were created by
Baron Schwartz and primarily developed by him and Daniel Nichter. Visit http://www.percona.com/software/ to learn
about other free, open-source software from Percona.


2.27.20 COPYRIGHT, LICENSE, AND WARRANTY
This program is copyright 2011-2015 Percona LLC and/or its affiliates, 2007-2011 Baron Schwartz.
THIS PROGRAM IS PROVIDED “AS IS” AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
FOR A PARTICULAR PURPOSE.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, version 2; OR the Perl Artistic License. On UNIX and similar
systems, you can issue ‘man perlgpl’ or ‘man perlartistic’ to read these licenses.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free
Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.


2.27.21 VERSION
pt-table-checksum 2.2.15



Related links:

pt-table-sync manual with Chinese annotations
http://blog.csdn.net/shaochenshuo/article/details/53285439


How to use pt-table-sync
http://blog.csdn.net/shaochenshuo/article/details/56009234


How to use pt-table-checksum
http://blog.csdn.net/shaochenshuo/article/details/56009092
