MySQL Cluster 7.3.6 Released

The binary and source versions of MySQL Cluster 7.3.6 have now been made available at http://www.mysql.com/downloads/cluster/.

Release notes

MySQL Cluster NDB 7.3.6 is a new release of MySQL Cluster, based on MySQL Server 5.6 and including features from version 7.3 of the NDB storage engine, as well as fixing a number of recently discovered bugs in previous MySQL Cluster releases.

Obtaining MySQL Cluster NDB 7.3. MySQL Cluster NDB 7.3 source code and binaries can be obtained from http://dev.mysql.com/downloads/cluster/.

For an overview of changes made in MySQL Cluster NDB 7.3, see MySQL Cluster Development in MySQL Cluster NDB 7.3 (http://dev.mysql.com/doc/refman/5.6/en/mysql-cluster-development-5-6-ndb-7-3.html).

This release also incorporates all bugfixes and changes made in previous MySQL Cluster releases, as well as all bugfixes and feature changes which were added in mainline MySQL 5.6 through MySQL 5.6.19 (see Changes in MySQL 5.6.19 (2014-05-30) (http://dev.mysql.com/doc/relnotes/mysql/5.6/en/news-5-6-19.html)).

Functionality Added or Changed

Cluster API: Added, as an aid to debugging, the ability to specify a human-readable name for a given Ndb object and later to retrieve it. These operations are implemented, respectively, as the setNdbObjectName() and getNdbObjectName() methods.

To make tracing of event handling between a user application and NDB easier, you can use the reference (from getReference()) followed by the name (if provided) in printouts; the reference ties together the application Ndb object, the event buffer, and the NDB storage engine's SUMA block. (Bug #18419907)
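
For illustration, here is a minimal sketch of how an NDB API application might use these methods; the connection string, database name, and object name below are placeholders, and error handling is reduced to the bare minimum:

#include <NdbApi.hpp>
#include <cstdio>

int main()
{
    ndb_init();                                    // initialize the NDB API

    Ndb_cluster_connection conn("localhost:1186"); // placeholder connectstring
    if (conn.connect() != 0 || conn.wait_until_ready(30, 0) < 0)
        return 1;

    Ndb ndb(&conn, "test");                        // placeholder database name
    ndb.setNdbObjectName("my-event-consumer");     // human-readable name for this Ndb object
    if (ndb.init() != 0)
        return 1;

    // Combining the reference and the (optional) name in printouts, as the
    // release note suggests, ties application-side traces to NDB-side logging.
    printf("Ndb object 0x%x (%s) ready\n",
           (unsigned) ndb.getReference(), ndb.getNdbObjectName());

    ndb_end(0);
    return 0;
}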

Bugs Fixed

Cluster API: When two tables had different foreign keys with the same name, ndb_restore considered this a name conflict and failed to restore the schema. As a result of this fix, a slash character (/) is now expressly disallowed in foreign key names, and the naming format parent_id/child_id/fk_name is now enforced by the NDB API. (Bug #18824753)

Processing a NODE_FAILREP signal that contained an invalid node ID could cause a data node to fail. (Bug #18993037, Bug #73015)

References: This bug is a regression of Bug #16007980.

When building out of source, some files were written to the source directory instead of the build directory. These included the manifest.mf files used for creating ClusterJ jars and the pom.xml file used by mvn_install_ndbjtie.sh. In addition, ndbinfo.sql was written to the build directory, but marked as output to the source directory in CMakeLists.txt. (Bug #18889568, Bug #72843)

Adding a foreign key failed with NDB Error 208 if the parent index was the parent table's primary key, the primary key was not on the table's initial attributes, and the child table was not empty. (Bug #18825966)

When an NDB table served as both the parent table and a child table for 2 different foreign keys having the same name, dropping the foreign key on the child table could cause the foreign key on the parent table to be dropped instead, leading to a situation in which it was impossible to drop the remaining foreign key. This situation can be modelled using the following CREATE TABLE statements:

CREATE TABLE parent (
    id INT NOT NULL,
    PRIMARY KEY (id)
) ENGINE=NDB;

CREATE TABLE child (
    id INT NOT NULL,
    parent_id INT,
    PRIMARY KEY (id),
    INDEX par_ind (parent_id),
    FOREIGN KEY (parent_id)
        REFERENCES parent(id)
) ENGINE=NDB;

CREATE TABLE grandchild (
    id INT,
    parent_id INT,
    INDEX par_ind (parent_id),
    FOREIGN KEY (parent_id)
        REFERENCES child(id)
) ENGINE=NDB;

With the tables created as just shown, the issue occurred when executing the statement ALTER TABLE child DROP FOREIGN KEY parent_id, because it was possible in some cases for NDB to drop the foreign key from the grandchild table instead. When this happened, any subsequent attempt to drop the foreign key from either the child or the grandchild table failed. (Bug #18662582)

ndbmtd supports multiple parallel receiver threads, each of which performs signal reception for a subset of the remote node connections (transporters), with the mapping of remote nodes to receiver threads decided at node startup. Connection control is managed by the multi-instance TRPMAN block, which is organized as a proxy and workers, and each receiver thread has a TRPMAN worker running locally.

The QMGR block sends signals to TRPMAN to enable and disable communications with remote nodes. These signals are sent to the TRPMAN proxy, which forwards them to the workers. The workers themselves decide whether to act on signals, based on the set of remote nodes they manage.

The current issue arose because the mechanism used by the TRPMAN workers for determining which connections they are responsible for was implemented in such a way that each worker thought it was responsible for all connections. This resulted in the TRPMAN actions for OPEN_COMORD, ENABLE_COMREQ, and CLOSE_COMREQ being processed multiple times.

The fix ensures that TRPMAN instances (receiver threads) execute OPEN_COMORD, ENABLE_COMREQ, and CLOSE_COMREQ requests only for the connections for which they are responsible. In addition, the correct TRPMAN instance is now chosen when routing from this instance for a specific remote connection. (Bug #18518037)

Executing ALTER TABLE … REORGANIZE PARTITION after increasing the number of data nodes in the cluster from 4 to 16 led to a crash of the data nodes. This issue was shown to be a regression caused by a previous fix which added a new dump handler using a dump code that was already in use (7019), which caused the command to execute two different handlers with different semantics. The new handler was assigned a new DUMP code (7024). (Bug #18550318)

References: This bug is a regression of Bug #14220269.

When running with a very slow main thread, and one or more transaction coordinator threads, on different CPUs, it was possible to encounter a timeout when sending a DIH_SCAN_GET_NODESREQ signal, which could lead to a crash of the data node. Now in such cases the timeout is avoided. (Bug #18449222)

During data node failure handling, the transaction coordinator performing takeover gathers all known state information for any failed TC instance transactions, determines whether each transaction has been committed or aborted, and informs any involved API nodes so that they can report this accurately to their clients. The TC instance provides this information by sending TCKEY_FAILREF or TCKEY_FAILCONF signals to the API nodes as appropriate to each affected transaction.

In the event that this TC instance does not have a direct connection to the API node, it attempts to deliver the signal by routing it through another data node in the same node group as the failing TC, and sends a GSN_TCKEY_FAILREFCONF_R signal to TC block instance 0 in that data node. A problem arose in the case of multiple transaction coordinators, when this TC instance did not have a signal handler for such signals, which led it to fail.

This issue has been corrected by adding a handler to the TC proxy block which in such cases forwards the signal to one of the local TC worker instances, which in turn attempts to forward the signal on to the API node. (Bug #18455971)

A local checkpoint (LCP) is tracked using a global LCP state (c_lcpState), and each NDB table has a status indicator which indicates the LCP status of that table (tabLcpStatus). If the global LCP state is LCP_STATUS_IDLE, then all the tables should have an LCP status of TLS_COMPLETED.

When an LCP starts, the global LCP status is LCP_INIT_TABLES and the thread starts setting all the NDB tables to TLS_ACTIVE. If any tables are not ready for LCP, the LCP initialization procedure continues with CONTINUEB signals until all tables have become available and been marked TLS_ACTIVE. When this initialization is complete, the global LCP status is set to LCP_STATUS_ACTIVE.

This bug occurred when the following conditions were met:

- An LCP was in the LCP_INIT_TABLES state, and some but not all tables had been set to TLS_ACTIVE.

- The master node failed before the global LCP state changed to LCP_STATUS_ACTIVE; that is, before the LCP could finish processing all tables.

- The NODE_FAILREP signal resulting from the node failure was processed before the final CONTINUEB signal from the LCP initialization process, so that the node failure was processed while the LCP remained in the LCP_INIT_TABLES state.

Following master node failure and selection of a new one, the new master queries the remaining nodes with a MASTER_LCPREQ signal to determine the state of the LCP. At this point, since the LCP status was LCP_INIT_TABLES, the LCP status was reset to LCP_STATUS_IDLE. However, the LCP status of the tables was not modified, so there remained tables with TLS_ACTIVE.

Afterwards, the failed node is removed from the LCP. If the LCP status of a given table is TLS_ACTIVE, there is a check that the global LCP status is not LCP_STATUS_IDLE; this check failed and caused the data node to fail.

Now the MASTER_LCPREQ handler ensures that the tabLcpStatus for all tables is updated to TLS_COMPLETED when the global LCP status is changed to LCP_STATUS_IDLE. (Bug #18044717)
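
To make the failure scenario easier to follow, here is a purely hypothetical, self-contained model of the invariant described above; the state names (c_lcpState, tabLcpStatus, LCP_STATUS_IDLE, TLS_ACTIVE, and so on) come from the description, but the surrounding structure is illustrative only and is not NDB source code:

#include <cassert>
#include <vector>

// Hypothetical model of the state described above; this is not NDB source code.
enum GlobalLcpState { LCP_STATUS_IDLE, LCP_INIT_TABLES, LCP_STATUS_ACTIVE };
enum TableLcpStatus { TLS_COMPLETED, TLS_ACTIVE };

struct LcpModel {
    GlobalLcpState c_lcpState = LCP_STATUS_IDLE;
    std::vector<TableLcpStatus> tabLcpStatus;    // one entry per NDB table

    // The check described in the note: a table still marked TLS_ACTIVE while
    // the global state is LCP_STATUS_IDLE crashed the data node before the fix.
    void checkOnNodeFailure() const {
        for (TableLcpStatus s : tabLcpStatus)
            if (s == TLS_ACTIVE)
                assert(c_lcpState != LCP_STATUS_IDLE);
    }

    // Behavior after the fix: when the MASTER_LCPREQ handling resets the global
    // state to idle, every table's status is reset as well, so the check holds.
    void handleMasterLcpReqReset() {
        c_lcpState = LCP_STATUS_IDLE;
        for (TableLcpStatus &s : tabLcpStatus)
            s = TLS_COMPLETED;
    }
};

int main() {
    LcpModel m;
    m.c_lcpState = LCP_INIT_TABLES;                  // LCP initialization under way
    m.tabLcpStatus = { TLS_ACTIVE, TLS_COMPLETED };  // only some tables marked active

    m.handleMasterLcpReqReset();  // new master resets the LCP state after the old master fails
    m.checkOnNodeFailure();       // with the fix, the invariant holds and no assertion fires
    return 0;
}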

When performing a copying ALTER TABLE operation, mysqld creates a new copy of the table to be altered. This intermediate table, which is given a name bearing the prefix #sql-, has an updated schema but contains no data. mysqld then copies the data from the original table to this intermediate table, drops the original table, and finally renames the intermediate table with the name of the original table.

mysqld regards such a table as a temporary table and does not include it in the output from SHOW TABLES; mysqldump also ignores an intermediate table. However, NDB sees no difference between such an intermediate table and any other table. This difference in how intermediate tables are viewed by mysqld (and MySQL client programs) and by the NDB storage engine can give rise to problems when performing a backup and restore if an intermediate table existed in NDB, possibly left over from a failed ALTER TABLE that used copying. If a schema backup is performed using mysqldump and the mysql client, this table is not included. However, in the case where a data backup was done using the ndb_mgm client's BACKUP command, the intermediate table was included, and was also included by ndb_restore, which then failed due to attempting to load data into a table which was not defined in the backed up schema.

To prevent such failures from occurring, ndb_restore now by default ignores intermediate tables created during ALTER TABLE operations (that is, tables whose names begin with the prefix #sql-). A new option --exclude-intermediate-sql-tables is added that makes it possible to override the new behavior. The option's default value is TRUE; to cause ndb_restore to revert to the old behavior and to attempt to restore intermediate tables, set this option to FALSE. (Bug #17882305)

The logging of insert failures has been improved. This is intended to help diagnose occasional issues seen when writing to the mysql.ndb_binlog_index table. (Bug #17461625)

The DEFINER column in the INFORMATION_SCHEMA.VIEWS table contained erroneous values for views contained in the ndbinfo information database. This could be seen in the result of a query such as SELECT TABLE_NAME, DEFINER FROM INFORMATION_SCHEMA.VIEWS WHERE TABLE_SCHEMA='ndbinfo'. (Bug #17018500)

Employing a CHAR column that used the UTF8 character set as a table's primary key column led to node failure when restarting data nodes. Attempting to restore a table with such a primary key also caused ndb_restore to fail. (Bug #16895311, Bug #68893)

Disk Data: Setting the undo buffer size used by InitialLogFileGroup to a value greater than that set by SharedGlobalMemory prevented data nodes from starting; the data nodes failed with Error 1504 Out of logbuffer memory. While the failure itself is expected behavior, the error message did not provide sufficient information to diagnose the actual source of the problem; now in such cases, a more specific error message, Out of logbuffer memory (specify smaller undo_buffer_size or increase SharedGlobalMemory), is supplied. (Bug #11762867, Bug #55515)

Cluster Replication: When using NDB$EPOCH_TRANS, conflicts between DELETE operations were handled like conflicts between updates, with the primary rejecting the transaction and dependents, and realigning the secondary. This meant that their behavior with regard to subsequent operations on any affected row or rows depended on whether they were in the same epoch or a different one: within the same epoch, they were considered conflicting events; in different epochs, they were not considered in conflict.

This fix brings the handling of conflicts between deletes by NDB$EPOCH_TRANS into line with that performed when using NDB$EPOCH for conflict detection and resolution, and extends testing with NDB$EPOCH and NDB$EPOCH_TRANS to include "delete-delete" conflicts and to encapsulate the expected result, with transactional conflict handling modified so that a conflict between DELETE operations alone is not sufficient to cause a transaction to be considered in conflict. (Bug #18459944)

Cluster API: When an NDB data node indicates a buffer overflow via an empty epoch, the event buffer places an inconsistent data event in the event queue. When this was consumed, it was not removed from the event queue as expected, causing subsequent nextEvent() calls to return 0. This caused event consumption to stall because the inconsistency remained flagged forever, while event data accumulated in the queue.

Event data belonging to an empty inconsistent epoch can be found either at the beginning or somewhere in the middle of the event queue. pollEvents() returns 0 for the first case. This fix handles the second case: calling nextEvent() now dequeues the inconsistent event before it returns. In order to benefit from this fix, user applications must call nextEvent() even when pollEvents() returns 0. (Bug #18716991)
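
As a rough sketch of the polling pattern this implies, the loop below always falls through to nextEvent() even when pollEvents() reports no new data; the event name and the setup of the event operation are placeholders, and error handling is omitted:

#include <NdbApi.hpp>

// Sketch of an event-consumption loop; the event name "REPL$test/t1" and the
// setup of the event operation are placeholders, not taken from the release note.
void consumeEvents(Ndb *ndb)
{
    NdbEventOperation *op = ndb->createEventOperation("REPL$test/t1");
    op->execute();                        // start receiving event data

    while (true) {
        int res = ndb->pollEvents(1000);  // wait up to one second for event data
        (void) res;

        // Call nextEvent() even when pollEvents() returned 0, so that an empty
        // inconsistent epoch parked in the queue is dequeued and consumption
        // does not stall, as described in the fix above.
        while (NdbEventOperation *ev = ndb->nextEvent()) {
            // Process the event here, for example via ev->getEventType().
            (void) ev;
        }
    }
}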

Cluster API: The pollEvents() method returned 1, even when called with a wait time equal to 0 and there were no events waiting in the queue. Now in such cases it returns 0 as expected. (Bug #18703871)
