I am using MySQL version 5.5.14 to run the following query from a table of 5 Million rows:
SELECT P.ID, P.Type, P.Name, P.cty
, X(P.latlng) as 'lat', Y(P.latlng) as 'lng'
, P.cur, P.ak, P.tn, P.St, P.Tm, P.flA, P.ldA, P.flN
, P.lv, P.bd, P.bt, P.nb
, P.ak * E.usD as 'usP'
FROM PIG P
INNER JOIN EEL E
ON E.cur = P.cur
WHERE act='1'
AND flA >= '1615'
AND ldA >= '0'
AND yr >= (YEAR(NOW()) - 100)
AND lv >= '0'
AND bd >= '3'
AND bt >= '2'
AND nb <= '5'
AND cDate >= NOW()
AND MBRContains(LineString( Point(-65.6583, -87.8906)
, Point(65.6583, 87.8906)
), latlng)
AND Type = 'g'
AND tn = 'l'
AND St + Tm - YEAR(NOW()) >= '30'
HAVING usP BETWEEN 300/2 AND 300 LIMIT 100;
The table definitions are:
CREATE TABLE `PIG` (
`ID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`Email` char(50) NOT NULL,
`Type` char(1) NOT NULL,
`Name` char(25) DEFAULT NULL,
`cty` char(2) DEFAULT NULL,
`latlng` point NOT NULL,
`tn` char(1) NOT NULL DEFAULT 'l',
`St` smallint(4) unsigned NOT NULL DEFAULT '0',
`Tm` smallint(3) unsigned NOT NULL DEFAULT '0',
`yr` smallint(4) unsigned NOT NULL DEFAULT '0',
`flA` mediumint(6) unsigned NOT NULL DEFAULT '0',
`ldA` mediumint(6) unsigned NOT NULL DEFAULT '0',
`flN` smallint(3) unsigned NOT NULL DEFAULT '1',
`lv` smallint(3) unsigned NOT NULL DEFAULT '0',
`bd` tinyint(2) unsigned NOT NULL DEFAULT '0',
`bt` tinyint(2) unsigned NOT NULL DEFAULT '0',
`nb` tinyint(1) unsigned NOT NULL DEFAULT '9',
`cur` char(3) DEFAULT NULL,
`ak` int(10) unsigned NOT NULL DEFAULT '0',
`Des` tinytext,
`pDate` datetime DEFAULT NULL,
`cDate` date DEFAULT NULL,
`act` tinyint(1) unsigned NOT NULL DEFAULT '0',
`bid` tinyint(3) unsigned NOT NULL DEFAULT '0',
`ab` tinyint(3) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`ID`),
KEY `id_ca` (`cty`,`ak`),
SPATIAL KEY `id_latlng` (`latlng`)
) ENGINE=MyISAM AUTO_INCREMENT=5000001 DEFAULT CHARSET=latin1
And:
CREATE TABLE `EEL` (
`cur` char(3) NOT NULL,
`usD` decimal(11,10) NOT NULL,
PRIMARY KEY (`cur`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
The following shows the query execution plan:
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: P
type: range
possible_keys: id_latlng
key: id_latlng
key_len: 34
ref: NULL
rows: 742873
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: E
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 3
ref: BS.P.cur
rows: 1
Extra:
This query does not use query cache due to the presence of NOW() function. From my previous posting, I discovered that other forms of cache exist to speed up the query from initial time of 300s down to less than 2s. My question is: "how does one improve the above query time, knowing that the cache won't be of much use since the search criteria for latlng is constantly changing?" Note that a spatial index on latlng has already been built for optimisation purpose.
Cheers, Ben
解决方案
Good indexes are the ones with high selectivity. Your conditions are mostly range conditions and this poses a limit on the fields that can be used in a composite index.
Possible indexes to investigate (composed from those fields that have an equality check with the addition in the end, of one field with a range check):
(act, Type, tn, flA)
(act, Type, tn, cDate)
(act, Type, tn, nb)
To check selectivity without creating indexes, you could use:
SELECT COUNT(*)
FROM PIG P
WHERE act='1'
AND Type = 'g'
AND tn = 'l'
AND flA >= '1615'
and
SELECT COUNT(*)
FROM PIG P
WHERE act='1'
AND Type = 'g'
AND tn = 'l'
AND cDate >= NOW()
and
SELECT COUNT(*)
FROM PIG P
WHERE act='1'
AND Type = 'g'
AND tn = 'l'
AND nb <= '5'
and compare the output with the 742873 you have from the spatial index.