sqlite 复制表速度慢,SQLite：COUNT在大表上慢

最新推荐文章于 2023-03-20 08:28:50 发布

Zhaoyang Wang

最新推荐文章于 2023-03-20 08:28:50 发布

阅读量268

点赞数

文章标签： sqlite 复制表速度慢

I'm having a performance problem in SQLite with a SELECT COUNT(*) on a large tables.

As I didn't yet receive a usable answer and I did some further testing, I edited my question to incorporate my new findings.

I have 2 tables:

CREATE TABLE Table1 (

Key INTEGER NOT NULL,

... several other fields ...,

Status CHAR(1) NOT NULL,

Selection VARCHAR NULL,

CONSTRAINT PK_Table1 PRIMARY KEY (Key ASC))

CREATE Table2 (

Key INTEGER NOT NULL,

Key2 INTEGER NOT NULL,

... a few other fields ...,

CONSTRAINT PK_Table2 PRIMARY KEY (Key ASC, Key2 ASC))

Table1 has around 8 million records and Table2 has around 51 million records, and the databasefile is over 5GB.

Table1 has 2 more indexes:

CREATE INDEX IDX_Table1_Status ON Table1 (Status ASC, Key ASC)

CREATE INDEX IDX_Table1_Selection ON Table1 (Selection ASC, Key ASC)

"Status" is required field, but has only 6 distinct values, "Selection" is not required and has only around 1.5 million values different from null and only around 600k distinct values.

I did some tests on both tables, you can see the timings below, and I added the "explain query plan" for each request (QP). I placed the database file on an USB-memorystick so i could remove it after each test and get reliable results without interference of the disk cache. Some requests are faster on USB (I suppose due to lack of seektime), but some are slower (table scans).

SELECT COUNT(*) FROM Table1

Time: 105 sec

QP: SCAN TABLE Table1 USING COVERING INDEX IDX_Table1_Selection(~1000000 rows)

SELECT COUNT(Key) FROM Table1

Time: 153 sec

QP: SCAN TABLE Table1 (~1000000 rows)

SELECT * FROM Table1 WHERE Key = 5123456

Time: 5 ms

QP: SEARCH TABLE Table1 USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)

SELECT * FROM Table1 WHERE Status = 73 AND Key > 5123456 LIMIT 1

Time: 16 sec

QP: SEARCH TABLE Table1 USING INDEX IDX_Table1_Status (Status=?) (~3 rows)

SELECT * FROM Table1 WHERE Selection = 'SomeValue' AND Key > 5123456 LIMIT 1

Time: 9 ms

QP: SEARCH TABLE Table1 USING INDEX IDX_Table1_Selection (Selection=?) (~3 rows)

As you can see the counts are very slow, but normal selects are fast (except for the 2nd one, which took 16 seconds).

The same goes for Table2:

SELECT COUNT(*) FROM Table2

Time: 528 sec

QP: SCAN TABLE Table2 USING COVERING INDEX sqlite_autoindex_Table2_1(~1000000 rows)

SELECT COUNT(Key) FROM Table2

Time: 249 sec

QP: SCAN TABLE Table2 (~1000000 rows)

SELECT * FROM Table2 WHERE Key = 5123456 AND Key2 = 0

Time: 7 ms

QP: SEARCH TABLE Table2 USING INDEX sqlite_autoindex_Table2_1 (Key=? AND Key2=?) (~1 rows)

Why is SQLite not using the automatically created index on the primary key on Table1 ?

And why, when he uses the auto-index on Table2, it still takes a lot of time ?

I created the same tables with the same content and indexes on SQL Server 2008 R2 and there the counts are nearly instantaneous.

One of the comments below suggested executing ANALYZE on the database. I did and it took 11 minutes to complete.

After that, I ran some of the tests again:

SELECT COUNT(*) FROM Table1

Time: 104 sec

QP: SCAN TABLE Table1 USING COVERING INDEX IDX_Table1_Selection(~7848023 rows)

SELECT COUNT(Key) FROM Table1

Time: 151 sec

QP: SCAN TABLE Table1 (~7848023 rows)

SELECT * FROM Table1 WHERE Status = 73 AND Key > 5123456 LIMIT 1

Time: 5 ms

QP: SEARCH TABLE Table1 USING INTEGER PRIMARY KEY (rowid>?) (~196200 rows)

SELECT COUNT(*) FROM Table2

Time: 529 sec

QP: SCAN TABLE Table2 USING COVERING INDEX sqlite_autoindex_Table2_1(~51152542 rows)

SELECT COUNT(Key) FROM Table2

Time: 249 sec

QP: SCAN TABLE Table2 (~51152542 rows)

As you can see, the queries took the same time (except the query plan is now showing the real number of rows), only the slower select is now also fast.

Next, I create dan extra index on the Key field of Table1, which should correspond to the auto-index. I did this on the original database, without the ANALYZE data. It took over 23 minutes to create this index (remember, this is on an USB-stick).

CREATE INDEX IDX_Table1_Key ON Table1 (Key ASC)

Then I ran the tests again:

SELECT COUNT(*) FROM Table1

Time: 4 sec

QP: SCAN TABLE Table1 USING COVERING INDEX IDX_Table1_Key(~1000000 rows)

SELECT COUNT(Key) FROM Table1

Time: 167 sec

QP: SCAN TABLE Table2 (~1000000 rows)

SELECT * FROM Table1 WHERE Status = 73 AND Key > 5123456 LIMIT 1

Time: 17 sec

QP: SEARCH TABLE Table1 USING INDEX IDX_Table1_Status (Status=?) (~3 rows)

As you can see, the index helped with the count(*), but not with the count(Key).

Finaly, I created the table using a column constraint instead of a table constraint:

CREATE TABLE Table1 (

Key INTEGER PRIMARY KEY ASC NOT NULL,

... several other fields ...,

Status CHAR(1) NOT NULL,

Selection VARCHAR NULL)

Then I ran the tests again:

SELECT COUNT(*) FROM Table1

Time: 6 sec

QP: SCAN TABLE Table1 USING COVERING INDEX IDX_Table1_Selection(~1000000 rows)

SELECT COUNT(Key) FROM Table1

Time: 28 sec

QP: SCAN TABLE Table1 (~1000000 rows)

SELECT * FROM Table1 WHERE Status = 73 AND Key > 5123456 LIMIT 1

Time: 10 sec

QP: SEARCH TABLE Table1 USING INDEX IDX_Table1_Status (Status=?) (~3 rows)

Although the query plans are the same, the times are a lot better. Why is this ?

The problem is that ALTER TABLE does not permit to convert an existing table and I have a lot of existing databases which i can not convert to this form. Besides, using a column contraint instead of table constraint won't work for Table2.

Has anyone any idea what I am doing wrong and how to solve this problem ?

I used System.Data.SQLite version 1.0.74.0 to create the tables and to run the tests I used SQLiteSpy 1.9.1.

Thanks,

Marc

解决方案

If you haven't DELETEd any records, doing:

SELECT MAX(_ROWID_) FROM "table" LIMIT 1;

Will avoid the full-table scan. Note that _ROWID_ is a SQLite identifier.