正则表达式的使用
正则表达式是指一个用来描述或者匹配一些列符合某个句法规则的字符串的单个字符串。
MySQL中利用REGEXP命令提供给用户扩展的正则表达式功能
正则表达式中的模式
序列 | 序列说明 |
^ | 在字符串的开始处进行匹配 |
$ | 在字符串的末尾出进行匹配 |
. | 匹配任意单个字符,包括换行符 |
[...] | 匹配出括号内的任意字符 |
[^...] | 匹配不出括号内的任意字符 |
a* | 匹配零个或者多个a(包括空串) |
a+ | 匹配1个或者多个a(不包括空串) |
a? | 匹配1个或者0个a |
a1|a2 | 匹配a1或者a2 |
a(m) | 匹配m个a |
a(m,) | 匹配m个或者更多个a |
a(m,n) | 匹配m到n个a |
a(,n) | 匹配0到n个a |
(.....) | 将元素组成单一元素 |
例子:
原表:
mysql> select * from t77;
+------------------+
|name |
+------------------+
|asdfgh |
|aaggsdfasf |
|sdfsadsfassfsf |
|asdfasdfasdfasdf |
|NULL |
|NULL |
+------------------+
6 rowsin set (0.00 sec)
在字符串的开始处进行匹配
mysql>select * from t77 where name regexp'^as';
+------------------+
| name |
+------------------+
| asdfgh |
| asdfasdfasdfasdf |
+------------------+
2 rows in set (0.00 sec)
在字符串的末尾出进行匹配
mysql>select * from t77 where name regexp'sf$';
+----------------+
| name |
+----------------+
| aaggsdfasf |
| sdfsadsfassfsf |
+----------------+
2 rows in set (0.00 sec)
匹配任意单个字符,包括换行符
mysql>select * from t77 where name regexp'.a';
+------------------+
| name |
+------------------+
| aaggsdfasf |
| sdfsadsfassfsf |
| asdfasdfasdfasdf |
+------------------+
3 rows in set (0.00 sec)
mysql>select * from t77 where name regexp 'a';
+------------------+
| name |
+------------------+
| asdfgh |
| aaggsdfasf |
| sdfsadsfassfsf |
| asdfasdfasdfasdf |
+------------------+
4 rows in set (0.00 sec)
可以看出,如果使用.a则a前面必须有至少一个字符才会被匹配。
匹配出括号内的任意字符
mysql>select * from t77 where name regexp'gop';
Empty set (0.00 sec)
mysql>select * from t77 where name regexp'[gop]';
+------------+
| name |
+------------+
| asdfgh |
| aaggsdfasf |
+------------+
2 rows in set (0.00 sec)
可以看出,只要包含[]里的任意一个字符即可被筛选出来。
匹配不出括号内的任意字符
mysql>select * from t77 where name regexp'[^asdfh]';
+------------+
| name |
+------------+
| asdfgh |
| aaggsdfasf |
+------------+
2 rows in set (0.00 sec)
mysql>select * from t77 where name regexp'[^asdfg]';
+--------+
| name |
+--------+
| asdfgh |
+--------+
1 row in set (0.00 sec)
由此可以看出,利用这个正则表达式,能够筛选不出括号的任意字符,与[....]相反,也就是说条件的元素与筛选内容相交,筛选内容必有多余元素,也就是说筛选出来的内容必然不是条件的子集。
匹配零个或者多个a(包括空串)
mysql>select * from t77 where name regexp 'gg*';
+------------+
| name |
+------------+
| asdfgh |
| aaggsdfasf |
+------------+
2 rows in set (0.00 sec)
匹配1个或者多个a(不包括空串)
mysql>select * from t77 where name regexp 'gg+';
+------------+
| name |
+------------+
| aaggsdfasf |
+------------+
1 row in set (0.00 sec)
匹配1个或者0个a
mysql>select * from t77 where name regexp 'gg?';
+------------+
| name |
+------------+
| asdfgh |
| aaggsdfasf |
+------------+
2 rows in set (0.00 sec)
匹配a1或者a2
mysql>select * from t77 where name regexp 'ss|gh';
+----------------+
| name |
+----------------+
| asdfgh |
| sdfsadsfassfsf |
+----------------+
2 rows in set (0.00 sec)
巧用RAND()提取随机行
mysql>select * from t9 order by rand() limit 3;
+---------+----------------------------------+
| id | name |
+---------+----------------------------------+
| 1750322 |tcpdump:x:72:72::/:/sbin/nologin |
| 1287049 | bageyalu |
| 1301398 |bin:x:1:1:bin:/bin:/sbin/nologin |
+---------+----------------------------------+
通过rand()来获取随机行
利用GROUP BY的WITH ROLLUP子句
mysql>select date_format(payment_date,'%Y-%m'),staff_id,sum(amount) from paymentgroup by date_format(payment_date,'%Y-%m'),staff_id;
+-----------------------------------+----------+-------------+
| date_format(payment_date,'%Y-%m') |staff_id | sum(amount) |
+-----------------------------------+----------+-------------+
| 2005-05 | 1 | 2621.83 |
| 2005-05 | 2 | 2202.60 |
| 2005-06 | 1 | 4776.36 |
| 2005-06 | 2 | 4855.52 |
| 2005-07 | 1 | 14003.54 |
| 2005-07 | 2 | 14370.35 |
| 2005-08 | 1 | 11853.65 |
| 2005-08 | 2 | 12218.48 |
| 2006-02 | 1 | 234.09 |
| 2006-02 | 2 | 280.09 |
+-----------------------------------+----------+-------------+
10 rows in set (0.08 sec)
mysql>select date_format(payment_date,'%Y-%m'),staff_id,sum(amount) from paymentgroup by date_format(payment_date,'%Y-%m'),staff_id with rollup;
+-----------------------------------+----------+-------------+
| date_format(payment_date,'%Y-%m') |staff_id | sum(amount) |
+-----------------------------------+----------+-------------+
| 2005-05 | 1 | 2621.83 |
| 2005-05 | 2 | 2202.60 |
|2005-05 | NULL | 4824.43 |
| 2005-06 | 1 | 4776.36 |
| 2005-06 | 2 | 4855.52 |
|2005-06 | NULL | 9631.88 |
| 2005-07 | 1 | 14003.54 |
| 2005-07 | 2 | 14370.35 |
|2005-07 | NULL | 28373.89 |
| 2005-08 | 1 | 11853.65 |
| 2005-08 | 2 | 12218.48 |
|2005-08 | NULL | 24072.13 |
| 2006-02 | 1 | 234.09 |
| 2006-02 | 2 | 280.09 |
|2006-02 | NULL | 514.18 |
|NULL | NULL| 67416.51 |
+-----------------------------------+----------+-------------+
16 rows in set (0.06 sec)
通过上面例子可以看出来,下面的查询在每年都计算出来全年的总支持金额,并且在最后也算出了总的支出金额,。其实with rollup反映的是一种OLAP思想,也就是说这个GROUP BY语句执行完成后可以满足用户想要得到的任何一个分组及分组组合的聚合信息值。但是rollup和order by是互斥的,并且limit应该用在rollup之后。
利用BIT GROUP FUNCTIONS做统计
BIT_AND()
按位与,有1则为1
mysql>create table order_rab(id int,customer_id int,kind int);
Query OK, 0 rows affected (0.02 sec)
mysql>insert into order_rab values(1,1,5),(2,1,4);
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql>insert into order_rab values(3,2,3),(4,2,4);
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql>select * from rab;
ERROR 1146 (42S02): Table 'test2.rab'doesn't exist
mysql>select * from order_rab;
+------+-------------+------+
| id | customer_id | kind |
+------+-------------+------+
| 1 | 1 | 5 |
| 2 | 1 | 4 |
| 3 | 2 | 3 |
| 4 | 2 | 4 |
+------+-------------+------+
4 rows in set (0.00 sec)
mysql>select customer_id,bit_or(kind) from order_rab group by customer_id;
+-------------+--------------+
| customer_id | bit_or(kind) |
+-------------+--------------+
| 1 | 5 |
| 2 | 7 |
+-------------+--------------+
2 rows in set (0.00 sec)
上面例子,是5和4做或运算=(0101|0100)=0101=5
3和4做或运算=(0011|0100)=0111=7
BIT_OR()
按位与,同为1则为1
mysql>select customer_id,bit_and(kind) from order_rab group by customer_id;
+-------------+---------------+
| customer_id | bit_and(kind) |
+-------------+---------------+
| 1 | 4 |
| 2 | 0 |
+-------------+---------------+
2 rows in set (0.00 sec)
上面例子,是5和4做与运算=(0101&0100)=0100=4
3和4做或运算=(0011&0100)=0000=0
使用外键要注意的问题
由于MyISAM表不支持外键,所以在使用外键的时候一定不要使用不支持外键的表,并且可以通过show create table来检查是否可以使用外键。