一.在MySQL中删除字符串或字段中的空格
-
删除字符串开头和末尾的空格:
SELECT TRIM(' your text ');
-
只删除字符串开头的空格:
SELECT LTRIM(' your text ');
-
只删除字符串末尾的空格:
SELECT RTRIM(' your text ');
-
删除字段中所有空格:
UPDATE your_table SET your_column = REPLACE(your_column, ' ', '');
-
如果要删除字段中单词之间的多个空格,保留单词内的空格,可以使用正则表达式:
UPDATE your_table SET your_column = TRIM(BOTH ' ' FROM REPLACE(your_column, ' ', ' '));
二按指定字符截取
-
1、使用SUBSTRING函数进行截取
语法:str是要截取的字符串,pos是起始位置,len是要截取的长度SUBSTRING(str, pos, len) -- 例子 SELECT SUBSTRING('Hello, World!', 1, 5); -- 返回"Hello"。其中,起始位置为1,截取的长度为5
2、使用REGEXP_SUBSTR函数进行截取
语法:str是要截取的字符串,regexp是正则表达式REGEXP_SUBSTR(str, regexp) -- 例子 SELECT REGEXP_SUBSTR('Hello, World!', '[A-Z][a-z]+'); -- 返回"Hello"。其中,正则表达式"[A-Z][a-z]+"表示匹配一个以大写字母开头,后面跟着一个或多个小写字母的子字符串
3、使用LEFT和RIGHT函数进行截取
语法:LEFT函数返回字符串的左边指定长度的字符,RIGHT函数返回字符串的右边指定长度的字符LEFT(str,len) RIGHT(str,len) -- 例子 SELECT LEFT('Hello, World!', 5); SELECT RIGHT('Hello, World!', 6); -- 分别返回"Hello"和"World!"
4、使用REPLACE函数,用于替换字符串的函数,也可以用来进行截取操作
语法:str是要替换的字符串,from_str是要替换的子字符串,to_str是替换后的子字符串REPLACE(str, from_str, to_str) -- 例子 SELECT REPLACE('Hello, World!', 'Hello, ', ''); -- 返回"World!"。其中,将"Hello, "替换为空字符串,实现了字符串的截取
5、使用SUBSTRING_INDEX函数进行截取,可以根据指定的分隔符和位置来截取字符串
语法:str是要截取的字符串,delim是分隔符,count是要截取的子字符串的位置SUBSTRING_INDEX(str, delim, count) -- 例子 SELECT SUBSTRING_INDEX('www.example.com', '.', 2); -- 返回字符串"www.example"。其中,使用"."作为分隔符,2返回前两个分割的数据
三.需求将下面html获取一些营养数据
<!doctype html><html><head><meta charset="utf-8"><meta name="viewport" content="width=device-width,initial-scale=1,user-scalable=0"><title>食物营养查询平台</title><link rel="stylesheet" href="../frontcss/weui.min.css"><link rel="stylesheet" href="../frontcss/frame.css"><link rel="stylesheet" href="../frontcss/public.css"><link rel="stylesheet" href="../frontcss/tabnav.css"><script src="../script/button.js" type="text/javascript"></script><script src="../script/jquery.cookie.js" type="text/javascript" ></script><script src="../script/include.js" type="text/javascript" ></script></head><body><div class="details public_width public_bottom"><!--start--> <div class="details_top"> </div> <div class="details_table"> <div class="title">能量与相关成分</div> <table width="100%" cellpadding="0" cellspacing="0"> <tbody> <tr> <th scope="col">项目</th> <th scope="col">含量</th> <th scope="col">同类排名</th> <th scope="col">同类均值</th> </tr> <tr> <td>食部</td> <td>100%</td> <td>1</td> <td>100%</td> <td style="display: none">100.0%</td> </tr> <tr> <td>水分</td> <td>12.7g</td> <td>14</td> <td>26.6g</td> <td style="display: none">48.0%</td> </tr> <tr> <td>能量</td> <td>1497KJ</td> <td>5</td> <td>1310KJ</td> <td style="display: none">114.0%</td> </tr> <tr> <td>蛋白质</td> <td>11.2g</td> <td>4</td> <td>10.3g</td> <td style="display: none">108.0%</td> </tr> <tr> <td>脂肪</td> <td>1.5g</td> <td>7</td> <td>3.9g</td> <td style="display: none">38.0%</td> </tr> <tr> <td>胆固醇</td> <td></td> <td></td> <td></td> <td style="display: none"></td> </tr> <tr> <td>灰分</td> <td>1.0g</td> <td>8</td> <td>1.0g</td> <td style="display: none">100.0%</td> </tr> <tr> <td>碳水化合物(总)</td> <td>73.6g</td> <td>7</td> <td>58.1g</td> <td style="display: none">127.0%</td> </tr> <tr> <td>总膳食纤维</td> <td></td> <td></td> <td></td> <td style="display: none"></td> </tr> </tbody> </table> </div> <div class="details_table"> <div class="title">维生素</div> <table width="100%" cellpadding="0" cellspacing="0"> <tbody> <tr> <th scope="col">项目</th> <th scope="col">含量</th> <th scope="col">同类排名</th> <th scope="col">同类均值</th> </tr> <tr> <td>胡萝卜素</td> <td></td> <td></td> <td></td> <td style="display: none"></td> </tr> <tr> <td>维生素A</td> <td></td> <td></td> <td></td> <td style="display: none"></td> </tr> <tr> <td>α-TE</td> <td>1.60mg</td> <td>4</td> <td>1.25mg</td> <td style="display: none">128.0%</td> </tr> <tr> <td>硫胺素</td> <td>0.28mg</td> <td>2</td> <td>0.13mg</td> <td style="display: none">215.0%</td> </tr> <tr> <td>核黄素</td> <td>0.08mg</td> <td>2</td> <td>0.05mg</td> <td style="display: none">160.0%</td> </tr> <tr> <td>烟酸</td> <td>2.00mg</td> <td>8</td> <td>1.33mg</td> <td style="display: none">150.0%</td> </tr> <tr> <td>维生素C</td> <td></td> <td></td> <td></td> <td style="display: none"></td> </tr> </tbody> </table> </div> <div class="details_table"> <div class="title">矿物质</div> <table width="100%" cellpadding="0" cellspacing="0"> <tbody> <tr> <th scope="col">项目</th> <th scope="col">含量</th> <th scope="col">同类排名</th> <th scope="col">同类均值</th> </tr> <tr> <td>钙</td> <td>31mg</td> <td>6</td> <td>26mg</td> <td style="display: none">120.0%</td> </tr> <tr> <td>磷</td> <td>188mg</td> <td>1</td> <td>121mg</td> <td style="display: none">155.0%</td> </tr> <tr> <td>钾</td> <td>190mg</td> <td>3</td> <td>132mg</td> <td style="display: none">144.0%</td> </tr> <tr> <td>钠</td> <td>3.1mg</td> <td>19</td> <td>134.6mg</td> <td style="display: none">2.0%</td> </tr> <tr> <td>镁</td> <td>50mg</td> <td>5</td> <td>36mg</td> <td style="display: none">137.0%</td> </tr> <tr> <td>铁</td> <td>3.5mg</td> <td>5</td> <td>3.0mg</td> <td style="display: none">119.0%</td> </tr> <tr> <td>锌</td> <td>1.64mg</td> <td>4</td> <td>1.09mg</td> <td style="display: none">150.0%</td> </tr> <tr> <td>硒</td> <td>5.40μg</td> <td>19</td> <td>9.83μg</td> <td style="display: none">55.0%</td> </tr> <tr> <td>铜</td> <td>0.42mg</td> <td>3</td> <td>0.23mg</td> <td style="display: none">183.0%</td> </tr> <tr> <td>锰</td> <td>1.56mg</td> <td>1</td> <td>0.81mg</td> <td style="display: none">193.0%</td> </tr> <tr> <td>碘</td> <td></td> <td></td> <td></td> <td style="display: none"></td> </tr> </tbody> </table> </div> <div class="details_table"> <div class="title">脂肪酸</div> <table width="100%" cellpadding="0" cellspacing="0"> <tbody> <tr> <th scope="col">项目</th> <th scope="col">含量</th> <th scope="col">同类排名</th> <th scope="col">同类均值</th> </tr> <tr> <td>饱和脂肪酸</td> <td>30.300g</td> <td>1</td> <td>7.290g</td> <td style="display: none">416.0%</td> </tr> <tr> <td>单不饱和脂肪酸</td> <td>24.100g</td> <td>2</td> <td>8.500g</td> <td style="display: none">284.0%</td> </tr> <tr> <td>多不饱和脂肪酸</td> <td>44.800g</td> <td>2</td> <td>12.010g</td> <td style="display: none">373.0%</td> </tr> <tr> <td>总脂肪酸</td> <td>99.200g</td> <td>2</td> <td>27.800g</td> <td style="display: none">357.0%</td> </tr> </tbody> </table> </div> <div class="details_table"> <div class="title">氨基酸</div> <table width="100%" cellpadding="0" cellspacing="0"> <tbody> <tr> <th scope="col">项目</th> <th scope="col">含量</th> <th scope="col">同类排名</th> <th scope="col">同类均值</th> </tr> </tbody> </table> </div> </body></html>
-- 先把html的空格去掉
UPDATE y_foods
SET contenthtml = REPLACE(contenthtml, ' ', '');
-- #观察html内容,1.提取关键字的开始位置作为一个临时,2.截取相应长度的值作为一个临时,3.截取到的数据更新到表中,4.在进行相应清洗
-- <td>能量</td><td> <td>蛋白质</td><td> <td>脂肪</td><td> <td>碳水化合物(总)</td><td> <td>总膳食纤维</td><td>
-- <td>钙</td><td> <td>磷</td><td> <td>钾</td><td> <td>钠</td><td> <td>镁</td><td> <td>铁</td><td> <td>锌</td><td> <td>硒</td><td> <td>铜</td><td> <td>锰</td><td>
UPDATE y_foods tfood
LEFT JOIN (
SELECT
foods.id,
SUBSTRING( contenthtml, viewindex.nl, 6 ) AS nl,
SUBSTRING( contenthtml, viewindex.dbz, 6 ) AS dbz,
SUBSTRING( contenthtml, viewindex.zf, 6 ) AS zf,
SUBSTRING( contenthtml, viewindex.cshhw, 6 ) AS cshhw,
SUBSTRING( contenthtml, viewindex.ssxw, 6 ) AS ssxw,
SUBSTRING( contenthtml, viewindex.ca, 6 ) AS ca,
SUBSTRING( contenthtml, viewindex.lin, 6 ) AS lin,
SUBSTRING( contenthtml, viewindex.k, 6 ) AS k,
SUBSTRING( contenthtml, viewindex.na, 6 ) AS na,
SUBSTRING( contenthtml, viewindex.mg, 6 ) AS mg,
SUBSTRING( contenthtml, viewindex.fe, 6 ) AS fe,
SUBSTRING( contenthtml, viewindex.zn, 6 ) AS zn,
SUBSTRING( contenthtml, viewindex.xi, 6 ) AS xi,
SUBSTRING( contenthtml, viewindex.cu, 6 ) AS cu,
SUBSTRING( contenthtml, viewindex.men, 6 ) AS men
FROM
y_foods foods
LEFT JOIN (
SELECT
id,
LOCATE( '<td>能量</td><td>', contenthtml ) + 15 AS nl,
LOCATE( '<td>蛋白质</td><td>', contenthtml ) + 16 AS dbz,
LOCATE( '<td>脂肪</td><td>', contenthtml ) + 15 AS zf,
LOCATE( '<td>碳水化合物(总)</td><td>', contenthtml ) + 21 AS cshhw,
LOCATE( '<td>总膳食纤维</td><td>', contenthtml ) + 18 AS ssxw,
LOCATE( '<td>钙</td><td>', contenthtml ) + 14 AS ca,
LOCATE( '<td>磷</td><td>', contenthtml ) + 14 AS lin,
LOCATE( '<td>钾</td><td>', contenthtml ) + 14 AS k,
LOCATE( '<td>钠</td><td>', contenthtml ) + 14 AS na,
LOCATE( '<td>镁</td><td>', contenthtml ) + 14 AS mg,
LOCATE( '<td>铁</td><td>', contenthtml ) + 14 AS fe,
LOCATE( '<td>锌</td><td>', contenthtml ) + 14 AS zn,
LOCATE( '<td>硒</td><td>', contenthtml ) + 14 AS xi,
LOCATE( '<td>铜</td><td>', contenthtml ) + 14 AS cu,
LOCATE( '<td>锰</td><td>', contenthtml ) + 14 AS men
FROM
y_foods
) viewindex ON foods.id = viewindex.id
) viewvalue ON tfood.id = viewvalue.id
SET tfood.nl = viewvalue.nl,
tfood.dbz = viewvalue.dbz,
tfood.zf = viewvalue.zf,
tfood.cshhw = viewvalue.cshhw,
tfood.ssxw = viewvalue.ssxw,
tfood.ca = viewvalue.ca,
tfood.lin = viewvalue.lin,
tfood.k = viewvalue.k,
tfood.na = viewvalue.na,
tfood.mg = viewvalue.mg,
tfood.fe = viewvalue.fe,
tfood.zn = viewvalue.zn,
tfood.xi = viewvalue.xi,
tfood.cu = viewvalue.cu,
tfood.men = viewvalue.men;