mysql字符串提取总结

一.在MySQL中删除字符串或字段中的空格

  1. 删除字符串开头和末尾的空格:

    SELECT TRIM('  your text  ');
  2. 只删除字符串开头的空格:

    SELECT LTRIM('  your text  ');
  3. 只删除字符串末尾的空格:

    SELECT RTRIM('  your text  ');
  4. 删除字段中所有空格:

    UPDATE your_table
    SET your_column = REPLACE(your_column, ' ', '');
  5. 如果要删除字段中单词之间的多个空格,保留单词内的空格,可以使用正则表达式:

    UPDATE your_table
    SET your_column = TRIM(BOTH ' ' FROM REPLACE(your_column, '  ', ' '));

    二按指定字符截取

  6. 1、使用SUBSTRING函数进行截取
    语法:str是要截取的字符串,pos是起始位置,len是要截取的长度

    SUBSTRING(str, pos, len)
    
    -- 例子
    
    SELECT SUBSTRING('Hello, World!', 1, 5);
    
    --  返回"Hello"。其中,起始位置为1,截取的长度为5

    2、使用REGEXP_SUBSTR函数进行截取
    语法:str是要截取的字符串,regexp是正则表达式

    REGEXP_SUBSTR(str, regexp)
    
    -- 例子
    
    SELECT REGEXP_SUBSTR('Hello, World!', '[A-Z][a-z]+');
    
    -- 返回"Hello"。其中,正则表达式"[A-Z][a-z]+"表示匹配一个以大写字母开头,后面跟着一个或多个小写字母的子字符串

    3、使用LEFT和RIGHT函数进行截取
    语法:LEFT函数返回字符串的左边指定长度的字符,RIGHT函数返回字符串的右边指定长度的字符

    LEFT(str,len)
    RIGHT(str,len)
    -- 例子
    
    SELECT LEFT('Hello, World!', 5);
    SELECT RIGHT('Hello, World!', 6);
    
    -- 分别返回"Hello"和"World!"

    4、使用REPLACE函数,用于替换字符串的函数,也可以用来进行截取操作
    语法:str是要替换的字符串,from_str是要替换的子字符串,to_str是替换后的子字符串

    REPLACE(str, from_str, to_str)
    
    -- 例子
    
    SELECT REPLACE('Hello, World!', 'Hello, ', '');
    
    -- 返回"World!"。其中,将"Hello, "替换为空字符串,实现了字符串的截取

    5、使用SUBSTRING_INDEX函数进行截取,可以根据指定的分隔符和位置来截取字符串
    语法:str是要截取的字符串,delim是分隔符,count是要截取的子字符串的位置

    SUBSTRING_INDEX(str, delim, count)
    
    -- 例子
    
    SELECT SUBSTRING_INDEX('www.example.com', '.', 2);
    
    -- 返回字符串"www.example"。其中,使用"."作为分隔符,2返回前两个分割的数据

三.需求将下面html获取一些营养数据

<!doctype html><html><head><meta charset="utf-8"><meta name="viewport" content="width=device-width,initial-scale=1,user-scalable=0"><title>食物营养查询平台</title><link rel="stylesheet" href="../frontcss/weui.min.css"><link rel="stylesheet" href="../frontcss/frame.css"><link rel="stylesheet" href="../frontcss/public.css"><link rel="stylesheet" href="../frontcss/tabnav.css"><script src="../script/button.js" type="text/javascript"></script><script src="../script/jquery.cookie.js" type="text/javascript" ></script><script src="../script/include.js" type="text/javascript" ></script></head><body><div class="details public_width public_bottom"><!--start-->  <div class="details_top">  </div>     <div class="details_table">      <div class="title">能量与相关成分</div>      <table width="100%" cellpadding="0" cellspacing="0">        <tbody>          <tr>            <th scope="col">项目</th>            <th scope="col">含量</th>            <th scope="col">同类排名</th>            <th scope="col">同类均值</th>          </tr>              <tr>                <td>食部</td>                <td>100%</td>                <td>1</td>                <td>100%</td>                <td style="display: none">100.0%</td>              </tr>              <tr>                <td>水分</td>                <td>12.7g</td>                <td>14</td>                <td>26.6g</td>                <td style="display: none">48.0%</td>              </tr>              <tr>                <td>能量</td>                <td>1497KJ</td>                <td>5</td>                <td>1310KJ</td>                <td style="display: none">114.0%</td>              </tr>              <tr>                <td>蛋白质</td>                <td>11.2g</td>                <td>4</td>                <td>10.3g</td>                <td style="display: none">108.0%</td>              </tr>              <tr>                <td>脂肪</td>                <td>1.5g</td>                <td>7</td>                <td>3.9g</td>                <td style="display: none">38.0%</td>              </tr>              <tr>                <td>胆固醇</td>                <td></td>                <td></td>                <td></td>                <td style="display: none"></td>              </tr>              <tr>                <td>灰分</td>                <td>1.0g</td>                <td>8</td>                <td>1.0g</td>                <td style="display: none">100.0%</td>              </tr>              <tr>                <td>碳水化合物(总)</td>                <td>73.6g</td>                <td>7</td>                <td>58.1g</td>                <td style="display: none">127.0%</td>              </tr>              <tr>                <td>总膳食纤维</td>                <td></td>                <td></td>                <td></td>                <td style="display: none"></td>              </tr>        </tbody>      </table>    </div>     <div class="details_table">      <div class="title">维生素</div>      <table width="100%" cellpadding="0" cellspacing="0">        <tbody>          <tr>            <th scope="col">项目</th>            <th scope="col">含量</th>            <th scope="col">同类排名</th>            <th scope="col">同类均值</th>          </tr>              <tr>                <td>胡萝卜素</td>                <td></td>                <td></td>                <td></td>                <td style="display: none"></td>              </tr>              <tr>                <td>维生素A</td>                <td></td>                <td></td>                <td></td>                <td style="display: none"></td>              </tr>              <tr>                <td>α-TE</td>                <td>1.60mg</td>                <td>4</td>                <td>1.25mg</td>                <td style="display: none">128.0%</td>              </tr>              <tr>                <td>硫胺素</td>                <td>0.28mg</td>                <td>2</td>                <td>0.13mg</td>                <td style="display: none">215.0%</td>              </tr>              <tr>                <td>核黄素</td>                <td>0.08mg</td>                <td>2</td>                <td>0.05mg</td>                <td style="display: none">160.0%</td>              </tr>              <tr>                <td>烟酸</td>                <td>2.00mg</td>                <td>8</td>                <td>1.33mg</td>                <td style="display: none">150.0%</td>              </tr>              <tr>                <td>维生素C</td>                <td></td>                <td></td>                <td></td>                <td style="display: none"></td>              </tr>        </tbody>      </table>    </div>     <div class="details_table">      <div class="title">矿物质</div>      <table width="100%" cellpadding="0" cellspacing="0">        <tbody>          <tr>            <th scope="col">项目</th>            <th scope="col">含量</th>            <th scope="col">同类排名</th>            <th scope="col">同类均值</th>          </tr>              <tr>                <td>钙</td>                <td>31mg</td>                <td>6</td>                <td>26mg</td>                <td style="display: none">120.0%</td>              </tr>              <tr>                <td>磷</td>                <td>188mg</td>                <td>1</td>                <td>121mg</td>                <td style="display: none">155.0%</td>              </tr>              <tr>                <td>钾</td>                <td>190mg</td>                <td>3</td>                <td>132mg</td>                <td style="display: none">144.0%</td>              </tr>              <tr>                <td>钠</td>                <td>3.1mg</td>                <td>19</td>                <td>134.6mg</td>                <td style="display: none">2.0%</td>              </tr>              <tr>                <td>镁</td>                <td>50mg</td>                <td>5</td>                <td>36mg</td>                <td style="display: none">137.0%</td>              </tr>              <tr>                <td>铁</td>                <td>3.5mg</td>                <td>5</td>                <td>3.0mg</td>                <td style="display: none">119.0%</td>              </tr>              <tr>                <td>锌</td>                <td>1.64mg</td>                <td>4</td>                <td>1.09mg</td>                <td style="display: none">150.0%</td>              </tr>              <tr>                <td>硒</td>                <td>5.40μg</td>                <td>19</td>                <td>9.83μg</td>                <td style="display: none">55.0%</td>              </tr>              <tr>                <td>铜</td>                <td>0.42mg</td>                <td>3</td>                <td>0.23mg</td>                <td style="display: none">183.0%</td>              </tr>              <tr>                <td>锰</td>                <td>1.56mg</td>                <td>1</td>                <td>0.81mg</td>                <td style="display: none">193.0%</td>              </tr>              <tr>                <td>碘</td>                <td></td>                <td></td>                <td></td>                <td style="display: none"></td>              </tr>        </tbody>      </table>    </div>     <div class="details_table">      <div class="title">脂肪酸</div>      <table width="100%" cellpadding="0" cellspacing="0">        <tbody>          <tr>            <th scope="col">项目</th>            <th scope="col">含量</th>            <th scope="col">同类排名</th>            <th scope="col">同类均值</th>          </tr>              <tr>                <td>饱和脂肪酸</td>                <td>30.300g</td>                <td>1</td>                <td>7.290g</td>                <td style="display: none">416.0%</td>              </tr>              <tr>                <td>单不饱和脂肪酸</td>                <td>24.100g</td>                <td>2</td>                <td>8.500g</td>                <td style="display: none">284.0%</td>              </tr>              <tr>                <td>多不饱和脂肪酸</td>                <td>44.800g</td>                <td>2</td>                <td>12.010g</td>                <td style="display: none">373.0%</td>              </tr>              <tr>                <td>总脂肪酸</td>                <td>99.200g</td>                <td>2</td>                <td>27.800g</td>                <td style="display: none">357.0%</td>              </tr>        </tbody>      </table>    </div>     <div class="details_table">      <div class="title">氨基酸</div>      <table width="100%" cellpadding="0" cellspacing="0">        <tbody>          <tr>            <th scope="col">项目</th>            <th scope="col">含量</th>            <th scope="col">同类排名</th>            <th scope="col">同类均值</th>          </tr>        </tbody>      </table>    </div>  </body></html>
-- 先把html的空格去掉
UPDATE y_foods
SET contenthtml = REPLACE(contenthtml, ' ', '');

-- #观察html内容,1.提取关键字的开始位置作为一个临时,2.截取相应长度的值作为一个临时,3.截取到的数据更新到表中,4.在进行相应清洗

--  <td>能量</td><td>     <td>蛋白质</td><td>    <td>脂肪</td><td>    <td>碳水化合物(总)</td><td>  <td>总膳食纤维</td><td>
--   <td>钙</td><td>   <td>磷</td><td>  <td>钾</td><td>  <td>钠</td><td>  <td>镁</td><td>  <td>铁</td><td>  <td>锌</td><td>  <td>硒</td><td>  <td>铜</td><td>  <td>锰</td><td>


UPDATE y_foods tfood
LEFT JOIN (
SELECT
	foods.id,
	SUBSTRING( contenthtml, viewindex.nl, 6 ) AS nl,
	SUBSTRING( contenthtml, viewindex.dbz, 6 ) AS dbz,
	SUBSTRING( contenthtml, viewindex.zf, 6 ) AS zf,
	SUBSTRING( contenthtml, viewindex.cshhw, 6 ) AS cshhw,
	SUBSTRING( contenthtml, viewindex.ssxw, 6 ) AS ssxw,
	SUBSTRING( contenthtml, viewindex.ca, 6 ) AS ca,
	SUBSTRING( contenthtml, viewindex.lin, 6 ) AS lin,
	SUBSTRING( contenthtml, viewindex.k, 6 ) AS k,
	SUBSTRING( contenthtml, viewindex.na, 6 ) AS na,
	SUBSTRING( contenthtml, viewindex.mg, 6 ) AS mg,
	SUBSTRING( contenthtml, viewindex.fe, 6 ) AS fe,
	SUBSTRING( contenthtml, viewindex.zn, 6 ) AS zn,
	SUBSTRING( contenthtml, viewindex.xi, 6 ) AS xi,
	SUBSTRING( contenthtml, viewindex.cu, 6 ) AS cu,
	SUBSTRING( contenthtml, viewindex.men, 6 ) AS men 
FROM
	y_foods foods
	LEFT JOIN (
SELECT
	id,
	LOCATE( '<td>能量</td><td>', contenthtml ) + 15 AS nl,
	LOCATE( '<td>蛋白质</td><td>', contenthtml ) + 16 AS dbz,
	LOCATE( '<td>脂肪</td><td>', contenthtml ) + 15 AS zf,
	LOCATE( '<td>碳水化合物(总)</td><td>', contenthtml ) + 21 AS cshhw,
	LOCATE( '<td>总膳食纤维</td><td>', contenthtml ) + 18 AS ssxw,
	LOCATE( '<td>钙</td><td>', contenthtml ) + 14 AS ca,
	LOCATE( '<td>磷</td><td>', contenthtml ) + 14 AS lin,
	LOCATE( '<td>钾</td><td>', contenthtml ) + 14 AS k,
	LOCATE( '<td>钠</td><td>', contenthtml ) + 14 AS na,
	LOCATE( '<td>镁</td><td>', contenthtml ) + 14 AS mg,
	LOCATE( '<td>铁</td><td>', contenthtml ) + 14 AS fe,
	LOCATE( '<td>锌</td><td>', contenthtml ) + 14 AS zn,
	LOCATE( '<td>硒</td><td>', contenthtml ) + 14 AS xi,
	LOCATE( '<td>铜</td><td>', contenthtml ) + 14 AS cu,
	LOCATE( '<td>锰</td><td>', contenthtml ) + 14 AS men 
FROM
	y_foods 
	) viewindex ON foods.id = viewindex.id 
	) viewvalue ON tfood.id = viewvalue.id 
	SET tfood.nl = viewvalue.nl,
	tfood.dbz = viewvalue.dbz,
	tfood.zf = viewvalue.zf,
	tfood.cshhw = viewvalue.cshhw,
	tfood.ssxw = viewvalue.ssxw,
	tfood.ca = viewvalue.ca,
	tfood.lin = viewvalue.lin,
	tfood.k = viewvalue.k,
	tfood.na = viewvalue.na,
	tfood.mg = viewvalue.mg,
	tfood.fe = viewvalue.fe,
	tfood.zn = viewvalue.zn,
	tfood.xi = viewvalue.xi,
	tfood.cu = viewvalue.cu,
	tfood.men = viewvalue.men;

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

软件编程工程师

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值