mysql分隔字符串转变为多行_【mysql】将以固定分隔符分隔的字符串转成多行的形式...

背景

近期在做用户标签项目,目前标签的存储是用户id,标签ids(每个标签以,形式存储) 的形式,但是如果想统计前后两天标签的变化,使用find_in_set函数,一方面查询速度慢(因为不能使用索引),另一方面目前标签已有将近300多个,以后还会更多,一个标签一个标签的写,使得sql特别长。

目前采取的策略是:将用户标签表拆分成 用户id,标签id的形式。这样将前后两天的表关联,就能查出昨天到今天有那些标签离开,哪些标签进来。

那么采用上述策略就需要研究怎么将用户id,标签ids的形式转换成用户id,标签id的形式。

探索

我们知道像 1,2,3,4,5,212 这种字符串,如果要分别取到1 2 3 4 5 212,用编程的思想就是先将该字符串用","分隔成一个数据,然后遍历取到数组里的每一个值,但是在mysql里并没有数组的概念,但是我们可以用各种方法求得字符串的长度,以及求得使用“,”分隔后有多少个值。也可以用mysql可以采取的字符串截取的形式去获得相应位置的数值。下面就让我们看一下吧~

实现

相关表结构

CREATE TABLE `tagids_label` (

`userid` int(11) NOT NULL COMMENT '用户id',

`label` int(11) NOT NULL COMMENT '标记,暂时 保留三天的数据,day%3 ',

`day` int(11) NOT NULL COMMENT '对应的统计日期的天',

`tagids` text NOT NULL COMMENT '标签id,以,(英文)分隔',

`createTime` datetime NOT NULL COMMENT '创建时间',

`updateTime` datetime NOT NULL COMMENT '更新时间',

PRIMARY KEY (`userid`,`label`),

KEY `index_day` (`day`),

KEY `index_label` (`label`),

KEY `index_label_userid` (`userid`,`label`),

KEY `index_createTime_userid` (`userid`,`createTime`),

KEY `index_userid` (`userid`),

KEY `index_createtime` (`createTime`) USING BTREE,

FULLTEXT KEY `index_tagids` (`tagids`)

) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='用户标签结果表'

CREATE TABLE `sequence` (

`seq` int(3) DEFAULT NULL

) ENGINE=InnoDB DEFAULT CHARSET=utf8

ps:sequence表表示标签的个数,从1到最大个数

获得固定分隔符分隔后元素个数

原数据

SELECT * FROM `tagids_label` WHERE `userid` =2

f9e9c586fe06

blockchain

171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227

分隔符分隔后元素个数

SELECT length(`tagids`) ,length(REPLACE (`tagids`,',','')),length(`tagids`)-length(REPLACE (`tagids`,',',''))+1 FROM `tagids_label` WHERE `userid` =2

f9e9c586fe06

image.png

注:length(tagids)计算字符串长度,以字节为单位,每个数字、英文标点符号是一个字节,每个中文、中文标点符号是3个字节。length(tagids)表示tagids含有多少个数字和标点符号

replace(tagids,',','')将字符串tagids里的","用空字符来代替,length(replace(tagids,',',''))就表示tagids含有多少个数字。

length(tagids)-length(replace(tagids,',','')) 表示tagsid含有多少个标点符号,那标点符号+1就表示tagids用","分隔符分隔后含有多少个元素,即标签个数。

substring_index 截取字符串

根据关键字","截取字符串

SELECT substring_index('171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227', ',',1)

UNION ALL

SELECT substring_index('171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227', ',',2)

UNION ALL

SELECT substring_index('171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227', ',',3)

UNION ALL

SELECT substring_index('171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227', ',',4)

UNION ALL

......

UNION ALL

SELECT substring_index('171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227',',',61)

结果:

171

171,172

171,172,173

171,172,173,174

......

171,172,173,174,175,184,187,189,191,192,49,52,55,90,96,101,104,110,7,9,253,270,277,280,129,131,134,136,138,139,231,241,58,63,66,70,72,75,77,79,84,149,150,159,163,165,166,193,195,256,225,236,246,248,197,200,207,221,210,278,227

注:substring_index(str,delim,count) 说明:substring_index(被截取字段,关键字,关键字出现的次数),如果count=-1我们就可以截取到倒数第一个被关键字分隔的元素。只要在上面查询结果中再使用一次substring_index即可获得每个被关键字分隔的元素。

最终实现

SELECT

userid,

SUBSTRING_INDEX(

SUBSTRING_INDEX(tagids, ',', seq),

',' ,- 1

) sub_id,

seq

FROM sequence

JOIN (SELECT * FROM `tagids_label` WHERE userid = 2)b

WHERE

seq BETWEEN 1

AND (

SELECT

1 + LENGTH(tagids) - LENGTH(replace(tagids, ',', ''))

)

ORDER BY

userid,

tagids;

结果:

f9e9c586fe06

image.png

f9e9c586fe06

image.png

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值