mysql查询带斜杠,使用mySQL语句查找带有斜杠的几乎重复的数据

I am have a table named 'LINK_INFO' with URLs in a field called 'URL'. The problem is, many duplicates URLs exist EXCEPT some have used a trailing / to get around the unique field requirement.

Example:

What is the statement I can use to select these cases of near duplicates, so I can delete one of them? Many thanks if you can help.

解决方案

You can just use TRIM to find all unique values;

SELECT DISTINCT TRIM(TRAILING '/' FROM url) url

FROM link_info

To delete the duplicates right away, just do a delete join;

DELETE li1

FROM link_info li1

JOIN link_info li2

WHERE TRIM(TRAILING '/' FROM li1.url) =

TRIM(TRAILING '/' FROM li2.url)

AND li1.id

Always back up your tables before running arbitrary SQL found on the net, even mine :)

EDIT: If your database machine is limited, you may want to do it using indexes and avoid loading more into memory than necessary;

-- remove all trailing slashes

UPDATE link_info

SET url=TRIM(TRAILING '/' FROM url);

-- create an index on the resulting strings (if there isn't already one)

CREATE INDEX url_index ON link_info(url);

-- delete all duplicates

DELETE li1

FROM link_info li1

JOIN link_info li2

WHERE li1.url = li2.url

AND li1.id

-- drop the index if not needed anymore

DROP INDEX url_index ON link_info;

Yet another SQLfiddle.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值