Python: Rename duplicates in a list with progressive numbers, without sorting the list...

Given a list like this:

mylist = ["name", "state", "name", "city", "name", "zip", "zip"]

I would like to rename the duplicates by appending a number to get the following result:

mylist = ["name1", "state", "name2", "city", "name3", "zip1", "zip2"]

I do not want to change the order of the original list. The solutions suggested for this related Stack Overflow question sort the list, which I do not want to do.

Solution

This is how I would do it. EDIT: I wrote this into a more generalized utility function since people seem to like this answer.

from collections import Counter  # Counter counts the number of occurrences of each item
from itertools import tee, count

mylist = ["name", "state", "name", "city", "name", "zip", "zip"]
check = ["name1", "state", "name2", "city", "name3", "zip1", "zip2"]
copy = mylist[:]  # so we will only mutate the copy in case of failure

def uniquify(seq, suffs=None):
    """Make all the items unique by adding a suffix (1, 2, etc).

    `seq` is a mutable sequence of strings.
    `suffs` is an optional alternative suffix iterable.
    """
    if suffs is None:
        # build a fresh counter per call; a default of count(1) in the
        # signature would be shared, so later calls would not start at 1
        suffs = count(1)
    not_unique = [k for k, v in Counter(seq).items() if v > 1]  # so we have: ['name', 'zip']
    # suffix generator dict - e.g., {'name': <tee iterator>, 'zip': <tee iterator>}
    suff_gens = dict(zip(not_unique, tee(suffs, len(not_unique))))
    for idx, s in enumerate(seq):
        try:
            suffix = str(next(suff_gens[s]))
        except KeyError:
            # s was unique
            continue
        else:
            seq[idx] += suffix

uniquify(copy)
assert copy == check  # raise an error if we failed
mylist = copy  # success
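The function relies on itertools.tee handing out one independent iterator per duplicated key, so 'name' and 'zip' each count from the start on their own. Here is a small sketch of just that behaviour; the variable names are made up for illustration and are not part of the function above.

```python
from itertools import count, tee

# Two independent views over one counter (illustrative names).
name_suffixes, zip_suffixes = tee(count(1), 2)

# Advancing one view does not advance the other.
name_taken = [next(name_suffixes) for _ in range(3)]
zip_taken = [next(zip_suffixes) for _ in range(2)]

print(name_taken)  # [1, 2, 3]
print(zip_taken)   # [1, 2]
```

Each view replays the source from the beginning, which is why every duplicated item starts at suffix 1.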

If you wanted to append an underscore before each count, you could do something like this:

>>> mylist = ["name", "state", "name", "city", "name", "zip", "zip"]
>>> uniquify(mylist, (f'_{x!s}' for x in range(1, 100)))
>>> mylist
['name_1', 'state', 'name_2', 'city', 'name_3', 'zip_1', 'zip_2']

...or if you wanted to use letters instead:

>>> mylist = ["name", "state", "name", "city", "name", "zip", "zip"]
>>> import string
>>> uniquify(mylist, (f'_{x!s}' for x in string.ascii_lowercase))
>>> mylist
['name_a', 'state', 'name_b', 'city', 'name_c', 'zip_a', 'zip_b']
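Both suffix iterables above are finite (99 numbers, 26 letters), so an item repeated more often than that would exhaust them. If that is a concern, one option is an unbounded generator built on itertools.count. This is only a sketch, and `numbered_suffixes` is a hypothetical helper name, not something from the answer above.

```python
from itertools import count, islice

def numbered_suffixes(prefix="_"):
    """Yield '_1', '_2', ... with no upper bound (hypothetical helper)."""
    return (f"{prefix}{n}" for n in count(1))

# Peek at the first few values; the generator itself never runs out.
print(list(islice(numbered_suffixes(), 3)))  # ['_1', '_2', '_3']
```

The result of `numbered_suffixes()` could then be passed as the second argument to uniquify in place of the range-based generator.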

NOTE: this is not the fastest possible algorithm; for that, refer to the answer by ronakg. The advantage of the function above is it is easy to understand and read, and you're not going to see much of a performance difference unless you have an extremely large list.
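For comparison, here is a sketch of a plain two-pass version that keeps a running count per item and returns a new list instead of mutating its argument. I am not claiming this is what ronakg's answer does; the function name `renamed` is an assumption made up for this example.

```python
from collections import Counter

def renamed(seq):
    """Return a new list in which every repeated item gets a 1-based suffix."""
    totals = Counter(seq)  # first pass: how often each item occurs overall
    seen = Counter()       # how often each item has been seen so far
    out = []
    for s in seq:          # second pass: original order is kept
        if totals[s] > 1:
            seen[s] += 1
            out.append(f"{s}{seen[s]}")
        else:
            out.append(s)  # unique items are left untouched
    return out

print(renamed(["name", "state", "name", "city", "name", "zip", "zip"]))
# ['name1', 'state', 'name2', 'city', 'name3', 'zip1', 'zip2']
```

Both passes are linear in the length of the list, and the input list is left unchanged.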

EDIT: Here is my original answer as a one-liner. However, the order is not preserved, and it uses the .index method, which is extremely suboptimal (as explained in the answer by DTing). See the answer by queezz for a nice 'two-liner' that preserves order.

[s + str(suffix) if num>1 else s for s,num in Counter(mylist).items() for suffix in range(1, num+1)]

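# Produces (Python 3.7+, where Counter keeps first-seen order):
# ['name1', 'name2', 'name3', 'state', 'city', 'zip1', 'zip2']
# Older versions may give another order, e.g. ['zip1', 'zip2', 'city', 'state', 'name1', 'name2', 'name3']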
