Python: reading the words in a string_Replacing words in a string with words from a list using Python

I'm working on creating a word cloud program in Python and I'm getting stuck on a word replace function. I am trying to replace a set of numbers in an html file (so I'm working with a string) with words from an ordered list. So 000 would be replaced with the first word in the list, 001 with the second, etc.

Below, I have it selecting the word to replace (w) properly, but I can't get it to replace that word with the words from the list. Any help is appreciated. Thanks!

    def replace_all():
        text = '000 001 002 003 '
        word = ['foo', 'bar', 'that', 'these']
        for a in word:
            y = -1
            for w in text:
                y = y + 1
                x = "00"+str(y)
                w = {x:a}
                for i, j in w.iteritems():
                    text = text.replace(i, j)
        print text

Solution

This is actually a really simple list comprehension:

    >>> text = '000 001 002 003 '
    >>> words = ['foo', 'bar', 'that', 'these']
    >>> [words[int(item)] for item in text.split()]
    ['foo', 'bar', 'that', 'these']

Edit: If you need other values to be left alone, this can be catered for:

    def get(seq, item):
        try:
            return seq[int(item)]
        except ValueError:
            return item

Then simply use something like [get(words, item) for item in text.split()] - naturally, more testing might be required in get() if there will be other numbers in the string that could get accidentally replaced. (End of edit)
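As a rough sketch of the extra testing mentioned above: `get()` as written would raise `IndexError` for a number such as `42`, because only `ValueError` is caught. The variant below is hypothetical (the name `get_strict` and the rule "exactly three digits and within range" are assumptions, not part of the original answer) and is written for Python 3.

```python
import re

def get_strict(seq, item):
    """Return seq[int(item)] only for three-digit tokens that are valid indices."""
    # Assumption: a placeholder is exactly three ASCII digits, e.g. '000'.
    if re.fullmatch(r'[0-9]{3}', item):
        index = int(item)
        if index < len(seq):
            return seq[index]
    # Anything else (words, other numbers, out-of-range codes) is left alone.
    return item

words = ['foo', 'bar', 'that', 'these']
text = '000 hello 42 003 999'
result = [get_strict(words, item) for item in text.split()]
print(result)  # ['foo', 'hello', '42', 'these', '999']
```

Here `42` is not three digits and `999` is out of range, so both pass through unchanged.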

What we do is split the text into the individual numbers, then convert them to integers and use them to index the list you have given to find words.
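Since the question starts from a string and presumably wants a string back, the list can be joined again. This is a minimal sketch, assuming Python 3 and that a single space between words is acceptable; the variable name `replaced` is illustrative.

```python
text = '000 001 002 003 '
words = ['foo', 'bar', 'that', 'these']

# Split on whitespace, look up each number in the list, then rejoin with spaces.
replaced = ' '.join(words[int(item)] for item in text.split())
print(replaced)  # foo bar that these
```

Note that `split()` followed by `join()` normalises the whitespace, so the trailing space of the original string is dropped.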

As to why your code doesn't work, the main issue is that you are looping over the string, which gives you characters, not words. Even with that fixed, though, it's not a great way of solving the task.
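To make the failure concrete, here is a small sketch, assuming Python 3. The function `replace_all_py3` is a hypothetical port of the loop from the question: the counter is written with `enumerate()` and the one-item dictionary is dropped, but the behaviour should be the same.

```python
text = '000 001 002 003 '

# Iterating over a string yields single characters, not whitespace-separated words.
first_chars = [w for w in text][:5]
print(first_chars)  # ['0', '0', '0', ' ', '0']

def replace_all_py3(text, words):
    """Port of the question's nested loops, returning the text instead of printing it."""
    for a in words:
        # y counts characters (0..15 here), so every key "000".."003" is tried
        # while `a` is still the first word.
        for y, _ in enumerate(text):
            x = "00" + str(y)
            text = text.replace(x, a)
    return text

broken = replace_all_py3(text, ['foo', 'bar', 'that', 'these'])
print(repr(broken))  # 'foo foo foo foo '
```

Every placeholder is replaced during the first pass of the outer loop, so by the time `'bar'` comes around there is nothing left to replace.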

It's also worth a quick note that when you are looping over values and want indices to go with them, you should use the enumerate() builtin rather than using a counting variable.

E.g., instead of:

    y = -1
    for w in text:
        y = y + 1
        ...

Use:

    for y, w in enumerate(text):
        ...

This is much more readable and Pythonic.

Another thing with your existing code is this:

    w = {x:a}
    for i, j in w.iteritems():
        text = text.replace(i, j)

Which, if you think about it, simplifies down to:

text = text.replace(x, a)

You are setting w to be a dictionary of one item, then looping over it, but you know it will only ever contain one item.

A solution that follows your method more closely would be something like this:

    words_dict = {"{0:03d}".format(index): value for index, value in enumerate(words)}
    for key, value in words_dict.items():
        text = text.replace(key, value)

We create a dictionary from the zero-padded number string (using str.format()) to the value, then replace for each item. Note that as you are using 2.x, you'll want dict.iteritems(), and if you are pre-2.7, use the dict() builtin on a generator of tuples, as dict comprehensions don't exist there.
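The `dict()` variant described above might look like the following sketch. It avoids the dict comprehension, and as far as I can tell it should run on both Python 2.6+ and Python 3, since `items()` and single-argument `print(...)` work on both.

```python
text = '000 001 002 003 '
words = ['foo', 'bar', 'that', 'these']

# dict() over a generator of (key, value) tuples; no dict comprehension needed.
words_dict = dict(("{0:03d}".format(index), value) for index, value in enumerate(words))

# Replace each zero-padded key with its word.
for key, value in words_dict.items():
    text = text.replace(key, value)

print(text)
```

One caveat with repeated `str.replace()` calls: a key is replaced wherever it occurs, so `000` inside a longer number such as `1000` would also be rewritten.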
