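Python: iterating over data and printing the result: what is the best way to iterate over a Python list, exclude certain values and print the result...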

Latest recommended article published on 2021-01-30 05:08:02

weixin_39680208

Article tags: Python, iterating over data and printing the result

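The author describes a problem encountered while processing text data in Python: extracting information from a text file containing 20 tweets. Despite trying several approaches, such as looking for similar questions, reading the documentation and using list comprehensions, they could not filter out non-English characters, strings beginning with 'Photo:' and 'None' values. The goal is to remove this unwanted data and, ideally, to exclude non-Unicode data as well. The solution is a filter function named `legit`, applied through a list comprehension.

(Summary automatically generated by CSDN.)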

I am new to Python and have a question:

I have checked the Dive Into Python tutorial, the Python documentation, Google and Bing, similar Stack Overflow questions and a dozen other tutorials.

I have a section of Python code that reads a text file containing 20 tweets. I am able to extract these 20 tweets using the following code:

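    import json

    data = []
    # Each line of output.txt holds one tweet as a JSON object.
    with open('output.txt') as fp:
        for line in iter(fp.readline, ''):
            Tweets = json.loads(line)
            # .get() returns None when the tweet has no 'text' key.
            data.append(Tweets.get('text'))

    i = 0
    while i < len(data):
        print(data[i])
        i = i + 1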

The above while loop iterates perfectly and prints out the 20 tweets (lines) from output.txt.

However, these 20 lines contain non-English character data like "Los ladillo a los dos, soy maaaala o maloooooooooooo", URLs like "http://t.co/57LdpK", the string "None" and photos with a URL, like "Photo: http://t.co/kxpaaaaa" (I have edited this for privacy).

I would like to purge the output of this (which is a list), and exclude the following:

The None entries

Anything beginning with the string "Photo:"

It would be a bonus also if I can exclude non-unicode data

I have tried the following bits of code

Using data.remove("None:") but I get the error list.remove(x): x not in list.

Reading the items I do not want into a set and then doing a comparison on the output but no luck.

Researching into list comprehensions, but wonder if I am looking at the right solution here.
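A note on the data.remove attempt, as a hedged sketch: assuming the "None" lines come from tweets that have no 'text' key, Tweets.get('text') stores the None object rather than a string, and print simply displays it as None. list.remove raises ValueError when no element equals its argument, which would explain the error for "None:". The sample data below is made up for illustration.

```python
# Hypothetical sample standing in for the real tweet texts.
data = ["hello world", None, "Photo: http://t.co/kxpaaaaa"]

try:
    data.remove("None:")   # no element equals the string "None:"
except ValueError as err:
    error_message = str(err)

# The unwanted entries are the None object, so test identity instead.
without_none = [x for x in data if x is not None]
print(without_none)
```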

I am from an Oracle background, where there are functions to chop out any wanted or unwanted section of output, so I have really gone round in circles on this for the last 2 hours. Any help greatly appreciated!

Solution

Try something like this:

    def legit(string):
        # Assumes every entry is a string; startswith() fails on a real None.
        if string.startswith("Photo:") or "None" in string:
            return False
        else:
            return True

    whatyouwant = [x for x in data if legit(x)]

I'm not sure if this will work out of the box for your data, but you get the idea. If you're not familiar, [x for x in data if legit(x)] is called a list comprehension: it builds a new list containing only the items for which legit(x) returns True.
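A None-safe variant of the same idea, as a minimal sketch rather than a definitive fix. It assumes the "None" lines come from tweets without a 'text' key, so the list holds the None object, on which startswith would fail. It also reads the "non-unicode" bonus as "drop text containing non-ASCII characters", which is only one possible interpretation. The names keep and is_ascii, the ascii_only flag and the sample data are hypothetical.

```python
def is_ascii(text):
    """Return True if every character is in the ASCII range."""
    return all(ord(ch) < 128 for ch in text)


def keep(text, ascii_only=False):
    """Decide whether one tweet text should stay in the output."""
    if text is None:                  # tweet had no 'text' key
        return False
    if text.startswith("Photo:"):     # photo posts
        return False
    if ascii_only and not is_ascii(text):
        return False
    return True


# Hypothetical sample data standing in for the 20 tweets.
data = [
    "Just landed in Dublin",
    None,
    "Photo: http://t.co/kxpaaaaa",
    "ma\u00f1ana ser\u00e1 otro d\u00eda",
]

filtered = [t for t in data if keep(t)]
ascii_filtered = [t for t in data if keep(t, ascii_only=True)]

for tweet in ascii_filtered:
    print(tweet)
```

Note that an ASCII check would not remove Spanish text written without accents, such as the example in the question. Filtering by language would need some form of language detection, which this sketch does not attempt.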