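I'm trying to learn the Python language and its concepts. I wrote some code to experiment with multithreading, but I noticed that there is no difference in execution time between the multithreaded and single-threaded versions.
The machine running the script has 4 cores/threads.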
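
import json
import time
from concurrent.futures import ThreadPoolExecutor

import nltk


def get_tokens(file_name, map):
    # Count the tokens of matching records from one file into the shared dict.
    print(file_name)
    counter = 0
    with open(file_name, 'r', encoding='utf-8-sig') as f:
        for line in f:
            # Each line holds one JSON object. The encoding keyword of
            # json.loads was removed in Python 3.9, so it is not passed here.
            item = json.loads(line)
            if 'spot' in item and item['sid'] == 4663:
                counter += 1
                if counter == 500:
                    break
                tokens = nltk.word_tokenize(item['spot'], language='english')
                for token in tokens:
                    if token not in map:
                        map[token] = 1
                    else:
                        map[token] = map[token] + 1


# FileProcessing (not shown here) is where get_tokens and
# get_files_in_directory live.
start_time = time.time()
map = dict()
with ThreadPoolExecutor(max_workers=3) as executor:
    for file in FileProcessing.get_files_in_directory('D:\\Raw Data'):
        future = executor.submit(FileProcessing.get_tokens, file, map)
end_time = time.time()
print("Elapsed time was %g seconds" % (end_time - start_time))
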
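Every file in Raw Data is larger than 25 MB, so I expected the two versions to differ in run time, but they don't. Why? Have I made a mistake in the code or in my understanding of multithreading?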
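
To make the comparison reproducible without my data files, here is a minimal sketch of how the sequential and threaded runs can be timed side by side. Everything in it is an illustrative assumption rather than my real code: `count_tokens`, `run_sequential`, `run_threaded`, `timed` and the sample texts are made-up names and data, and `str.split` stands in for `nltk.word_tokenize` so that it runs without NLTK. Each task returns its own `Counter`, and the results are merged in the main thread instead of updating one shared dict.

```python
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_tokens(text):
    """Hypothetical stand-in for get_tokens: count whitespace-separated tokens."""
    return Counter(text.split())


def run_sequential(texts):
    """Process every text one after another in the calling thread."""
    total = Counter()
    for text in texts:
        total.update(count_tokens(text))
    return total


def run_threaded(texts, max_workers=3):
    """Process the texts in a thread pool and merge the partial counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in input order; merging is done
        # only here, in the main thread.
        for partial in executor.map(count_tokens, texts):
            total.update(partial)
    return total


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Made-up sample input; the real files are not reproduced here.
    texts = ["the quick brown fox jumps over the lazy dog " * 20000] * 8
    seq_counts, seq_time = timed(run_sequential, texts)
    thr_counts, thr_time = timed(run_threaded, texts)
    assert seq_counts == thr_counts
    print("sequential: %g seconds" % seq_time)
    print("threaded:   %g seconds" % thr_time)
```

Both variants should produce identical counts; only the two timings are of interest, and they will vary from machine to machine.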