I ran into two problems when using multiprocessing: (1) memory blew up, and (2) variables could not be found.
The causes are as follows:
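1. The code that calls multiprocessing is not guarded, so when a new process loads the module it runs that code again, creates another multiprocessing pool, and the cycle repeats without end. This happens when child processes are started with the spawn method, which is the default on Windows.
The fix is to always put the code that actually does the work inside a guarded block: if __name__ == '__main__':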
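2. multiprocessing behaves slightly differently on Windows than on Linux. On Linux it is based on fork: the child gets a copy of the parent's memory at the moment of the fork, so any global variable that exists at that point can be used. On Windows, a child is created by starting a fresh process that imports the module again, so all global variables are re-initialized; globals that were created or modified dynamically at runtime are not available in the child.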
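Both points can be reproduced with a small sketch. The names counter, read_counter and run are made up for illustration and do not come from the original code. The sketch assumes it is saved to a file and run as a script. Under spawn the worker should see the import-time value of counter, while under fork (where available) it should see the value assigned at runtime.

```python
import multiprocessing as mp

# Module-level default. A spawned child re-imports this module,
# so it starts again from this value.
counter = 0


def read_counter(_):
    # Runs in the worker and reports the global as the worker sees it.
    return counter


def run(method):
    global counter
    counter = 42  # modified at runtime, in the parent process only
    ctx = mp.get_context(method)
    with ctx.Pool(processes=1) as pool:
        return pool.map(read_counter, [None])[0]


if __name__ == "__main__":
    # Without this guard, every spawned child would execute the lines
    # below again while importing the module.
    print("spawn:", run("spawn"))
    if "fork" in mp.get_all_start_methods():
        print("fork:", run("fork"))
```

On Linux, calling mp.set_start_method("spawn") once inside the guarded block is a convenient way to check that code will also behave on Windows.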
Original code:
global n_users, n_items
n_items = n_params['n_items']
n_users = n_params['n_users']
global train_user_set, test_user_set
train_user_set = user_dict['train_user_set']
test_user_set = user_dict['test_user_set']
batch_result = pool.map(test_one_user, user_batch_rating_uid)
def test_one_user(x):
    # user u's ratings for user u
    rating = x[0]
    # uid
    u = x[1]
    # user u's items in the training set
    try:
        training_items = train_user_set[u]
    except Exception:
        training_items = []
    # user u's items in the test set
    user_pos_test = test_user_set[u]
    all_items = set(range(0, n_items))
    test_items = list(all_items - set(training_items))
    if args.test_flag == 'part':
        r, auc = ranklist_by_heapq(user_pos_test, test_items, rating, Ks)
    else:
        r, auc = ranklist_by_sorted(user_pos_test, test_items, rating, Ks)
    return get_performance(user_pos_test, r, auc, Ks)
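After hitting the error, I stopped using global variables and passed the values into the function as arguments instead: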
#global n_users, n_items
n_items = n_params['n_items']
n_users = n_params['n_users']
#global train_user_set, test_user_set
train_user_set = user_dict['train_user_set']
test_user_set = user_dict['test_user_set']
batch_result = pool.starmap(test_one_user, [(x, train_user_set, test_user_set, n_users, n_items) for x in user_batch_rating_uid])
# add the extra parameters to the function
def test_one_user(x, train_user_set, test_user_set, n_users, n_items):
...
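One thing to keep in mind with the starmap version is that the dictionaries are pickled and sent along with the tasks, which can be costly when they are large. An alternative (not what the original code does) is to hand the shared data to each worker once through the initializer and initargs parameters of Pool. The sketch below uses toy data, and count_candidates is a hypothetical stand-in for test_one_user.

```python
import multiprocessing as mp

# Per-process storage, filled in by init_worker inside every worker.
_shared = {}


def init_worker(train_user_set, test_user_set, n_items):
    # Called once in each worker process when the pool starts.
    _shared["train_user_set"] = train_user_set
    _shared["test_user_set"] = test_user_set
    _shared["n_items"] = n_items


def count_candidates(u):
    # Hypothetical stand-in for test_one_user:
    # how many items user u did not see in training.
    training_items = _shared["train_user_set"].get(u, [])
    return _shared["n_items"] - len(set(training_items))


def main():
    train = {0: [1, 2], 1: [3]}  # toy data, for illustration only
    test = {0: [4], 1: [0]}
    with mp.Pool(processes=2, initializer=init_worker,
                 initargs=(train, test, 5)) as pool:
        return pool.map(count_candidates, [0, 1, 2])


if __name__ == "__main__":
    print(main())
```

Because the initializer runs inside each worker, this pattern does not depend on the worker inheriting globals from the parent, so it should behave the same under fork and spawn.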