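Problem: reading an ARFF file with loadarff from the scipy package fails with ValueError: {60 1 value not in ('0', '1')
The file is one of the datasets provided on the MULAN website (http://mulan.sourceforge.net/datasets-mlc.html).
Solution: read the file with load_from_arff from the skmultilearn package instead.
See https://stackoverflow.com/questions/49828401/how-to-read-sparse-arff-data-using-python-libraries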
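The error message hints at the cause. The data rows in this file use the sparse ARFF layout, {index value, index value, ...}, where attributes that are not listed are implicitly 0, and the scipy documentation states that loadarff cannot read sparse data. The fragment "{60 1" is therefore most likely treated as the value of the first nominal attribute, which only allows '0' or '1'. As a rough illustration of what a sparse row encodes, here is a minimal sketch. The helper parse_sparse_row is hypothetical (it is not part of scipy or scikit-multilearn), and it assumes unquoted integer values and 0-based attribute indices.

```python
def parse_sparse_row(line, n_attributes, default=0):
    """Expand one sparse ARFF data row into a dense list.

    Hypothetical helper for illustration only. Assumes unquoted integer
    values and 0-based attribute indices.
    """
    body = line.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("not a sparse ARFF row: %r" % line)

    dense = [default] * n_attributes
    inner = body[1:-1].strip()
    if not inner:
        # "{}" means every attribute keeps the default value
        return dense

    for pair in inner.split(","):
        # each entry is "<attribute index> <value>"
        index, value = pair.split(None, 1)
        dense[int(index)] = int(value)
    return dense


row = parse_sparse_row("{1 1, 3 1}", n_attributes=5)
print(row)  # [0, 1, 0, 1, 0]
```

A dense-only reader sees the same line as a single comma-separated record, which is why the first field ends up being the string "{1 1" rather than a valid attribute value.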
from skmultilearn.dataset import load_from_arff

X, y = load_from_arff(
    "/home/user/data/delicious-train.arff",
    # number of labels
    label_count=983,
    # MULAN format: labels are at the end of each row in the ARFF data
    # (newer scikit-multilearn releases appear to name this label_location='end')
    endian='little',
    # bag of words
    input_feature_type='int',
    encode_nominal=False,
    # the sparse ARFF loader sometimes fails (as with delicious), so the
    # liac-arff sparse loader is disabled; scikit-multilearn converts the
    # loaded data to a sparse representation anyway
    load_sparse=False,
    # whether to also return the attribute definitions; usually not needed
    return_attribute_definitions=False,
)