1 Introduction
The source dataset consists of four files:
The first of them is a compressed archive that unzips to a TSV file.
labeledTrainData - The labeled training set. The file is tab-delimited and has a header row followed by 25,000 rows containing an id, sentiment, and text for each review.
testData - The test set. The tab-delimited file has a header row followed by 25,000 rows containing an id and text for each review. Your task is to predict the sentiment for each one.
unlabeledTrainData - An extra training set with no labels. The tab-delimited file has a header row followed by 50,000 rows containing an id and text for each review.
sampleSubmission - A comma-delimited sample submission file in the correct format.
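To make the file layout above concrete, here is a minimal sketch of reading one of the tab-delimited files using only Python's standard library. The helper name `load_reviews` is made up for illustration, and UTF-8 encoding is an assumption. With pandas, a roughly equivalent call would be `pandas.read_csv(path, sep="\t")`.

```python
import csv
from pathlib import Path


def load_reviews(path):
    """Read a tab-delimited file with a header row into a list of dicts.

    `load_reviews` is a hypothetical helper, not part of the dataset or of
    any library. The dict keys are taken from the file's header row.
    """
    # Assumption: the file is UTF-8 encoded.
    with Path(path).open(newline="", encoding="utf-8") as f:
        # QUOTE_NONE keeps double quotes as literal characters, so quotes
        # inside review text cannot swallow tabs or line breaks.
        reader = csv.DictReader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(reader)
```

It could be called as `rows = load_reviews("labeledTrainData.tsv")`, where the file name is assumed from the description above; every value comes back as a string, so a numeric column such as sentiment needs converting before use.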
Data import: OSError: Initializing from file failed
Cause: the file path was copied straight from the address bar (Windows 10),