I want to convert a dataset stored in a .dat file to a CSV file. The data format is as follows.
Each row begins with the sentiment score followed by the text associated with that rating.
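I would like one column for the sentiment value (-1 or 1) and a review column holding the review text that goes with that sentiment value.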
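To make the target layout concrete, here is a minimal sketch with pandas. The two sample lines are invented for illustration (the real file contents are not reproduced here), the helper name `lines_to_frame` is hypothetical, and the sketch assumes the score and the text are separated by whitespace.

```python
import pandas as pd

# Hypothetical sample lines; the real train.dat contents may differ.
sample_lines = [
    "1 great movie, would watch again",
    "-1 boring and far too long",
]

def lines_to_frame(lines):
    """Split each line once, at the first run of whitespace."""
    rows = []
    for line in lines:
        parts = line.strip().split(None, 1)
        if not parts:
            continue  # skip blank lines
        score = int(parts[0])  # assumes the score is an integer such as -1 or 1
        text = parts[1] if len(parts) > 1 else ""
        rows.append((score, text))
    return pd.DataFrame(rows, columns=["Sentiments", "Review"])

frame = lines_to_frame(sample_lines)
print(frame)
```

Each row of `frame` then carries the score in `Sentiments` and the whole review, commas and spaces included, in `Review`.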
What I have tried so far:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import csv
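# read train.dat into a list of lists (one list of whitespace-separated tokens per line)
with open("train.dat") as f:
    datContent = [line.strip().split() for line in f]
# write it as a new CSV file (text mode with newline="" is what the csv module expects on Python 3)
with open("train.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(datContent)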
def your_func(row):
    return row['Sentiments'] / row['Review']
columns_to_keep = ['Sentiments', 'Review']
dataframe = pd.read_csv("train.csv", usecols=columns_to_keep)
dataframe['new_column'] = dataframe.apply(your_func, axis=1)
print(dataframe)
In the resulting train.csv, every word in the review is followed by a comma.
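One likely cause, judging from the code above: `split()` with no arguments breaks a line at every run of whitespace, so each word of the review becomes its own CSV field. Below is a minimal sketch of one way around that, not a definitive fix. It assumes Python 3, UTF-8 text, and whitespace between the score and the review; the helper name `dat_to_csv` and the sample line are made up for illustration.

```python
import csv

def dat_to_csv(dat_path, csv_path):
    """Copy a 'score text' .dat file into a two-column CSV file."""
    with open(dat_path, encoding="utf-8") as src, \
            open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(["Sentiments", "Review"])  # header row so columns can be selected by name
        for line in src:
            parts = line.strip().split(None, 1)  # maxsplit=1: split only at the first whitespace run
            if not parts:
                continue  # skip blank lines
            review = parts[1] if len(parts) > 1 else ""
            writer.writerow([parts[0], review])

# Why the original output had a comma after every word (hypothetical sample line):
line = "-1 not worth the time"
print(line.split())         # ['-1', 'not', 'worth', 'the', 'time']
print(line.split(None, 1))  # ['-1', 'not worth the time']
```

Calling `dat_to_csv("train.dat", "train.csv")` should then produce a file whose header matches the `usecols` list passed to `pd.read_csv`. The csv writer quotes any review that itself contains a comma, so such reviews stay in a single field.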