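I am building functions to help me load data from the web. The problem I am trying to solve is that column names differ depending on the source. For example, Yahoo Finance column headings look like Open, High, Low, Close, Volume, Adj Close. Quandl.com has datasets with DATE, VALUE, date, value, etc. The all-caps variants throw everything off, and VALUE and Adj. Close mean the same thing for the most part. I want to associate columns that have different names but the same meaning with a single value. For example, Adj. Close and VALUE both = AC; Open, OPEN, and open all = O.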
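The goal can be pictured as a lookup keyed on a normalized form of the header, so that differences in case and punctuation collapse before the lookup happens. This is only a minimal sketch: the `ALIASES` table and the helpers `normalize` and `short_code` are hypothetical names for illustration and are not part of the code further down.

```python
import re

# Hypothetical alias table: normalized header -> short code.
ALIASES = {
    "date": "D",
    "open": "O",
    "high": "H",
    "low": "L",
    "close": "C",
    "volume": "V",
    "adjclose": "AC",
    "value": "AC",
}


def normalize(name):
    """Lower-case a header and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def short_code(name):
    """Return the short code for a header, or None if it is unknown."""
    return ALIASES.get(normalize(name))


print([short_code(n) for n in ["Open", "OPEN", "open", "Adj. Close", "VALUE"]])
```

With this approach the mapping file would only need one entry per meaning rather than one per spelling.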
So I have a CSV file ("Functions//ColumnNameChanges.txt") that stores the dict() keys and values for the column names:

    Date,D
    Open,O
    High,H
Then I wrote this function to populate my dictionary:

    def DictKeyValuesFromText ():
        Dictionary = {}
        TextFileName = "Functions//ColumnNameChanges.txt"
        with open(TextFileName,'r') as f:
            for line in f:
                x = line.find(",")
                y = line.find("/")
                k = line[0:x]
                v = line[x+1:y]
                Dictionary[k] = v
        return Dictionary
Here is the output of print(DictKeyValuesFromText()):

    {'': '', 'Date': 'D', 'High': 'H', 'Open': 'O'}
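The empty `''` key in that output most likely comes from a blank line in the file: `str.find` returns -1 when the character is absent, so on a line with no comma both slices come back empty. Below is a sketch of the same loader built on the standard `csv` module, which skips such lines. The names `parse_column_map` and `load_column_map` are hypothetical, and the two-column `name,code` layout is assumed from the sample file.

```python
import csv


def parse_column_map(lines):
    """Turn 'name,code' lines into a dict, ignoring blank or malformed lines."""
    mapping = {}
    for row in csv.reader(lines):
        if len(row) < 2 or not row[0].strip():
            continue  # blank line, or a line with no code after the comma
        mapping[row[0].strip()] = row[1].strip()
    return mapping


def load_column_map(path):
    """Read the mapping file at `path` and return it as a dict."""
    with open(path, newline="") as f:
        return parse_column_map(f)


print(parse_column_map(["Date,D\n", "Open,O\n", "\n", "High,H\n"]))
```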
The next function is where my problem is:

    import pandas as pd

    def ChangeColumnNames(DataFrameFileLocation):
        x = DictKeyValuesFromText()
        df = pd.read_csv(DataFrameFileLocation)
        for y in df.columns:
            if y not in x.keys():
                i = input("The column " + y + " is not in the list, give a name:")
                df.rename(columns={y:i})
            else:
                df.rename(columns={y:x[y]})
        return df
df.rename is not working. This is the output I get from print(ChangeColumnNames("Tvix_data.csv")):

    The column Low is not in the list, give a name:L
    The column Close is not in the list, give a name:C
    The column Volume is not in the list, give a name:V
    The column Adj Close is not in the list, give a name:AC
             Date        Open        High         Low       Close  Volume  \
    0  2010-11-30  106.269997  112.349997  104.389997  112.349997       0
    1  2010-12-01   99.979997  100.689997   98.799998  100.689997       0
    2  2010-12-02   98.309998   98.309998   86.499998   86.589998       0
The column names should be D, O, H, L, C, V. I am missing something and would appreciate some help.
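For reference, `DataFrame.rename` returns a new DataFrame by default and leaves the original untouched, so its result has to be assigned back (or `inplace=True` passed) for the new names to stick. One possible restructuring, sketched under that assumption, is to collect every rename in a plain dict first and apply it in a single call. `build_rename_map` and its `ask` parameter are hypothetical names, not part of the code above.

```python
def build_rename_map(columns, known, ask=input):
    """Map each column to its short code, prompting for names not in `known`."""
    renames = {}
    for col in columns:
        if col in known:
            renames[col] = known[col]
        else:
            renames[col] = ask("The column " + col + " is not in the list, give a name:")
    return renames


# Applied to a DataFrame, the returned frame must be kept, for example:
#     df = df.rename(columns=build_rename_map(df.columns, DictKeyValuesFromText()))
```

Passing the prompt function in as `ask` keeps the mapping logic separate from the interactive input, which also makes it easy to try without a terminal.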