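I'm new to Python. I'm writing a script that pulls some data from a website and plots it. However, my code fails with an error saying the data type is incorrect. Specifically, I have decimal values for "value" and dates for "year". I tried to redefine them, but I think I put the conversion in the wrong place. Any help would be much appreciated; the code is below.

import numpy as np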
import pandas as pd
import json
import matplotlib.pyplot as mp
from IPython.display import HTML
import getpass
import requests
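def frame(url, height=400, width=100):
    # Embed the given page in the notebook as an iframe
    display_string = '<iframe src="{url}" width="{w}" height="{h}"></iframe>'.format(url=url, w=width, h=height)
    return HTML(display_string)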
frame('https://data.bls.gov/registrationEngine/')
registration_key = getpass.getpass('Enter Registration Key: ')
series = 'MPU4900012'
frame('https://api.bls.gov/publicAPI/v1/timeseries/data/')
def capture_series(series, start, end, key=registration_key):
    url = 'https://api.bls.gov/publicAPI/v2/timeseries/data/'
    url += '?registrationkey={key}'.format(key=key)
    data = json.dumps({
        "seriesid": [series],
        "startyear": str(start),
        "endyear": str(end)
    })
    headers = {
        "Content-type": "application/json"
    }
    result = requests.post(url, data=data, headers=headers)
    return json.loads(result.text)
json_data = capture_series(series, 1987, 2016)
json_data
df_data = pd.DataFrame(json_data['Results']['series'][0]['data'])
print(df_data)
df_sub = df_data[['value', 'year']].astype(float).astype(int)
df_sub.set_index('year', inplace=True)
df_sub.sort_index(inplace=True)
df_sub
x = df_sub.index
y = df_sub['value']
mp.plot(x,y)
mp.title('Major Sector Multifactor Productivity')
mp.xlabel('years')
mp.ylabel('values')
mp.show()
When I run the code, the first thing I get is a table of the data from the site (the output of print(df_data)).
The error log shows this (using Jupyter with Python 3, for reference):
ValueError Traceback (most recent call last)
in ()
41 print(df_data)
42
---> 43 df_sub = df_data[['value', 'year']].astype(int)
44 df_sub.set_index('year', inplace=True)
45 df_sub.sort_index(inplace=True)
...
ValueError: invalid literal for int() with base 10: '86.244'
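
The traceback suggests that every field comes back from the API as a string, and that int() cannot parse a decimal string such as '86.244'. A minimal sketch of one way around this, assuming that is the case, is to convert each column to its own type rather than casting both to int. The sample rows below are hypothetical stand-ins for the real response (only 86.244 appears in the error above), and whether "value" should stay a float or be rounded is a choice left open here.

```python
import pandas as pd

# Hypothetical sample rows shaped like the API response: every field is a string.
df_data = pd.DataFrame({
    "year": ["2016", "2015", "2014"],
    "value": ["86.244", "85.5", "84.0"],
})

# Cast each column separately: decimals to float, years to int.
df_sub = df_data[["value", "year"]].astype({"value": float, "year": int})

# Index by year and sort chronologically, as in the original script.
df_sub = df_sub.set_index("year").sort_index()

print(df_sub)
```

With the columns typed this way, df_sub.index and df_sub['value'] can be passed to mp.plot as before, and the decimal part of each value is kept instead of being truncated.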