Algorithm practice and analysis
Here is the code up front; a detailed analysis will follow later.
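As a warm-up for the listing, here is a minimal sketch of the reshaping that `attr_trans` performs: each group of rows sharing one `COLLECTTIME` is collapsed into a single row with one column per disk. The toy DataFrame and the `ENTITY` column are made up for illustration, and I am assuming that each group holds exactly two rows, C: first and D: second, as the function itself does.

```python
import pandas as pd


def attr_trans(x):
    # Collapse one group (all rows sharing a COLLECTTIME) into a single row.
    # Assumes row 0 is the C: drive and row 1 is the D: drive.
    return pd.Series({
        "SYS_NAME": x["SYS_NAME"].iloc[0],
        "CWXT_DB:184:C:\\": x["VALUE"].iloc[0],
        "CWXT_DB:184:D:\\": x["VALUE"].iloc[1],
        "COLLECTTIME": x["COLLECTTIME"].iloc[0],
    })


# Hypothetical toy data, not the real data set: two days, two disks per day.
raw = pd.DataFrame({
    "SYS_NAME": ["demo_system"] * 4,
    "ENTITY": ["C:\\", "D:\\", "C:\\", "D:\\"],
    "VALUE": [100.0, 200.0, 110.0, 210.0],
    "COLLECTTIME": ["2014-10-01", "2014-10-01", "2014-10-02", "2014-10-02"],
})

# Iterate over the groups explicitly so every group still carries COLLECTTIME.
wide = pd.DataFrame(
    [attr_trans(group) for _, group in raw.groupby("COLLECTTIME")]
)
print(wide)
```

The explicit loop over `groupby` is my own choice for the sketch. It sidesteps version differences in how `groupby(...).apply` treats the grouping column, and it says nothing about how the listing calls `attr_trans`.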
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# pylint: disable=E1101
"""
Created on Sat Nov 4 11:04:32 2017
@author: lu
"""
import pandas as pd
from statsmodels.stats.diagnostic import acorr_ljungbox
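# Note: statsmodels.tsa.arima_model was removed in statsmodels 0.13; newer
# versions provide ARIMA in statsmodels.tsa.arima.model, with a different API.
from statsmodels.tsa.arima_model import ARIMA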
from statsmodels.tsa.stattools import adfuller as ADF
"""
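FutureWarning: cause unknown. It disappeared on the second run in Spyder 3, presumably because of caching.
attr_trans-->attribute transformation
programmer_1-->data filtering
programmer_2-->stationarity test
programmer_3-->white-noise test
programmer_4-->determine the best p, d, q values (has problems!)
programmer_5-->model validation
programmer_6-->compute the forecast error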
"""
PATH = "D:/SoftwareData/Dropbox/MachineLearning/10 kaggleSpareribs/LoadAnalysis/chapter11/"
# Attribute transformation
def attr_trans(x):
    """==================== Attribute transformation ====================
    :param x: one group of the grouped data, a DataFrame (2-D)
    :return: a Series with the specified index (1-D)
    """
    # Define the new column names; dtype=object because the values mix strings and numbers
    result = pd.Series(index=[
        "SYS_NAME", "CWXT_DB:184:C:\\", "CWXT_DB:184:D:\\", "COLLECTTIME"
    ], dtype=object)
    result["SYS_NAME"] = x["SYS_NAME"].iloc[0]  # row 0 of the "SYS_NAME" column
    result["COLLECTTIME"] = x["COLLECTTIME"].iloc[0]  # row 0 of the "COLLECTTIME" column
    result["CWXT_DB:184:C:\\"] = x["VALUE"].iloc[0]  # row 0 of the "VALUE" column
    result["CWXT_DB:184:D:\\"] = x["VALUE"].iloc[1]  # row 1 of the "VALUE" column
    return result
def programmer_1():
    """==================== Data filtering ====================
    :return:
    """