Data source: a tabular dataset from http://archive.ics.uci.edu/ml/index.php (forest fires in Montesinho Natural Park, Portugal)
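Data overview:
X: x-axis spatial coordinate within the Montesinho park map, 1 to 9
Y: y-axis spatial coordinate within the Montesinho park map, 2 to 9
FFMC: Fine Fuel Moisture Code, tracking the moisture of fine surface fuels (the maximum value is 101, which corresponds to a moisture content of 0)
DMC: Duff Moisture Code, tracking the moisture of the surface fuel layer (the code is 0 when the fuel is saturated with moisture)
ISI: Initial Spread Index, a rating of how readily a fire is expected to spread
DC: Drought Code, tracking the moisture content of the forest floor layer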
Code:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans  # K-means clustering from scikit-learn

# Feature columns used for clustering and as the radar-chart axes
FEATURES = ['FFMC', 'DMC', 'DC', 'ISI', 'temp', 'RH', 'wind', 'area']


class FireData():
    def detectDate(self, filePath):
        '''
        Explore the data
        :param filePath: path to the raw CSV file
        :return:
        '''
        df = pd.read_csv(filePath)
        describe = df.describe(include='all')  # summary statistics for every column
        print(describe.T)
        # Recent pandas releases can no longer write legacy .xls files,
        # so .xlsx is used here (this needs the openpyxl package)
        df.to_excel('data/air_data.xlsx', index=False)

    def chooseData(self, filePath):
        '''
        Keep only the columns used for clustering
        :param filePath: output path for the selected columns
        :return:
        '''
        df = pd.read_excel('data/air_data.xlsx')
        df = df[FEATURES]
        df.to_excel(filePath, index=False)

    def standarData(self, filePath):
        '''
        Standard z-score normalization: (raw value - mean) / standard deviation
        :param filePath: output path for the standardized data
        :return:
        '''
        df = pd.read_excel('data/air_coredata.xlsx')
        # ddof=0 gives the population standard deviation, matching np.std
        df = (df - df.mean()) / df.std(ddof=0)
        df[FEATURES].to_excel(filePath, index=False)

    def classifyData(self, filePath, k=8):
        '''
        Cluster the standardized data and plot the cluster centers
        :param filePath: path to the standardized data
        :param k: number of clusters (at least 4, since four centers are plotted)
        :return:
        '''
        df = pd.read_excel(filePath)
        kmeans = KMeans(n_clusters=k)
        kmeans.fit(df[FEATURES])
        df['lable'] = kmeans.labels_
        coreData = np.array(kmeans.cluster_centers_)  # shape: (k, number of features)
        # Draw the radar chart: one angle per feature, not per cluster
        xdata = np.linspace(0, 2 * np.pi, len(FEATURES), endpoint=False)
        xdata = np.concatenate((xdata, [xdata[0]]))  # repeat the first angle to close the polygon
        print(xdata)
        print(coreData[0])
        fig = plt.figure()
        ax = fig.add_subplot(111, polar=True)
        # Only the first four cluster centers are plotted.
        # The original fourth style 'o--' set a circle marker rather than a color.
        colors = ['b', 'r', 'g', 'orange']
        for i, color in enumerate(colors):
            ydata = np.concatenate((coreData[i], [coreData[i][0]]))
            ax.plot(xdata, ydata, color=color, linestyle='--', linewidth=1,
                    label='customer%d' % (i + 1))
        # Label each axis once; the repeated closing angle is skipped
        ax.set_thetagrids(np.degrees(xdata[:-1]), FEATURES)
        ax.set_rlim(-3, 3)
        plt.legend(loc='best')
        plt.show()


if __name__ == '__main__':
    ad = FireData()
    # Run the three preparation steps once, in this order, before clustering
    #ad.detectDate('data/forestfires.csv')
    #ad.chooseData('data/air_coredata.xlsx')
    #ad.standarData('data/air_stdcoredata.xlsx')
    ad.classifyData('data/air_stdcoredata.xlsx', k=8)
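The standardization formula from the standarData docstring, (raw value - mean) / standard deviation, can be checked on a small table. This is only a sketch: the helper name zscore and the toy rows are made up for illustration and are not taken from the real dataset.

```python
import numpy as np


def zscore(columns):
    """Standardize each column: (value - column mean) / column std."""
    data = np.asarray(columns, dtype=float)
    mean = data.mean(axis=0)
    std = data.std(axis=0)  # population standard deviation (ddof=0), NumPy's default
    return (data - mean) / std


# Made-up rows of (temp, RH, wind), NOT real measurements
toy = [[8.2, 51.0, 6.7],
       [18.0, 33.0, 0.9],
       [14.6, 40.0, 4.0],
       [22.8, 27.0, 2.2]]

scaled = zscore(toy)
print(scaled.mean(axis=0))  # each column mean is now approximately 0
print(scaled.std(axis=0))   # each column standard deviation is now approximately 1
```

After this step every column is on the same scale, so a wide-ranging column such as DC does not dominate the distance computation in K-means. A column whose values are all identical has a standard deviation of 0 and would need special handling.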
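Radar chart:
Each "customer" line is one cluster center and can be read as a level of damage a fire may cause. The chart shows four such levels.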
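Each line on the radar chart needs one angle per feature, with the first point repeated at the end so the polygon closes. A minimal sketch of that bookkeeping follows; radar_polygon is a hypothetical helper name and the center values are invented.

```python
import numpy as np


def radar_polygon(values):
    """Return (angles, closed_values) for one line of a radar chart."""
    values = np.asarray(values, dtype=float)
    # One evenly spaced angle per feature, in radians
    angles = np.linspace(0, 2 * np.pi, len(values), endpoint=False)
    # Repeat the first point so the plotted line returns to its start
    angles = np.concatenate((angles, angles[:1]))
    closed = np.concatenate((values, values[:1]))
    return angles, closed


# Invented standardized cluster center with 8 features, for illustration only
center = [0.5, -0.2, 0.1, 0.9, -1.0, 0.3, 0.0, 1.4]
angles, closed = radar_polygon(center)
print(len(angles), len(closed))
```

The pair can be passed straight to ax.plot on a polar axis. Because the angle count follows the number of features, the helper does not depend on how many clusters were requested.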
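The data used in the code combines burned area, temperature, wind speed and several values from the Canadian Forest Fire Weather Index (FWI) system, which are then grouped by the K-means algorithm.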
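To see what K-means returns, here is a small sketch on synthetic data. The two groups below are randomly generated stand-ins, not fire records, and the parameter choices (n_init=10, random_state=0) are assumptions made to keep the run repeatable.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two well-separated synthetic groups with 3 features each (NOT real fire data)
low = rng.normal(loc=-2.0, scale=0.3, size=(20, 3))
high = rng.normal(loc=2.0, scale=0.3, size=(20, 3))
X = np.vstack((low, high))

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
centers = kmeans.cluster_centers_  # one row per cluster, one column per feature
labels = kmeans.labels_            # cluster index assigned to each input row

print(centers.shape)
print(np.bincount(labels))  # number of rows assigned to each cluster
```

Each row of cluster_centers_ is what the radar chart draws as one line, so reading a line means reading the average standardized feature values of that group.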
For details on FWI, see this Baidu Wenku article: https://wenku.baidu.com/view/73d92adb6037ee06eff9aef8941ea76e58fa4aa2.html
Please point out anything I got wrong or missed. Thanks!