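First, import the packages, load the data, and do some preliminary processing. This post uses macro-level consumption data and splits it into two classes by CPI value, with 100 as the threshold.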
##### Imports
import sklearn
import mglearn
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
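####### Data processing
df = pd.read_excel(r'C:\Users\14852\Desktop\consumption data.xls')
for i in range(0, 29, 1):  # rows 0 through 28
    if df.loc[i, 'cpi'] <= 100:  # recode cpi into two classes (could also be stored as a new column)
        df.loc[i, 'cpi'] = 1
    elif df.loc[i, 'cpi'] > 100:
        df.loc[i, 'cpi'] = 0
    else:
        print(1)  # reached only when cpi is missing (NaN)
y = np.array(df['cpi'])  # class labels
X = np.array(df.iloc[:, 0:2])  # first two columns as features
df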
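The row-by-row recoding above can also be written as a single vectorized step. The following is only a minimal sketch under stated assumptions: the small DataFrame is a made-up stand-in for the Excel sheet (the `income` and `consumption` column names are hypothetical, since the post does not list the real columns), and `binarize_cpi` is a helper name introduced here for illustration. It writes the label to a new column instead of overwriting `cpi`, and leaves missing values as NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel sheet; the real columns are not shown in the post.
df = pd.DataFrame({
    "income": [10.0, 12.5, 11.0, 14.0],
    "consumption": [8.0, 9.5, 9.0, 10.5],
    "cpi": [98.7, 100.0, 103.2, np.nan],
})

def binarize_cpi(cpi, threshold=100):
    """Return 1 where cpi <= threshold, 0 where cpi > threshold, NaN where cpi is missing."""
    # np.where maps NaN to 0.0 here because NaN <= threshold is False ...
    labels = np.where(cpi <= threshold, 1.0, 0.0)
    # ... so mask those positions back to NaN afterwards.
    return pd.Series(labels, index=cpi.index).where(cpi.notna())

# Keep the original cpi values and store the class in a new column.
df["cpi_class"] = binarize_cpi(df["cpi"])
print(df)
```

Keeping the label in a separate column means the raw CPI values remain available for later inspection, and the result does not depend on the number of rows in the sheet.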
###### Clustering process
from sklearn.neighbors import KNeighborsC