User Segmentation Analysis with K-means Clustering

努力再努力1

Category: Machine Learning. Tags: clustering, kmeans, data mining, python, machine learning

Article link: https://blog.csdn.net/m0_69435474/article/details/125259188

Data source: the Heywhale (和鲸) community.

1. A First Look at the Data

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import warnings
import seaborn as sns
import re
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler, MinMaxScaler
plt.style.use('ggplot')
warnings.filterwarnings('ignore')
plt.rcParams['font.sans-serif'] =['Microsoft YaHei']
plt.rcParams['axes.unicode_minus'] = False

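# Reading with the default utf-8 encoding fails; gbk reads the file successfully
data = pd.read_csv('超市数据.csv', sep=',', encoding='gbk')
# Encode gender as 1 for 'Male' and 0 otherwise
data['性别'] = np.where(data['性别'] == 'Male', 1, 0)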
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 5 columns):
 #   Column   Non-Null Count  Dtype
---  ------   --------------  -----
 0   用户id     200 non-null    int64
 1   性别       200 non-null    int32
 2   年龄       200 non-null    int64
 3   年收入(k$)  200 non-null    int64
 4   消费得分     200 non-null    int64
dtypes: int32(1), int64(4)
memory usage: 7.2 KB
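
The encoding issue noted in the loading step can be handled a little more defensively. The following is a minimal sketch, not part of the original post: the helper name `read_csv_with_fallback` and its `encodings` parameter are hypothetical, and it simply tries each encoding in turn until one decodes.

```python
import pandas as pd


def read_csv_with_fallback(path, encodings=("utf-8", "gbk")):
    """Return the DataFrame from the first encoding that decodes the file.

    Hypothetical helper for illustration; the original post hard-codes gbk.
    """
    last_error = None
    for enc in encodings:
        try:
            return pd.read_csv(path, sep=",", encoding=enc)
        except UnicodeDecodeError as err:
            # Remember the failure and move on to the next candidate encoding
            last_error = err
    raise last_error
```

With the file from this article, `read_csv_with_fallback('超市数据.csv')` would be expected to fall through to gbk, matching the behaviour described in the comment above.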

1.1 Correlation Analysis

sns.pairplot(data.iloc[:,2:])
plt.show()

[Figure: pair plot of 年龄 (age), 年收入(k$) (annual income), and 消费得分 (spending score)]
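
A pair plot shows the relationships visually; a numeric correlation matrix can complement it. The sketch below is an illustration only: it uses synthetic stand-in data (the values are made up, since the real CSV is not reproduced here) with the same column names, and computes Pearson correlations with `DataFrame.corr`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the supermarket data; values are invented for illustration
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "年龄": rng.integers(18, 70, size=200),
    "年收入(k$)": rng.integers(15, 140, size=200),
    "消费得分": rng.integers(1, 100, size=200),
})

# Pearson correlation between every pair of numeric columns
corr = demo.corr(method="pearson")
print(corr.round(2))
```

On the real data, `data.iloc[:, 2:].corr()` would give the corresponding matrix, and `sns.heatmap(corr, annot=True)` is one way to display it.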

1.2 Histograms of Age, Annual Income, and Spending Score

fig, ax =plt.subplots(1,3,constrained_layout=True, figsize=(12, 4))
axesSub = sns.distplot(data['年龄'], ax=ax[0])

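This article uses K-means clustering to segment a supermarket's customers on three features: age, annual income, and spending level. It first surveys the data (distributions, feature relationships in a bubble chart, histograms) and then segments the users on two features and on three features in turn. ...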
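
The two-feature segmentation step is not shown in the surviving text. The block below is a minimal sketch of one common way to do it with the classes imported at the top of the article (`StandardScaler`, `KMeans`, `silhouette_score`). Everything specific here is an assumption: the data are three synthetic blobs with invented values, the candidate range of k is 2 to 8, and the silhouette score is used to pick k, which may differ from the author's choice.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: three well-separated customer groups (invented values)
rng = np.random.default_rng(42)
centers = np.array([[25, 80], [60, 50], [100, 20]])
points = np.vstack([c + rng.normal(scale=5, size=(60, 2)) for c in centers])
demo = pd.DataFrame(points, columns=["年收入(k$)", "消费得分"])

# Standardize so both features contribute equally to Euclidean distances
X = StandardScaler().fit_transform(demo)

# Score each candidate number of clusters by its mean silhouette coefficient
scores = {}
for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Keep the k with the highest silhouette score and assign final segments
best_k = max(scores, key=scores.get)
final_model = KMeans(n_clusters=best_k, n_init=10, random_state=0)
demo["cluster"] = final_model.fit_predict(X)

print(best_k)
print(demo.groupby("cluster").mean().round(1))
```

A three-feature version would follow the same pattern with 年龄 added as a third column before scaling.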