Loading the data
In [1]:
import pandas as pd

In [2]:
df = pd.read_csv("/home/mw/input/data9408/shopping_trends.csv")
df.head()

Out[2]:
   Customer ID  Age  Gender  Item Purchased  Category  Purchase Amount (USD)  Location       Size  Color      Season  Review Rating  Subscription Status  Shipping Type  Discount Applied  Promo Code Used  Previous Purchases  Payment Method  Frequency of Purchases
0  1            55   Male    Blouse          Clothing  53                     Kentucky       L     Gray       Winter  3.1            Yes                  Express        Yes               Yes              14                  Venmo           Fortnightly
1  2            19   Male    Sweater         Clothing  64                     Maine          L     Maroon     Winter  3.1            Yes                  Express        Yes               Yes              2                   Cash            Fortnightly
2  3            50   Male    Jeans           Clothing  73                     Massachusetts  S     Maroon     Spring  3.1            Yes                  Free Shipping  Yes               Yes              23                  Credit Card     Weekly
3  4            21   Male    Sandals         Footwear  90                     Rhode Island   M     Maroon     Spring  3.5            Yes                  Next Day Air   Yes               Yes              49                  PayPal          Weekly
4  5            45   Male    Blouse          Clothing  49                     Oregon         M     Turquoise  Spring  2.7            Yes                  Free Shipping  Yes               Yes              31                  PayPal          Annually

Data preprocessing
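Before rows are dropped, it can be useful to count how many missing values and duplicate rows a table actually contains. The following is a minimal sketch of such a check. The `sample` DataFrame is a hypothetical stand-in for `shopping_trends.csv`, with made-up values chosen so that each check finds something; it is not taken from the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the real table: one missing Age value and
# one exact duplicate row, so both checks below have something to report.
sample = pd.DataFrame({
    "Customer ID": [1, 2, 2, 3],
    "Age": [55, 19, 19, None],
    "Gender": ["Male", "Male", "Male", "Male"],
})

# Number of missing values in each column
missing_per_column = sample.isna().sum()

# Number of rows that exactly repeat an earlier row
duplicate_rows = int(sample.duplicated(keep="first").sum())

print(missing_per_column)
print("duplicate rows:", duplicate_rows)
```

If both counts are zero on the real data, `dropna` and `drop_duplicates` leave the table unchanged, which is worth knowing when interpreting later row counts.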
In [3]:
# Drop rows that contain missing values
df.dropna(axis=0, inplace=True)
# Drop duplicate rows, keeping the first occurrence
df.drop_duplicates(keep="first", inplace=True)
# Drop the "Customer ID" column
df.drop("Customer ID", axis=1, inplace=True)
# Rename the columns to Chinese
df.rename(columns={
    "Age": "年龄",
    "Gender": "性别",
    "Item Purchased": "购买的商品",
    "Category": "商品类别",
    "Purchase Amount (USD)": "消费金额(美元)",
    "Location": "购买地点",
    "Size": "商品尺码",
    "Color": "商品颜色",
    "Season": "购买商品的季节",
    "Review Rating": "客户评分",
    "Subscription Status": "是否订阅",
    "Shipping Type": "配送方式",
    "Discount Applied": "是否折扣",
    "Promo Code Used": "是否使用优惠码",
    "Previous Purchases": "客户历史购买总数(不包括当前交易)",
    "Payment Method": "支付方式",
    "Frequency of Purchases": "客户购买频率",
}, inplace=True)
df.head()

Out[3]:
   年龄  性别  购买的商品  商品类别  消费金额(美元)  购买地点       商品尺码  商品颜色  购买商品的季节  客户评分  是否订阅  配送方式       是否折扣  是否使用优惠码  客户历史购买总数(不包括当前交易)  支付方式     客户购买频率
0  55    Male  Blouse      Clothing  53              Kentucky       L         Gray      Winter          3.1       Yes       Express        Yes       Yes             14                                Venmo        Fortnightly
1  19    Male  Sweater     Clothing  64              Maine          L         Maroon    Winter          3.1       Yes       Express        Yes       Yes             2                                 Cash         Fortnightly
2  50    Male  Jeans       Clothing  73              Massachusetts  S         Maroon    Spring          3.1       Yes       Free Shipping  Yes       Yes             23                                Credit Card  Weekly
3  21    Male  Sandals     Footwear  90              Rhode Island   M         Maroon    Spring          3.5       Yes       N
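The preprocessing cell modifies `df` in place. The same four steps can also be written as one method chain that returns a new DataFrame and leaves the input untouched. This is an alternative style, not the notebook's own approach. The `raw` table below is a hypothetical miniature with illustrative values, and only two of the column renames are shown.

```python
import pandas as pd

# Hypothetical miniature of the raw table (values are illustrative only)
raw = pd.DataFrame({
    "Customer ID": [1, 2, 2, 3],
    "Age": [55, 19, 19, None],
    "Gender": ["Male", "Male", "Male", "Male"],
})

cleaned = (
    raw.dropna(axis=0)                    # drop rows with missing values
       .drop_duplicates(keep="first")     # drop repeated rows, keep the first
       .drop(columns="Customer ID")       # drop the identifier column
       .rename(columns={"Age": "年龄", "Gender": "性别"})  # subset of the renames
)

print(cleaned)
print(raw.shape)  # the input table is left unchanged
```

Because nothing is mutated, the cell can be rerun safely. With `inplace=True`, rerunning the original cell would fail at `df.drop("Customer ID", ...)` because that column has already been removed.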
Customer Spending Preferences: A Hands-On Data Analysis
First published on 2024-01-11 09:45:46