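For fields that come up often in data analysis, such as age or purchase amount, we usually need to split the values into bins that match the business requirements. Both Python's pandas and R provide a cut function for this. The examples below show how to specify custom cut points: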
pandas
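import pandas as pd  # df is assumed to be an existing DataFrame with an "Age" column
df["Age"].head()  # preview the raw ages
# Bins are right-closed by default ((-inf, 18], (18, 24], ...), so age 18 is labelled '<18'
pd.cut(df['Age'], [-float("inf"), 18, 24, 34, 44, 54, 64, float("inf")], labels=['<18', '18-24', '25-34', '35-44', '45-54', '55-64', '65+'])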
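Because pd.cut closes each bin on the right by default, an age of exactly 18 ends up in '<18' even though the label '18-24' suggests it belongs there. If the labels are meant literally, one option is to pass right=False and shift the edges so each bin is closed on the left. The following is only a minimal sketch: the sample DataFrame and the AgeGroup column name are made up for illustration, since the original df is not shown.

```python
import pandas as pd

# Hypothetical sample data standing in for the original df.
df = pd.DataFrame({"Age": [5, 17, 18, 24, 25, 34, 35, 64, 65, 80]})

labels = ['<18', '18-24', '25-34', '35-44', '45-54', '55-64', '65+']

# Default behaviour (right=True): bins are (-inf, 18], (18, 24], ...
default_groups = pd.cut(
    df["Age"],
    [-float("inf"), 18, 24, 34, 44, 54, 64, float("inf")],
    labels=labels,
)

# Left-closed alternative (right=False): bins are [-inf, 18), [18, 25), ...
# The edges are shifted so that each label matches its interval.
df["AgeGroup"] = pd.cut(
    df["Age"],
    [-float("inf"), 18, 25, 35, 45, 55, 65, float("inf")],
    labels=labels,
    right=False,
)

# Compare the two assignments side by side.
print(pd.DataFrame({"Age": df["Age"], "default": default_groups, "left_closed": df["AgeGroup"]}))
```

With whole-number ages the two versions differ only at age 18. With fractional ages such as 24.5, the left-closed bins still place the value under '18-24', while the default bins move it to '25-34'.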
R
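# Default right = TRUE gives right-closed intervals such as (18,24]; levels are named after the intervals
data1$agecat <- cut(data1$Age, c(-Inf, 0, 18, 24, 34, 44, 54, 64, Inf))
summary(data1)  # agecat is a factor, so the summary shows the count in each bin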