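1. unique(): returns the unique values of each feature (column)
Example:
for col in data.columns:
    # Series.unique() returns the distinct values in order of first appearance
    print('{} unique element : {}'.format(col, data[col].unique()))
Output: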
User_ID unique element : [1000001 1000002 1000003 ... 1004113 1005391 1001529]
Product_ID unique element : ['P00069042' 'P00248942' 'P00087842' ... 'P00038842' 'P00295642'
'P00091742']
Gender unique element : ['F' 'M']
Age unique element : ['0-17' '55+' '26-35' '46-50' '51-55' '36-45' '18-25']
Occupation unique element : [10 16 15 7 20 9 1 12 17 0 3 4 11 8 19 2 18 5 14 13 6]
City_Category unique element : ['A' 'C' 'B']
Stay_In_Current_City_Years unique element : ['2' '4+' '3' '1' '0']
Marital_Status unique element : [0 1]
Product_Category_1 unique element : [ 3 1 12 8 5 4 2 6 14 11 13 15 7 16 18 10 17 9]
Product_Category_2 unique element : [ 0. 6. 14. 2. 8. 15. 16. 11. 5. 3. 4. 12. 9. 10. 17. 13. 7. 18.]
Product_Category_3 unique element : [ 0. 14. 17. 5. 4. 16. 15. 8. 9. 13. 6. 12. 3. 18. 11. 10.]
Purchase unique element : [ 8370 15200 1422 ... 14539 11120 18426]
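The same loop can be tried on a small table. This is a minimal sketch, assuming a hypothetical toy DataFrame (`toy`, with two made-up columns) in place of the real `data`. It shows that unique() keeps values in order of first appearance rather than sorting them.

```python
import pandas as pd

# Hypothetical toy data standing in for the real dataset (assumption)
toy = pd.DataFrame({
    "Gender": ["F", "M", "F", "M"],
    "City_Category": ["A", "C", "C", "B"],
})

# Collect the unique values of every column, in order of first appearance
unique_values = {col: toy[col].unique().tolist() for col in toy.columns}

for col, values in unique_values.items():
    print('{} unique element : {}'.format(col, values))
```

If sorted values are preferred, wrapping the result in sorted(...) is one option, provided the column holds comparable values.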
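2. nunique(): returns the number of distinct values in each feature (how many unique values there are, not how often each value occurs)
Example:
for col in data.columns:
    # Series.nunique() counts distinct values; NaN is excluded by default
    print('{} unique element: {}'.format(col, data[col].nunique()))
Output: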
User_ID unique element: 5891
Product_ID unique element: 3623
Gender unique element: 2
Age unique element: 7
Occupation unique element: 21
City_Category unique element: 3
Stay_In_Current_City_Years unique element: 5
Marital_Status unique element: 2
Product_Category_1 unique element: 18
Product_Category_2 unique element: 18
Product_Category_3 unique element: 16
Purchase unique element: 17959
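Since nunique() only reports how many distinct values exist, counting how often each value occurs needs a different call. A minimal sketch, assuming a hypothetical toy column (`toy["City_Category"]`) rather than the real data, contrasting nunique() with value_counts():

```python
import pandas as pd

# Hypothetical toy data standing in for the real dataset (assumption)
toy = pd.DataFrame({"City_Category": ["A", "C", "C", "B", "C"]})

# Number of distinct values in the column
n_distinct = toy["City_Category"].nunique()

# Number of occurrences of each distinct value
counts = toy["City_Category"].value_counts()

print(n_distinct)
print(counts.to_dict())
```

Calling nunique() on the whole DataFrame (for example data.nunique()) returns the per-column counts as a Series in one step, which may be more convenient than the loop.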