Hadoop Project in Practice 3: Recruitment Data Preprocessing
1. Import libraries
import pandas as pd
import numpy as np
from pyecharts.charts import *
from pyecharts import options as opts
2. Inspect the data
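The original does not show how `boss` is loaded or inspected. Below is a minimal sketch, assuming the postings are stored as CSV. The in-memory sample and its rows are made up for illustration and stand in for the real file; only the column names `post`, `company`, `skill` and `city` come from the later steps.

```python
import io
import pandas as pd

# Hypothetical sample standing in for the real recruitment file
sample_csv = io.StringIO(
    "post,company,skill,city\n"
    "数据分析师,公司A,Python,北京·朝阳区\n"
    "大数据工程师,公司B,,上海·浦东新区\n"
    ",公司C,Hadoop,深圳\n"
)
boss = pd.read_csv(sample_csv)

print(boss.shape)   # (number of rows, number of columns)
print(boss.head())  # first few rows
boss.info()         # column dtypes and non-null counts
```

With the real data, `sample_csv` would be replaced by the path to the actual file.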
3. Handle missing values
boss.isnull().sum()
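# Drop rows that lack a job title or a company name; dropna returns a new DataFrame, so assign it back
boss = boss.dropna(subset=['post', 'company'], axis=0).copy()
# Fill missing skills with the placeholder '无' ("none"); assign back instead of using inplace on a column
boss['skill'] = boss['skill'].fillna('无')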
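As a small check of this step, the sketch below runs the same drop-then-fill logic on a made-up three-row frame (the rows are illustrative, not taken from the real data). It relies on the fact that `dropna` returns a new DataFrame and leaves the original untouched unless the result is assigned.

```python
import pandas as pd

# Hypothetical rows: one complete except for skill, one without post, one without company
boss = pd.DataFrame({
    "post": ["数据分析师", None, "大数据工程师"],
    "company": ["公司A", "公司B", None],
    "skill": [None, "Hadoop", "Spark"],
})

missing_before = boss.isnull().sum()

# Keep only rows that have both a post and a company
cleaned = boss.dropna(subset=["post", "company"]).copy()
# Replace missing skills with the placeholder used in this article
cleaned["skill"] = cleaned["skill"].fillna("无")

missing_after = cleaned.isnull().sum()
print(missing_before)
print(missing_after)
```

Comparing the two counts confirms that no missing values remain in the cleaned frame, while `boss` itself still has its original three rows.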
4. Group by city
boss['city'].unique()
boss['city'] = boss['city'].apply(lambda x:x.split('·')[0])
boss['city'].unique()
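The `split('·')[0]` call keeps only the part before the first `·`, so a value such as "city·district" collapses to the city name. A minimal sketch on made-up values follows; it uses the vectorized `Series.str` accessor instead of `apply`, an alternative chosen here because it passes missing values through rather than raising on them.

```python
import pandas as pd

# Hypothetical city values, including one without a district and one missing
city = pd.Series(["北京·朝阳区", "上海·浦东新区", "深圳", None])

# Split on '·' and keep the first piece; missing entries stay missing
normalized = city.str.split("·").str[0]
print(normalized.tolist())
```

Values without a `·` are returned unchanged, so the step can be run repeatedly without further effect.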
5. Group by education level
6. Group by work experience