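I scraped more than 1,600 data analyst (数据分析) job postings from Lagou (拉勾网) across 17 cities, then analyzed and charted them to get a picture of the current market for this role.
P.S. The target cities are mainly the top-ranked internet-industry cities; a few cities with only one or two postings were dropped.
Data summary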
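import pandas as pd

# `file` is the path to the scraped CSV (its definition is not shown here)
df_all = pd.read_csv(file, encoding='utf-8')
# info() prints its summary directly and returns None, so no print() is needed
df_all.info()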
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1689 entries, 0 to 1688
Data columns (total 20 columns):
position_id             1689 non-null int64
position_name           1689 non-null object
salary                  1689 non-null object
work_year               1689 non-null object
education               1689 non-null object
job_nature              1689 non-null object
city                    1689 non-null object
district                1689 non-null object
work_addr               1689 non-null object
position_advantage      1689 non-null object
position_labels         1679 non-null object
position_description    1689 non-null object
company_short_name      1689 non-null object
company_size            1689 non-null object
industry_field          1689 non-null object
finance_stage           1689 non-null object
investment_agencies     424 non-null object
company_full_name       1689 non-null object
company_homepage        1689 non-null object
company_inLaGou         1689 non-null object
dtypes: int64(1), object(19)
memory usage: 264.0+ KB
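
Two columns in the summary above have fewer non-null values than the 1689 rows: position_labels (1679) and investment_agencies (424). As a rough sketch of what that means in practice, the arithmetic below turns those counts into missing-value counts and shares. The helper name missing_stats is hypothetical and only for illustration; the numbers are copied from the info() output, not recomputed from the CSV.

```python
# Non-null counts copied from the info() output above.
TOTAL_ROWS = 1689
non_null_counts = {
    "position_labels": 1679,
    "investment_agencies": 424,
}


def missing_stats(total, non_null):
    """Return {column: (missing_count, missing_share)} for each column."""
    return {
        col: (total - n, (total - n) / total)
        for col, n in non_null.items()
    }


stats = missing_stats(TOTAL_ROWS, non_null_counts)
for col, (count, share) in stats.items():
    print(f"{col}: {count} missing ({share:.1%})")
# position_labels: 10 missing (0.6%)
# investment_agencies: 1265 missing (74.9%)
```

With the DataFrame loaded, df_all.isna().sum() should give the same per-column missing counts directly. Roughly three quarters of the postings have no investment_agencies value, so any analysis built on that column covers only a minority of the sample.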