
Analyzing Reasons for Employee Attrition

When analyzing employee sentiment data (in our case, an employee exit survey), we have to look at four topics.

  1. Statistical rigor of the survey
  2. Demographic composition of survey respondents
  3. Overall sentiment for defined latent constructs
  4. Sentiment scores by respondent characteristics (e.g. gender, location, department)

First, keeping to this methodology will enable us to determine how well our survey measures what it is meant to measure. Second, by understanding who answered the survey from a respondent-characteristics perspective (e.g. gender, department), we can provide context for our analysis and results. Third, this methodology will help us determine the general sentiment of the respondents. Last but not least, it will help us determine not only which organizational initiatives might be useful for improving sentiment, but also where those initiatives should be implemented.

Dataset

The dataset we’ll be using is a fictional employee exit survey, which asks the employee a series of questions about their organizational demographics (e.g. department) along with 5-point Likert sentiment questions (Strongly Disagree, Disagree, Neutral, Agree, Strongly Agree), such as “The organization offered plenty of promotional opportunities.” No open-ended questions were utilized.
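If the raw responses had arrived as text labels rather than numeric codes, a small mapping would convert them to 1–5 scores. This is a hypothetical sketch (the `likert_map` dictionary and sample responses are made up; the dataset used below already stores numeric codes):

```python
import pandas as pd

# Hypothetical mapping from 5-point Likert labels to numeric codes
likert_map = {'Strongly Disagree': 1, 'Disagree': 2, 'Neutral': 3,
              'Agree': 4, 'Strongly Agree': 5}

# Made-up example responses for one survey item
responses = pd.Series(['Agree', 'Neutral', 'Strongly Agree'])
codes = responses.map(likert_map)
```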

Data Processing

import pandas as pd
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
import seaborn as sns
import warnings

warnings.filterwarnings('ignore')
pd.set_option('display.max_columns', None)
%matplotlib inline

# The with-block manages the file handle, so no explicit close is needed
with open('exit_data_final.csv') as f:
    df = pd.read_csv(f)

df.info()
[Output of df.info() showing the survey's columns and dtypes]

We have 33 items, or questions, which were asked of the employees. Before we can begin our analysis, we have a bit of data cleaning to perform.

df.drop('Unnamed: 0', axis=1, inplace=True)

Let’s drop this odd “Unnamed” column, as it serves no purpose.

for var in df.columns:
    print(var, df[var].unique())

By examining the unique values for each item we can see a few issues.


  1. Some items have missing values correctly encoded as np.nan, while others contain blank strings.
  2. Based on df.info(), we need to convert the dtypes of our Likert items, which are currently stored as ‘object’.
  3. Finally, we need to transform some values to improve the readability of our visualizations.
# Replacing blank strings with np.nan
for var in df.columns:
    df[var].replace(to_replace=' ', value=np.nan, inplace=True)

# Converting feature types
likert_items = ['promotional_opportunities', 'performance_recognized',
                'feedback_offered', 'coaching_offered', 'mgmt_clear_mission',
                'mgmt_support_me', 'mgmt_support_team', 'mgmt_clear_comm',
                'direct_mgmt_satisfaction', 'job_stimulating', 'initiative_encouraged',
                'skill_variety', 'knowledge_variety', 'task_variety', 'fair_salary',
                'teamwork', 'team_support', 'team_comm', 'team_culture',
                'job_train_satisfaction', 'personal_train_satisfaction', 'org_culture',
                'grievances_resolution', 'co-worker_interaction',
                'workplace_conditions', 'job_stress', 'work/life_balance']

for col in likert_items:
    df[col] = pd.to_numeric(df[col], errors='coerce').astype('float64')

# Discretization of tenure
bins = [0, 4, 9, 14, 19, 24]
labels = ['0-4yrs', '5-9yrs', '10-14yrs', '15-19yrs', '20+yrs']
df['tenure'] = pd.cut(df['tenure'], bins=bins, labels=labels)
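The cleaning steps above can be sanity-checked on a tiny synthetic frame. This is just a sketch (the data is made up; only the column names mirror the survey's):

```python
import numpy as np
import pandas as pd

# Tiny synthetic frame mimicking the raw survey: a blank string for a
# missing value, and Likert answers stored as strings ('object' dtype)
raw = pd.DataFrame({'job_stimulating': ['4', ' ', '5'],
                    'tenure': [2, 11, 23]})

# Blank strings become np.nan, then the item is coerced to float
raw['job_stimulating'] = raw['job_stimulating'].replace(' ', np.nan)
raw['job_stimulating'] = pd.to_numeric(raw['job_stimulating'],
                                       errors='coerce').astype('float64')

# Tenure in years is binned into the same discrete ranges as the article
bins = [0, 4, 9, 14, 19, 24]
labels = ['0-4yrs', '5-9yrs', '10-14yrs', '15-19yrs', '20+yrs']
raw['tenure'] = pd.cut(raw['tenure'], bins=bins, labels=labels)
```

The respondent with 11 years of tenure lands in the '10-14yrs' bucket, and the blank answer surfaces as a single NaN in `job_stimulating`.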

Development of Latent Constructs

In my previous article, we reviewed the process of analyzing the statistical rigor (i.e. validity, reliability, factor analysis) of our survey. Feel free to review it, but let’s quickly recap what latent survey constructs are and how they are derived.

In order to develop survey items which maintain good statistical rigor, we have to begin with the scholarly literature. We want to find a theoretical model that describes the phenomenon we wish to measure. For example, personality surveys very often use the Big-5 model (openness, conscientiousness, extraversion, agreeableness, and neuroticism) to develop the survey items. The survey developer will carefully craft 2–10 items (depending on the length of the survey) for each component of the model. The items meant to assess the same component are said to be measuring a “latent construct”. In other words, we are not measuring “extraversion” explicitly, as that would be an “observed construct”, but indirectly through the individual survey items. The survey is pilot-tested with multiple samples of respondents until a certain level of rigor is attained. Once again, if you’re interested in the statistical analyses used to determine rigor, take a look at my previous article.
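As one concrete illustration of a common reliability check (the previous article covers the full analysis; this is only a sketch, not necessarily its exact method), Cronbach's alpha measures how consistently a set of items taps the same latent construct:

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) array.

    alpha = k/(k-1) * (1 - sum(item variances) / variance of totals)
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Four respondents answering two perfectly consistent items -> alpha of 1.0
alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3], [4, 4]])
```

Values around 0.7 or higher are conventionally read as acceptable internal consistency for a construct's items.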

# Calculating latent variables
df['employee_valued'] = np.nanmean(df[['promotional_opportunities',
                                       'performance_recognized',
                                       'feedback_offered',
                                       'coaching_offered']], axis=1)

df['mgmt_sati'] = np.nanmean(df[['mgmt_clear_mission', 'mgmt_support_me',
                                 'mgmt_support_team', 'mgmt_clear_comm',
                                 'direct_mgmt_satisfaction']], axis=1)

df['job_satisfaction'] = np.nanmean(df[['job_stimulating', 'initiative_encouraged',
                                        'skill_variety', 'knowledge_variety',
                                        'task_variety']], axis=1)

df['team_satisfaction'] = np.nanmean(df[['teamwork', 'team_support',
                                         'team_comm', 'team_culture']], axis=1)

df['training_satisfaction'] = np.nanmean(df[['job_train_satisfaction',
                                             'personal_train_satisfaction']], axis=1)

df['org_environment'] = np.nanmean(df[['org_culture', 'grievances_resolution',
                                       'co-worker_interaction',
                                       'workplace_conditions']], axis=1)

df['work_life_balance'] = np.nanmean(df[['job_stress', 'work/life_balance']], axis=1)

df['overall_sati'] = np.nanmean(df[['promotional_opportunities', 'performance_recognized',
                                    'feedback_offered', 'coaching_offered',
                                    'mgmt_clear_mission', 'mgmt_support_me',
                                    'mgmt_support_team', 'mgmt_clear_comm',
                                    'direct_mgmt_satisfaction', 'job_stimulating',
                                    'initiative_encouraged', 'skill_variety',
                                    'knowledge_variety', 'task_variety', 'fair_salary',
                                    'teamwork', 'team_support', 'team_comm',
                                    'team_culture', 'job_train_satisfaction',
                                    'personal_train_satisfaction', 'org_culture',
                                    'grievances_resolution', 'co-worker_interaction',
                                    'workplace_conditions', 'job_stress',
                                    'work/life_balance']], axis=1)

Our exit survey has also been developed to assess certain latent constructs. Each survey item is averaged with the other items belonging to the latent factor it is meant to measure. Finally, we have calculated an “overall_sati” feature, which is the grand average across all items/latent factors for each respondent.
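The use of np.nanmean matters here: a skipped question is simply left out of a respondent's construct score rather than dragging it down. A quick sketch with made-up responses:

```python
import numpy as np

# Two respondents, three items; the first respondent skipped one question
responses = np.array([[4.0, np.nan, 5.0],
                      [3.0, 3.0,    3.0]])

# Row-wise mean ignoring NaN: the first score averages only the two answers
construct_scores = np.nanmean(responses, axis=1)
```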

Below is a list of the survey items and the latent constructs they are meant to measure. Keep in mind each item label has been shortened significantly to help facilitate visualization. You can imagine the items asking questions such as “On a scale of 1–5, I find my job stimulating.”

[Table: survey items grouped by the latent construct they measure]
mappings = {1: '1) Dissatisfied', 2: '1) Dissatisfied', 3: '2) Neutral',
            4: '3) Satisfied', 5: '3) Satisfied'}

likert = ['promotional_opportunities', 'performance_recognized',
          'feedback_offered', 'coaching_offered', 'mgmt_clear_mission',
          'mgmt_support_me', 'mgmt_support_team', 'mgmt_clear_comm',
          'direct_mgmt_satisfaction', 'job_stimulating', 'initiative_encouraged',
          'skill_variety', 'knowledge_variety', 'task_variety', 'fair_salary',
          'teamwork', 'team_support', 'team_comm', 'team_culture',
          'job_train_satisfaction', 'personal_train_satisfaction', 'org_culture',
          'grievances_resolution', 'co-worker_interaction',
          'workplace_conditions', 'job_stress', 'work/life_balance']

# Collapse the 5-point scale into three buckets for cleaner visualizations
for col in likert:
    df[col + '_short'] = df[col].map(mappings)
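With collapsed *_short labels in place, sentiment can be cross-tabulated by respondent characteristics (topic 4 above). A minimal sketch, assuming a hypothetical 'department' demographic column and made-up data:

```python
import pandas as pd

# Hypothetical mini-frame: one demographic column, one collapsed Likert item
sample = pd.DataFrame({
    'department': ['Sales', 'Sales', 'IT', 'IT'],
    'job_stimulating_short': ['1) Dissatisfied', '3) Satisfied',
                              '3) Satisfied', '2) Neutral'],
})

# Share of each sentiment bucket within each department (rows sum to 1)
table = pd.crosstab(sample['department'], sample['job_stimulating_short'],
                    normalize='index')
```

A stacked bar chart of such a table makes it easy to spot which departments would benefit most from targeted initiatives.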