Orange data analysis: Placement Outcomes Data Analysis Using Orange GUI

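This article describes how to use the Orange data analysis tool to analyse placement outcomes in depth. Orange's graphical user interface makes it easy to preprocess large data sets, carry out exploratory analysis, and apply AI models to reveal hidden patterns and insights.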

Objective: To analyse several factors influencing the recruitment of students and to extract information through plots.

Description: The following analysis presents plots that attempt to link students' placement prospects, made possible through student perceptions of recruiting organisations, to certain academic parameters such as the percentage obtained in secondary and higher secondary school, in the undergraduate degree and in the postgraduate degree.

Miscellaneous factors such as the gender of the candidate, the board of education and the stream chosen in secondary and high school, and the undergraduate and postgraduate degree specialisations have also been taken into account to predict placement status as well as the salary offered.

Several colleges offer employability tests, which help employers evaluate their workforce, analyse and judge candidates' skills, and hence recruit the right talent. The performance of students in such college-run tests and their previous work experience have therefore also been analysed to deduce their relation to recruitment opportunities.

Hypothesis: Students with better scores in secondary education and in their undergraduate degree have better prospects of getting placed.

Understanding the Project:

Going through the analysis, a reader should be able to infer:

  1. How the choice of board of education influences placement prospects.
  2. The relative importance of scores obtained in various degrees and streams in the campus recruitment procedure.
  3. The relation of gender and work experience to the salary offered by companies in campus placements.

Acknowledgements:

My teammate Prafful Chauhan and I, Ruchika Parag Barman, created this notebook/blog as part of the coursework for the “Pandas, bamboolib & Orange workshop” at Suven, under the mentorship of Rocky Jagtiani.

Learned from https://datascience.suvenconsultants.com.

Mentored by Rocky Jagtiani.

Dataset:

This data set consists of placement data for students on an XYZ campus. It includes secondary and higher secondary school percentages and specialisations, as well as degree specialisation and type, work experience, and the salary offered to placed students.

Image for post

We have taken 60 observations (number of rows), from which we extract information through exploratory data analysis and visualisation. There are 8 categorical features and 6 numerical features.

Histograms:

Image for post

Inference: Male students get more placements than female students; the male-to-female ratio among placed students is roughly 2:1.

Image for post

Inference: With respect to high school education, Central board students have a wider salary range than students from other boards, but the placement ratio of Central board students to the others is less than 1.

Image for post

Inference: With respect to secondary education, Central board students have a wider salary range than students from other boards.

Image for post

Inference: Commerce and Arts students have a wider salary range, and more of them are placed, than students from Science or other streams.

From the above graphs, one can gather that gender plays quite an important role in whether or not a candidate is hired: a male candidate is more likely to be placed at a company than a female candidate. Similarly, the board of education and the stream chosen also affect the salary offered. Students who opted for Commerce and Management studies were offered higher pay.

Correlations:

Image for post

The correlations table suggests the following:

  1. Students who scored well in their secondary education are very likely to perform well in their undergraduate degree too.
  2. Students who scored well in their high school education also tend to score well in their secondary education.
  3. Likewise, students who scored well in their high school education are very likely to perform well in their undergraduate degree.
  4. Most students with a good academic record in high school also score highly in their MBA degree.
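The coefficient behind a table like this can be sketched in plain Python. This is a minimal illustration of Pearson's correlation, on the assumption that this is the measure shown in the table (the article does not say which one was used). The function name `pearson` and the score lists are invented for the example and are not taken from the data set.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical percentages for five students (not from the data set).
secondary_pct = [62.0, 70.5, 81.0, 55.0, 74.0]
degree_pct = [60.0, 68.0, 77.5, 58.0, 71.0]

r = pearson(secondary_pct, degree_pct)
print(f"r = {r:.3f}")
```

A value close to 1 means the two percentages rise together, which is the pattern the list above describes; a value near 0 would mean no linear relation.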

Boxplots:

Image for post

Inference: The above boxplot shows the relation between the percentage obtained in the undergraduate degree and placement status. Students who get placed score higher than those who do not. For placed students, the mean score is 68.6925, the standard deviation is 6.189, the median (2nd quartile) is 69.25, the 1st quartile is 64.50 and the 3rd quartile is 72.1150.

For students who were not placed, the mean percentage is 60.8670, the standard deviation is 7.045, the median (2nd quartile) is 61.00, the 1st quartile is 56.65 and the 3rd quartile is 64.00.
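The figures annotated on a box plot (mean, standard deviation and quartiles) can be reproduced with Python's standard `statistics` module. The sketch below is illustrative only: the helper name `box_summary` and the scores are made up, and the quartile convention (`method="inclusive"`) and the use of the sample standard deviation are assumptions, so the results need not match Orange's output digit for digit.

```python
import statistics


def box_summary(values):
    """Return the statistics a box plot annotates: mean, SD and quartiles."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),  # sample standard deviation
        "q1": q1,
        "median": median,
        "q3": q3,
    }


# Hypothetical degree percentages (not from the data set).
scores = [56.0, 61.0, 64.5, 69.25, 72.0]
summary = box_summary(scores)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Running the same helper separately on the placed and the not-placed group gives the two sets of numbers that the inference compares.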

From this analysis, undergraduate students/freshers can prioritise and prepare for their degree examinations with the average scores above in mind, since these are the scores that companies generally consider worthy of a placement.

Image for post

Inference: Male candidates get higher pay than female candidates. For placed male students, the mean salary is 302608.70, the standard deviation is 144726.4, the median (2nd quartile) is 264000, the 1st quartile is 240000 and the 3rd quartile is 300000.

For placed female students, on the other hand, the mean salary is 267571.43, the standard deviation is 41776.1, the median (2nd quartile) is 250000, the 1st quartile is 240000 and the 3rd quartile is 300000.

Thus, not only is the placement rate of female candidates lower than that of male candidates, but the salary offered to placed female candidates is also relatively lower.

Pivot Table:

Image for post

Inference: Because more students opt for Commerce and Management, both the number of placed students and the number of students not placed are much higher in that stream than in Science and other streams. The ratio of placed to not-placed students is also higher in Commerce and Management than in Science.

Readers can infer that there are relatively more job opportunities for students who opt for Commerce and Management than for those in other streams.
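The counts in such a pivot table, and the placed to not-placed ratio per stream, can be sketched with a `collections.Counter`. The records, the stream labels and the helper `placed_ratio` below are hypothetical and serve only to show the computation; they are not the article's data.

```python
from collections import Counter

# Hypothetical (stream, status) records, not from the data set.
records = [
    ("Comm&Mgmt", "Placed"), ("Comm&Mgmt", "Placed"), ("Comm&Mgmt", "Placed"),
    ("Comm&Mgmt", "Not Placed"),
    ("Sci&Tech", "Placed"), ("Sci&Tech", "Not Placed"),
    ("Others", "Not Placed"),
]

# counts[(stream, status)] -> number of students in that cell of the table
counts = Counter(records)


def placed_ratio(stream):
    """Ratio of placed to not-placed students, or None if nobody was unplaced."""
    not_placed = counts[(stream, "Not Placed")]
    if not_placed == 0:
        return None
    return counts[(stream, "Placed")] / not_placed


for stream in sorted({s for s, _ in records}):
    placed = counts[(stream, "Placed")]
    not_placed = counts[(stream, "Not Placed")]
    print(stream, placed, not_placed, placed_ratio(stream))
```

Comparing raw counts alone favours whichever stream has the most students, so the ratio is the fairer basis for the comparison between streams.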

Scatterplots:

Image for post

For the scatterplots, we used 60% of the data provided. The first scatterplot plots salary against the percentage obtained in the degree examination. The points are coloured by stream, as shown in the legend.
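Taking 60% of the rows amounts to random sampling without replacement. The sketch below shows one way to do this in plain Python; the helper `sample_fraction`, the fixed seed and the stand-in rows are assumptions made for illustration and do not describe how the sample was actually drawn in Orange.

```python
import random


def sample_fraction(rows, fraction, seed=0):
    """Return a random subset holding `fraction` of the rows, without replacement."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = round(len(rows) * fraction)
    # A dedicated Random instance keeps the sample reproducible for a given seed.
    return random.Random(seed).sample(rows, k)


rows = list(range(60))  # stand-ins for the 60 observations
subset = sample_fraction(rows, 0.6)
print(len(subset))
```

With 60 observations, a 60% sample contains 36 rows.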

Inference: The higher salaries have been offered to students whose scores lie in the range 64–74. In terms of stream, most of the students who were offered pay above 300,000 belong to Commerce and Management. Very few Science students, and even fewer students from other streams, crossed the 300,000 threshold.

Image for post

Inference: Students who specialise in Marketing and Finance and those who specialise in Marketing and HR score similarly in MBA percentage. The highest-paid students generally have scores in the range 62–70, approximately. Very few students were offered pay above 400,000; the majority were offered salaries in the range of 250,000 to 350,000.

This suggests that maintaining an average score within the above-mentioned range should suffice for a decently paid placement.

Mosaic Plot:

Image for post

Apart from academic parameters, recruiting companies may also consider other factors for placement. Employability tests conducted by colleges are key to establishing appropriate labour market linkages and ascertaining that the workforce is industry ready.

Inference: From the plot above, we can see that of all the students who did not get placed, very few scored above 83.5 on the employability test; most of them scored below 83.5.

Moreover, the plot suggests that students with prior work experience are considered more deserving than freshers. Nearly all the students in the not-placed sections had no prior work experience, whereas those with work experience appear in the placed section on the right.

From this, students can see that gaining experience in a work environment before campus recruitment proves beneficial, and they can plan and prepare for their future accordingly.

Classification Tree:

Image for post

This classification tree has placement status (placed) as its target. It has the following parameters:

It is an induced binary tree.

Minimum number of instances in leaves: 2.

Do not split subsets smaller than: 5.

Limit the maximal tree depth to: 100.

Classification stops when the majority reaches 95%.

Based on the data provided, students can get a detailed picture of how the various academic and other factors bear on whether or not a candidate gets placed. The tree gives a clear account of how the different attributes of a particular student influence their placement status.
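The stopping parameters listed for the tree can be read as a rule that decides when a node becomes a leaf. The sketch below expresses three of them (minimum subset size 5, maximum depth 100, majority threshold 95%) in plain Python. It is an illustration of the idea, not Orange's implementation; the function name `should_stop` and its arguments are assumptions, and the minimum of 2 instances per leaf is left out because it constrains candidate splits rather than the node itself.

```python
from collections import Counter


def should_stop(labels, depth, min_split=5, max_depth=100, majority=0.95):
    """Decide whether a tree node holding `labels` should become a leaf."""
    if len(labels) < min_split:  # subset too small to split further
        return True
    if depth >= max_depth:  # depth limit reached
        return True
    # Share of the most common class among the instances in this node.
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels) >= majority


# Hypothetical nodes (labels invented for the example).
pure_node = ["Placed"] * 10
mixed_node = ["Placed"] * 6 + ["Not Placed"] * 4
small_node = ["Placed", "Not Placed", "Placed", "Not Placed"]

print(should_stop(pure_node, depth=3))
print(should_stop(mixed_node, depth=3))
print(should_stop(small_node, depth=3))
```

The pure node and the four-instance node stop, while the mixed node with a 60% majority is still eligible for splitting.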

Image for post

This classification tree has the salary offered as its target. It has the following parameters:

It is an induced binary tree.

Minimum number of instances in leaves: 2.

Do not split subsets smaller than: 5.

Limit the maximal tree depth to: 100.

Classification stops when the majority reaches 95%.

Students can get a detailed picture of how the various academic and other factors bear on the salary offered to a candidate. The tree gives a clear account of how the different attributes of a particular student influence their pay.

Vote of Thanks:

I would like to humbly and sincerely thank my mentor, Rocky Jagtiani. He is more of a friend to me than a mentor. The data analytics he taught us and the various assignments we did, and are still doing, are the best way to learn and build skills in the Data Science field.

Recommended: https://datascience.suvenconsultants.com/

Translated from: https://medium.com/@ruchikaparag18/placement-outcomes-data-analysis-using-orange-gui-1884aa3ac0c2
