EDA on Categorical and Ordinal Data

Statistics for Data Science and Machine Learning

Categorical variables take their values from a set of options, which can be predefined or open-ended. An example is a person's gender. For ordinal variables, the options can also be ordered by some rule, as in the Likert scale:

  • Like
  • Like Somewhat
  • Neutral
  • Dislike Somewhat
  • Dislike
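
As a minimal sketch of what "ordered" means in practice, the Likert options can be encoded so that comparisons and sorting respect their order. The integer codes and the `responses` list below are hypothetical, chosen only for illustration; only the order of the codes is meaningful, not the distances between them.

```python
from enum import IntEnum


class Likert(IntEnum):
    """Ordinal scale: the codes only encode order (an illustrative choice)."""
    DISLIKE = 1
    DISLIKE_SOMEWHAT = 2
    NEUTRAL = 3
    LIKE_SOMEWHAT = 4
    LIKE = 5


# Hypothetical survey answers, not taken from the article.
responses = [
    Likert.LIKE,
    Likert.NEUTRAL,
    Likert.DISLIKE_SOMEWHAT,
    Likert.LIKE_SOMEWHAT,
    Likert.NEUTRAL,
]

# Ordinal data supports ordering, so sorting and the median are well defined.
ordered = sorted(responses)
median = ordered[len(ordered) // 2]

print([r.name for r in ordered])
print("median:", median.name)
```

A purely categorical variable such as gender has no such ordering, so a median would not be meaningful for it; counts and proportions are the usual summaries instead.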

To keep the following examples simple, we will use a group of students who have each passed or failed two different exams. The results are shown in the following R×C table:

Figure: The example used throughout the article (self-generated).

Statisticians have developed specific techniques for analyzing this kind of data. The most important are:

Measures of Agreement

Percent Agreement

Calculated as the number of cases whose ratings fall in a given class divided by the total number of ratings.

Figure: The example with totals added (self-generated).
  • The percent agreement for passing exam 2 is 25/(25+60) = 25/85 ≈ 0.294, or 29.4%.
  • The percent agreement for passing exam 1 is 30/85 ≈ 0.353, or 35.3%.
  • The percent agreement for passing exam 1 and not passing exam 2 is 10/85 ≈ 0.118, or 11.8%.
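
The three proportions above can be reproduced with a short sketch. The table itself is only available as an image, so the four cell counts below are an assumption: they are the only 2×2 counts consistent with the totals quoted in the text (85 students, 30 passing exam 1, 25 passing exam 2, 10 passing exam 1 but not exam 2). The `share` helper is a hypothetical name introduced here.

```python
# Assumed cell counts, inferred from the totals quoted in the text.
# Keys are (exam 1 outcome, exam 2 outcome).
table = {
    ("pass", "pass"): 20,
    ("pass", "fail"): 10,
    ("fail", "pass"): 5,
    ("fail", "fail"): 50,
}

total = sum(table.values())  # 85 students


def share(exam1=None, exam2=None):
    """Proportion of students matching the given outcomes (None means any)."""
    count = sum(
        n
        for (e1, e2), n in table.items()
        if (exam1 is None or e1 == exam1) and (exam2 is None or e2 == exam2)
    )
    return count / total


print(f"pass exam 2: {share(exam2='pass'):.1%}")
print(f"pass exam 1: {share(exam1='pass'):.1%}")
print(f"pass exam 1, fail exam 2: {share(exam1='pass', exam2='fail'):.1%}")
```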

The problem with percent agreement is that it does not account for agreement that could have occurred purely by chance.

Cohen's Kappa
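
The article's own formula is not recoverable from the source, so what follows is the standard textbook definition rather than the author's presentation: kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from the marginal totals. The sketch below treats the two exams as the two "raters" and uses cell counts inferred from the totals quoted in the text; both are assumptions made for illustration.

```python
# Assumed 2x2 counts, inferred from the totals quoted in the text:
# rows = exam 1 (pass, fail), columns = exam 2 (pass, fail).
counts = [
    [20, 10],
    [5, 50],
]

n = sum(sum(row) for row in counts)

# Observed agreement: share of students with the same outcome on both exams.
p_observed = sum(counts[i][i] for i in range(len(counts))) / n

# Agreement expected by chance, computed from the row and column marginals.
row_totals = [sum(row) for row in counts]
col_totals = [sum(col) for col in zip(*counts)]
p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n**2

kappa = (p_observed - p_expected) / (1 - p_expected)
print(f"p_o = {p_observed:.3f}, p_e = {p_expected:.3f}, kappa = {kappa:.3f}")
```

With these assumed counts, the raw agreement of about 0.82 drops to a kappa of about 0.60 once chance agreement is subtracted, which illustrates why kappa is preferred over a plain percentage.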
