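Notes on Qualitative Data Analysis

Types of qualitative data:

  • Nominal data: sex, occupation, marital status, religion, and the like
  • Ordinal data: education level, professional title, hospital grade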

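Numerical data fall into two broad classes:

  • Count data
  • Measurement (continuous) data

Descriptive analysis of qualitative data (a single categorical variable)
  • Graphical methods (bar charts, agreement plots (一致图, said in the lecture to date only from 2018), and so on)
  • Numerical summaries (proportions, relative risk, and odds ratio)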
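
The numerical summaries listed above can be sketched on a 2 x 2 table. The counts and the labels (`exposed`, `event`, and so on) are made up for illustration and are not from the notes.

```r
# Hypothetical 2 x 2 table (illustrative counts only)
tab <- matrix(c(30, 70,
                10, 90),
              nrow = 2, byrow = TRUE,
              dimnames = list(group = c("exposed", "unexposed"),
                              outcome = c("event", "no_event")))

# Proportion with the event in each group
risk <- tab[, "event"] / rowSums(tab)

# Relative risk: ratio of the two proportions
relative_risk <- unname(risk["exposed"] / risk["unexposed"])

# Odds ratio: ratio of the two odds (event : no event)
odds <- tab[, "event"] / tab[, "no_event"]
odds_ratio <- unname(odds["exposed"] / odds["unexposed"])

print(risk)
print(relative_risk)
print(odds_ratio)
```

With these counts the relative risk is 3 and the odds ratio is 27/7, which shows that the two measures need not agree unless the event is rare.
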
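Association between qualitative variables (several categorical variables)
  • Contingency table analysis: pairwise, only two variables are examined at a time
  • Correspondence analysis: "relationships between variables", "ordinal relationships", "polynomial regression (in a generalized sense)"
  • For high-dimensional contingency tables, the most useful plot is the mosaic plot
  • Agreement plot (carries somewhat more information): suitable when studying the relationship between variables measured on matched samples.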
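
A minimal sketch of a pairwise contingency analysis and a mosaic plot, using the built-in `HairEyeColor` table as a stand-in data set (an assumption; the notes do not say which data were used). Base R's `mosaicplot()` is used here so the example has no dependencies; the vcd package offers `mosaic()` for the same purpose.

```r
# HairEyeColor is a built-in 4 x 4 x 2 table (Hair x Eye x Sex)
hair_eye <- margin.table(HairEyeColor, margin = c(1, 2))  # collapse over Sex

# Pairwise contingency analysis: Pearson chi-squared test of independence
indep_test <- chisq.test(hair_eye)
print(indep_test)

# Mosaic plot of the full three-way table; shading marks cells that
# deviate from the independence model
mosaicplot(HairEyeColor, shade = TRUE, main = "Hair, eye colour and sex")
```
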

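The above are graphical methods; what follows are numerical methods.
If the sample has the same structure as the population, row and column percentages can both be used freely; otherwise they cannot.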
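
A small sketch of why the sampling design matters for row and column percentages. The table is hypothetical: it assumes a design in which 100 cases and 100 controls were sampled on purpose, so the column totals are fixed by the design rather than by the population.

```r
# Hypothetical case-control counts: column totals (100 each) are fixed by design
tab <- matrix(c(60, 40,
                40, 60),
              nrow = 2, byrow = TRUE,
              dimnames = list(exposure = c("smoker", "non_smoker"),
                              status = c("case", "control")))

row_pct <- prop.table(tab, margin = 1) * 100  # percentages within each row
col_pct <- prop.table(tab, margin = 2) * 100  # percentages within each column

print(col_pct)  # share of smokers among cases / controls: interpretable here
print(row_pct)  # share of cases among smokers: driven by the 1:1 design
```

Under this assumed design the column percentages describe each sampled group, while the row percentages would change if a different case-to-control ratio had been chosen.
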

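Goodness-of-fit test: checks whether a hypothesized distribution is consistent with the observed data.

Setting the null hypothesis: take the usual, most common, predominant situation as the null hypothesis; the rarely occurring situation becomes the alternative hypothesis.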
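
A minimal goodness-of-fit sketch with `chisq.test()`. The die-roll counts are invented for illustration; the null hypothesis is the "usual" situation (a fair die) and the alternative is that the die is not fair.

```r
# Hypothetical counts from 120 rolls of a die (illustrative only)
observed <- c(18, 22, 20, 19, 21, 20)

# H0: every face has probability 1/6; H1: at least one probability differs
gof <- chisq.test(observed, p = rep(1 / 6, 6))
print(gof)
```

Here every expected count is 20, so the statistic is (4 + 4 + 0 + 1 + 1 + 0) / 20 = 0.5 on 5 degrees of freedom, and the null hypothesis is not rejected.
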

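Correspondence analysis and regression analysis

Correlation coefficient:
$r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{(n-1)\,s_x s_y}$, where $\bar{x}$ and $\bar{y}$ are the sample means and $s_x$, $s_y$ are the sample standard deviations.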
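
As a quick check, the Pearson correlation coefficient (sum of cross-products of deviations from the means, divided by n - 1 times the product of the sample standard deviations) can be computed by hand and compared with R's `cor()`. The vectors `x` and `y` are made-up values.

```r
# Illustrative data (not from the notes)
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
n <- length(x)

# Manual computation; sd() uses the n - 1 denominator
r_manual <- sum((x - mean(x)) * (y - mean(y))) / ((n - 1) * sd(x) * sd(y))

# Built-in Pearson correlation for comparison
r_builtin <- cor(x, y)

print(c(manual = r_manual, builtin = r_builtin))
```
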

1. Principal component analysis
2. Canonical correlation analysis
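
Both methods are available in base R. The sketch below uses the built-in `iris` measurements purely as a convenient numeric data set (an assumption; the notes do not name one), with `prcomp()` for principal components and `cancor()` for canonical correlations.

```r
# 1. Principal component analysis on the four iris measurements
measurements <- iris[, 1:4]
pca <- prcomp(measurements, scale. = TRUE)   # standardize before rotating
var_share <- pca$sdev^2 / sum(pca$sdev^2)    # proportion of variance per component
print(round(var_share, 3))

# 2. Canonical correlation between the sepal and petal variable sets
sepal <- as.matrix(iris[, c("Sepal.Length", "Sepal.Width")])
petal <- as.matrix(iris[, c("Petal.Length", "Petal.Width")])
cc <- cancor(sepal, petal)
print(cc$cor)                                # canonical correlations, largest first
```
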

R code for correspondence analysis

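library(MASS)                      # provides corresp()
data("Suicide", package = "vcd")   # the Suicide data set ships with vcd, not MASS

# Age group x method tables, one per sex
A1 <- xtabs(Freq ~ age.group + method2, subset = sex == "female", data = Suicide)
A2 <- xtabs(Freq ~ age.group + method2, subset = sex == "male", data = Suicide)
AA1 <- matrix(A1, ncol = 8)
AA2 <- matrix(A2, ncol = 8)
method_names <- c("poison", "gas", "hang", "drown", "gun", "knife", "jump", "other")
rownames(AA1) <- c("F15", "F30", "F45", "F60", "F80")
colnames(AA1) <- method_names
rownames(AA2) <- c("M15", "M30", "M45", "M60", "M80")
colnames(AA2) <- method_names
A <- rbind(AA1, AA2)               # stack female and male rows: 10 x 8

ca <- corresp(A, nf = 2)           # keep the first two dimensions
plot(ca)                           # biplot of row and column scores
abline(v = 0, h = 0, lty = 5)      # dashed reference axes through the origin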

Regression analysis for qualitative data

When is regression applicable?
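
One common regression model for a binary qualitative response is logistic regression. The sketch below is a generic illustration, not the example from the lecture: it assumes the built-in `UCBAdmissions` table as data and fits the model with `glm()`; the column `admitted` is a helper introduced here.

```r
# Built-in 2 x 2 x 6 table turned into a data frame of cell counts
admissions <- as.data.frame(UCBAdmissions)   # columns: Admit, Gender, Dept, Freq

# Code the response explicitly so the model is for P(admitted)
admissions$admitted <- as.integer(admissions$Admit == "Admitted")

# Logistic regression with cell counts as frequency weights
fit <- glm(admitted ~ Gender + Dept,
           family = binomial, weights = Freq, data = admissions)

print(summary(fit)$coefficients)
print(exp(coef(fit)))   # coefficients on the odds-ratio scale
```
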

Simpson's paradox
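
A short illustration of Simpson's paradox using the built-in `UCBAdmissions` table (chosen here as a well-known example; the notes do not say which data the lecture used). The aggregated admission rates and the per-department rates point in different directions.

```r
# UCBAdmissions is a 2 x 2 x 6 table: Admit x Gender x Dept

# Aggregated over departments: admission rate by gender
overall <- margin.table(UCBAdmissions, margin = c(1, 2))
overall_rate <- prop.table(overall, margin = 2)["Admitted", ]
print(round(overall_rate, 3))

# Within each department: admission rate by gender (rows) and department (columns)
by_dept_rate <- prop.table(UCBAdmissions, margin = c(2, 3))["Admitted", , ]
print(round(by_dept_rate, 3))

# Number of departments in which the female rate exceeds the male rate
n_female_higher <- sum(by_dept_rate["Female", ] > by_dept_rate["Male", ])
print(n_female_higher)
```

Overall, men have the higher admission rate, yet women have the higher rate in four of the six departments, because applications were distributed very differently across departments.
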

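Further reading:

==《统计学的世界》== (*The World of Statistics*), Purdue University

Statistics is the science and art of collecting and summarizing data.

The percentage bar chart is the best-loved chart in statistics.

Two R packages used:

vcd
MASS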
