[R -- dcast -- errors] Aggregation function missing: defaulting to length; values turn into 0 and 1 after reshaping
For details, see the WeChat official account 【阿呆ForFun】
https://mp.weixin.qq.com/s/6TSjndPie_RSwyzidYyXyA
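The dcast function
dcast is a commonly used function in the reshape2 package. Its main job is to convert long-format data to wide-format data, much like a pivot table in Excel.
Usage (this signature matches the maditr/data.table documentation listed in reference 4; reshape2's own dcast has no sep or verbose argument and uses value.var = guess_value(data))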
dcast(
data,
formula,
fun.aggregate = NULL,
sep = "_",
...,
margins = NULL,
subset = NULL,
fill = NULL,
drop = TRUE,
value.var = guess(data),
verbose = getOption("datatable.verbose")
)
Illustration
An illustration of the dcast function
Image source: https://seananderson.ca/2013/10/19/reshape/
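As a minimal sketch of the long-to-wide conversion described above, assuming reshape2 is installed; the data frame name `scores` is hypothetical and not part of the original post:

```r
library(reshape2)

# Hypothetical long-format data: exactly one row per student/test pair
scores <- data.frame(
  student = c("Adam", "Adam", "John", "John"),
  test    = c("Exam1", "Exam2", "Exam1", "Exam2"),
  score   = c(80, 90, 70, 60)
)

# One row per student, one column per test, cells filled from `score`
wide <- dcast(scores, student ~ test, value.var = "score")
print(wide)
```

Because every student/test pair occurs only once here, no aggregation is needed and the cells keep the original scores.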
Bugs:
When reshaping data with dcast, the following message appears and the values in the result turn into 0 and 1:
Aggregation function missing: defaulting to length
Example:
# Original data
head(data)
student test score
Adam Exam1 80
Adam Exam2 90
John Exam1 70
John Exam2 60
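# Goal: use dcast to reshape the data to wide format
dcast(data, student ~ test, value.var = 'score')
# Result
Student Exam1 Exam2
Adam 0 0
John 0 1
The scores have all been replaced by 0 and 1, and the following message is printed: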
Aggregation function missing: defaulting to length
Solutions:
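The most likely cause is that the original data contain more than one row for the same student/test combination, for example: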
student test score
Adam Exam1 80
Adam Exam1 85
Adam Exam2 90
John Exam1 70
John Exam2 60
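Check whether the data contain duplicates as follows. This checks for rows that repeat the values of the first and second columns; duplicated() flags only the second and later occurrences.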
dupli<-data[duplicated(data[,1:2]),]
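The check can be tried on the duplicated example above. This is a sketch under assumptions: reshape2 is installed, and the names `dup_data`, `key`, `later_copies` and `all_clashes` are hypothetical helpers introduced only for illustration.

```r
library(reshape2)

# The duplicated example, rebuilt as a data frame
dup_data <- data.frame(
  student = c("Adam", "Adam", "Adam", "John", "John"),
  test    = c("Exam1", "Exam1", "Exam2", "Exam1", "Exam2"),
  score   = c(80, 85, 90, 70, 60)
)

# Without fun.aggregate, dcast falls back to length and prints the message
counts <- dcast(dup_data, student ~ test, value.var = "score")
print(counts)  # cells hold the number of rows per student/test pair

# duplicated() marks only the second and later copies of each key
key <- dup_data[, c("student", "test")]
later_copies <- dup_data[duplicated(key), ]

# Scanning from the end as well returns every row involved in a clash
all_clashes <- dup_data[duplicated(key) | duplicated(key, fromLast = TRUE), ]
print(all_clashes)
```

Listing every clashing row, not just the later copies, makes it easier to decide whether to drop rows or aggregate them.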
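If there are duplicates, either remove them or supply an aggregation function through fun.aggregate.
When no aggregation function is given and the formula does not identify each row uniquely, dcast defaults to length, so every cell holds the number of matching rows instead of the score. That is why the values turn into small integers such as 0 and 1.
You can set fun.aggregate to sum, mean, sd, and so on.
For example: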
data1<-dcast(data,var1~var2,value.var = "var3",fun.aggregate = mean)
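Both fixes can be sketched on the duplicated example data. This assumes reshape2 is installed; `dup_data`, `avg`, `first_only` and `wide_first` are hypothetical names, and keeping the first row per pair is only one possible de-duplication rule.

```r
library(reshape2)

dup_data <- data.frame(
  student = c("Adam", "Adam", "Adam", "John", "John"),
  test    = c("Exam1", "Exam1", "Exam2", "Exam1", "Exam2"),
  score   = c(80, 85, 90, 70, 60)
)

# Option 1: aggregate clashing rows, here by averaging their scores
avg <- dcast(dup_data, student ~ test, value.var = "score",
             fun.aggregate = mean)
print(avg)

# Option 2: keep the first row per student/test pair, then reshape
first_only <- dup_data[!duplicated(dup_data[, c("student", "test")]), ]
wide_first <- dcast(first_only, student ~ test, value.var = "score")
print(wide_first)
```

Which option fits depends on whether the repeated rows are genuine repeated measurements (aggregate them) or data-entry errors (drop them).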
References:
(1) https://www.codenong.com/33051386/
(2) https://stackoverflow.com/questions/30463591/r-reshape2-aggregation-function-missing-defaulting-to-length/34882664
(3) https://stackoverflow.com/questions/6986657/find-duplicated-rows-based-on-2-columns-in-data-frame-in-r
(4) https://www.rdocumentation.org/packages/maditr/versions/0.7.4/topics/dcast