1. When computing a correlation coefficient with cor(), the following error appears:
> cor(pvnum_avgwinprice$V2, pvnum_avgwinprice$V3)
Error in cor(pvnum_avgwinprice$V2, pvnum_avgwinprice$V3) : 'y' must be numeric
The first few rows of pvnum_avgwinprice:
C4KDHL8e1Y 5 19.2
C4KLPJE53d 65 457.5
C4NDP8F12E 6 499.6666666666667
C4RJur4Z3S 1 302.75
C53HK14V3L 3 1102.3333333333333
That should not happen; the third column is clearly numeric! So I checked its type:
> typeof(pvnum_avgwinprice$V3)
[1] "integer"
> is.numeric(pvnum_avgwinprice$V3)
[1] FALSE
It looks like R stored the third column as a non-numeric class when it read the data. So I coerced it:
> cor(pvnum_avgwinprice$V2, as.double(pvnum_avgwinprice$V3))
or
> cor(pvnum_avgwinprice$V2, as.numeric(pvnum_avgwinprice$V3))
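That made the error go away! (If the column is a factor, though, as typeof() returning "integer" with is.numeric() returning FALSE suggests, this converts the level codes rather than the values; the as.numeric(as.character(x)) fix further down is the safer one.)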
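The symptom above can be reproduced on a toy data frame, which may help when checking what R actually stored. This is only a sketch: df and its values are made up for illustration and are not the original pvnum_avgwinprice data, and V3 is built as a factor on purpose.

```r
# Hypothetical stand-in for pvnum_avgwinprice (values are made up);
# V3 is deliberately stored as a factor to reproduce the symptom.
df <- data.frame(
  V2 = c(5, 65, 6, 1, 3),
  V3 = factor(c("19.2", "457.5", "499.67", "302.75", "1102.33"))
)

class(df$V3)       # "factor"
typeof(df$V3)      # "integer", because factors are stored as integer codes
is.numeric(df$V3)  # FALSE

# Inspect the class of every column at once
col_classes <- sapply(df, class)

# as.numeric() on a factor yields the level codes, not the printed values
codes <- as.numeric(df$V3)
```

class() is usually more informative than typeof() here, since typeof() only reports the underlying storage mode.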
Many days later I ran into this problem again, and this time it was even more baffling!
> hist(cvr_distr$V2)
Error in hist.default(cvr_distr$V2) : 'x' must be numeric
> head(cvr_distr$V2)
[1] 0.007000999999999993 0.0072929959999999934 0.0075849919999999935 0.007876987999999993
[5] 0.008168983999999992 0.008460979999999991
501 Levels: 0.007000999999999993 0.0072929959999999934 0.0075849919999999935 ... 0.15299899999999972
These are clearly numbers, yet R reports an error! And after coercion the result is:
> head(as.double(cvr_distr$V1))
[1] 1 495 488 489 490 491
How to fix "'x' must be numeric":
http://szypanther.blog.hexun.com/70559965_d.html
as.numeric(as.character(x))
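I did not know the cause at the time. The "Levels" line in the output shows that the column is a factor, so as.double() returns the internal level codes instead of the printed values; converting to character first recovers the real numbers.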
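The conversion can be sketched on a small factor that resembles cvr_distr$V2. The names x, txt and d below are illustrative assumptions, not the original data. The second half shows one possible way to avoid the factor altogether by declaring the column types when reading.

```r
# Hypothetical factor column resembling cvr_distr$V2
x <- factor(c("0.007000999999999993",
              "0.0072929959999999934",
              "0.15299899999999972"))

# Two equivalent ways to recover the numeric values from a factor
v1 <- as.numeric(as.character(x))
v2 <- as.numeric(levels(x))[x]

# Avoid the factor at read time by declaring the column types
# (txt is a made-up two-column sample, read via the text= argument)
txt <- "a 0.007000999999999993\nb 0.0072929959999999934"
d <- read.table(text = txt, colClasses = c("character", "numeric"))
is.numeric(d$V2)  # TRUE
```

The levels(x) form converts each distinct level only once, which can matter for long vectors with few levels.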
Fitting the coefficients of a regression curve
> dat <- read.table("cvr.dat")
> dat
V1 V2
1 0.04997310 0.0260000
2 0.04094868 0.0155000
3 0.03326068 0.0105000
4 0.02671426 0.0060000
5 0.02212872 0.0049375
> f=function(x,a,b){a*x+b}
> result = nls(dat$V2~f(dat$V1,a,b), data=dat, start=list(a=1.5,b=0.01))
> result
Nonlinear regression model
model: dat$V2 ~ f(dat$V1, a, b)
data: dat
a b
0.75548 -0.01356
residual sum-of-squares: 1.148e-05
Number of iterations to convergence: 1
Achieved convergence tolerance: 7.426e-07
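Because f is linear in a and b, the same coefficients can also be obtained with lm(). The sketch below rebuilds dat from the five rows printed above (assuming those are the full contents of cvr.dat) and writes the nls() formula with bare column names, which is the more common style when data= is supplied.

```r
# The five rows printed above, assumed to be the whole of cvr.dat
dat <- data.frame(
  V1 = c(0.04997310, 0.04094868, 0.03326068, 0.02671426, 0.02212872),
  V2 = c(0.0260000, 0.0155000, 0.0105000, 0.0060000, 0.0049375)
)

# Ordinary least squares: V2 = a * V1 + b
fit_lm <- lm(V2 ~ V1, data = dat)
coef(fit_lm)   # intercept corresponds to b, slope to a

# Same model through nls(), referring to columns by name
f <- function(x, a, b) a * x + b
fit_nls <- nls(V2 ~ f(V1, a, b), data = dat, start = list(a = 1.5, b = 0.01))
coef(fit_nls)
```

Both fits should agree closely with the a and b reported in the output above; nls() becomes necessary only once f is nonlinear in its parameters.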
read.table reports an error when reading a file
> imp_clk_data <- read.table(file="imp_clk_data", sep='\t')
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
line 299850 did not have 8 elements
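I checked, and that line clearly does have 8 elements, yet R still reports the error. http://f.dataguru.cn/thread-1897-1-1.html
Adding one option here makes the problem go away: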
imp_clk_data <- read.table(file="imp_clk_data", blank.lines.skip=FALSE, sep='\t')
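When this kind of error shows up, count.fields() can help locate the lines that R splits differently from what is expected. The sketch below is a generic diagnostic, not the fix used above: the temporary file and its three-field layout are made up, and the guess that a quote or comment character is involved is only an assumption (such characters are a frequent cause of this error).

```r
# Hypothetical tab-separated file with three fields per line;
# the second line contains a '#', the default comment character.
path <- tempfile()
writeLines(c("a\tb\tc", "d\te#s\tf", "g\th\ti"), path)

# Field counts with the default quote and comment handling
n_default <- count.fields(path, sep = "\t")
which(n_default != 3)   # lines that do not split into 3 fields

# Field counts with quote and comment handling switched off
n_raw <- count.fields(path, sep = "\t", quote = "", comment.char = "")

# Read the file with the same settings
d <- read.table(path, sep = "\t", quote = "", comment.char = "")
```

For the file in this post, the expected count would be 8 instead of 3, and the line number from the error message tells you where to look first.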