1. 关闭绘图窗口
dev.off()
2. 删除已经绘制的图形元素
无法直接删除,只能通过其他方法遮盖,比如,把颜色设置为背景色。
例如:
plot(1,1,type="n");
text(1,1,"pinggu",col="green");
#remove
text(1,1,"pinggu",col="white");
3. 矩阵的列转化成factor
> colApply <- function(dat, cols = colnames(dat), func = as.factor) {
dat[cols] <- lapply(dat[cols], func)
return(dat)
}
> dat <- data.frame(x1 = rpois(20, 3), x2 = 1:20, x3 = sample(letters[1:4], 20, replace = T), stringsAsFactors = F)
> dat
x1 x2 x3
1 3 1 d
2 8 2 d
3 2 3 b
4 1 4 b
5 3 5 c
6 8 6 a
7 1 7 a
8 2 8 a
9 4 9 c
10 0 10 a
11 1 11 d
12 4 12 c
13 1 13 b
14 6 14 c
15 3 15 b
16 3 16 c
17 4 17 d
18 4 18 c
19 6 19 b
20 1 20 d
> str(colApply(dat, 'x1')) # x1 is a factor now
'data.frame': 20 obs. of 3 variables:
x1:Factorw/7levels"0","1","2","3",..:4732472351... x2: int 1 2 3 4 5 6 7 8 9 10 ...
x3:chr "d""d""b""b"...>str(colApply(dat,2,cumsum))′data.frame′: 20obs.of 3variables:x1: num 3 8 2 1 3 8 1 2 4 0 ...
x2:int 13610152128364555... x3: chr "d" "d" "b" "b" ...
> str(colApply(dat, c(1, 3), as.factor)) # x1 and x3 are factors while x2 is still integers
'data.frame': 20 obs. of 3 variables:
x1:Factorw/7levels"0","1","2","3",..:4732472351... x2: int 1 2 3 4 5 6 7 8 9 10 ...
$ x3: Factor w/ 4 levels "a","b","c","d": 4 4 2 2 3 1 1 1 3 1 ...
4. 打开EXCEL文件
安装RODBC包,然后:
library(RODBC)
odbcConnectExcel("路径\\文件名.xls")
5. 重新命名data.frame的行
rownames(x) <- 1:100
6. 求年份是否是闰年的函数
library(lubridate)
a<-function(x){
date<-as.POSIXct(x,format = "%Y-%m-%d", tz = "UTC")
if(as.numeric(format(date, "%Y"))%%400==0) print("YES, Leap year.")
else if(as.numeric(format(date, "%Y"))%%100==0) print("NO.")
else if(as.numeric(format(date, "%Y"))%%4==0) print ("YES, Leap year.")
else print("NO.")
}
a("2013-8-1")
#[1] "NO."
7. 用mean计算截断均值
x <- c(0:10, 50)
xm <- mean(x)
c(xm, mean(x, trim = 0.10))
trim就是截断,表示去掉最高和最低的10%。
8. 用ecdf求经验分布函数
c(0.01,0.02,0.23,0.24,0.35,0.45,0.67,0.89,0.90),用ecdf(x),可以得到经验分布函数。
9. 多元正态性检验
在mvnormtest包里,mshapiro.test函数
10. 做向量自回归(VAR)模型
用vars和urca包。
11. robust回归
使用robust和robustbase包。示例代码:
library(robustbase)
# method="Mqle" fits a generalized linear model using Mallows or Huber type robust estimators, as described in Cantoni and Ronchetti (2001).
glmrob0<-glmrob(cbind(NumDeath, NumTested-NumDeath)~log(Conc),family=binomial,data=set6,method="Mqle")
library(robust)
glmrob4<-glmRob(cbind(NumDeath, NumTested-NumDeath)~log(Conc),family=binomial(),data=set6)
12. 调用Matlab
install.packages("R.matlab")
library(R.matlab)
path <- system.file("mat-files", package="R.matlab")
mat <- readMat(file.path(path, "structLooped.mat"))
s <- mat$s
fields <- dimnames(s)[[1]]
cat("Field names: ", paste(fields, collapse=", "), "\n", sep="");
print(s)
13. 绘制圆形图
以下示例,仅使用graphics包绘制:
# Data for plotting
head(dfPlt)
# mnt hrl prv
# 北京 1650 16.9 <U+5317><U+4EAC>
# 上海 1620 14.0 <U+4E0A><U+6D77>
# 广东 1550 15.0 <U+5E7F><U+4E1C>
# 新疆 1520 15.2 <U+65B0><U+7586>
# 天津 1500 15.0 <U+5929><U+6D25>
# 江苏 1480 13.0 <U+6C5F><U+82CF>
# Function to add a fan-shape with text inside and outside inthe plot
pltFan <-
function(cnt.arc, arc, r.in, r.out, text.in =character(0), text.out = character(0), text.in.cex = 0.5, text.out.cex = text.in.cex,...)
{
# Fan-shape
polygon(c(r.in*cos(cnt.arc - arc/2),r.in*cos(cnt.arc + arc/2), r.out*cos(cnt.arc + arc/2), r.out*cos(cnt.arc -arc/2)), c(r.in*sin(cnt.arc - arc/2), r.in*sin(cnt.arc + arc/2),r.out*sin(cnt.arc + arc/2), r.out*sin(cnt.arc - arc/2)), ...)
# Text inside
text((r.in - par()cxy[2]∗text.in.cex)∗cos(cnt.arc),(r.in−par()cxy[2]*text.in.cex)*sin(cnt.arc),text.in, adj = c(0.5, 0.5), srt = cnt.arc/2/pi*360 - 90, cex = text.in.cex)
# Text outside
text((r.out + par()cxy[2]∗text.out.cex)∗cos(cnt.arc),(r.out+par()cxy[2]*text.out.cex)*sin(cnt.arc),text.out, adj = c(0.5, 0.5), srt = cnt.arc/2/pi*360 - 90, cex = text.out.cex)
}
# Number of records to plot
iNbar <- nrow(dfPlt)
# Optionally, map additional font - Windows only
windowsFonts(KT = windowsFont("KaiTi"))
# Plot in .png format
png("salary.png", width = 960, height = 960)
# Maximizing the plot area
par(mar = c(0, 0, 0, 0))
# Call plot.new with a few settings
plot(c(-2, 2), c(-2, 2), type = "n", axes = F, xlab ="", ylab = "", asp = 1, xlim = c(-2, 2), ylim = c(-2, 2))
# Do the plot
sapply( 0:(iNbar - 1), function(i)
{
# Text inside
pltFan(cnt.arc = pi/2 - (i*2*pi/(iNbar + 1) +2*pi/(iNbar + 1)*(1/4 + 1/8 + 1/4)/2), arc = 2*pi/(iNbar + 1)*(1/4 + 1/8 +1/4), r.in = 1, r.out = 2, text.in = dfPlt[i + 1, "prv"], text.out =NULL, col = NULL, border = NA, text.in.cex = 1)
# Monthly data
pltFan(cnt.arc = pi/2 - (i*2*pi/(iNbar + 1) +2*pi/(iNbar + 1)/8), arc = 2*pi/(iNbar + 1)/4, r.in = 1, r.out =dfPlt[i + 1, "mnt"]/1000, text.in = NULL, text.out =as.character(dfPlt[i + 1, "mnt"]), text.out.cex = 1, col ="orange", border = NA)
# Hourly data
pltFan(cnt.arc = pi/2 - (i*2*pi/(iNbar + 1) +2*pi/(iNbar + 1)*(2 + 1 + 1)/8), arc = 2*pi/(iNbar + 1)/4, r.in = 1,r.out = dfPlt[i + 1, "hrl"]/20 + 1, text.out.cex = 1, text.in = NULL,text.out = as.character(dfPlt[i + 1, "hrl"]), col ="yellow", border = NA)
} )
# Legend and other texts in the plot
legend(2, -1.5, c("月最低工资标准","小时最低工资标准"), col = c("orange","yellow"), bty = "n", cex = 1.25, pt.cex = 1.2, pch = c(15,15), xjust = 1, yjust = 1) -> lgnd
# If no font mapping is done, remove the family option
text(lgndrectleft, lgndrecttop - lgndrecth, "注:数字为截至2014年\n5月1日的执行标准",adj = c(0, 1), cex = 1.25, family = "KT")
text(-2.1, 1.75, "全国各省(区、直辖市)\n2014年最低工资标准",cex = 1.75, adj = c(0, 1))
text(-2.1, 1.75 - par()$cxy[2]*4, "(单位:元)",cex = 1.25, adj = c(0, 1))
dev.off()
Result:
14. 循环填充解释
x<-c(1,4,3,6,5,4,9)
x[c(1,3,5)]
#[1] 1 3 5
x[c(TRUE,FALSE)]
#[1] 1 3 5 9
x[c(TRUE,FALSE,FALSE)]
#[1] 1 6 9
x[c(TRUE, FALSE)]里面的长度不够长(没有x的长度大),所以会循环它。其实就是x[TRUE, FALSE, TRUE, FALSE, ...]一直到长度够长为止。
15. 找出两个数组中相同的行的行号
求R的一个函数能够找出两个数据中相同的行
A
1,2
3,4
1,4
1,5
3,5
B
1,2
1,3
1,4
3,5
3,6
希望能够将A,B中相等的
1,2
1,4
3,5
的行号找出来,因为数据量巨大,不能使用循环。
直接使用
A==B