We use a single CSV file of about 2.2 GB.
1. fread from the data.table package (many posts say this is the fastest method, so let's compare)
library(data.table)
start <- Sys.time()
dtc <- fread("C:/Users/10530/Desktop/DTc/DtcDrugTargetInteractions.csv", sep = ",", stringsAsFactors = FALSE, na.strings = "", data.table = TRUE)
end <- Sys.time()
print(end-start)
fread also shows a progress bar while reading. Elapsed time: 1.423824 minutes
2. The read.* functions (here we use read.csv as the example)
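A note on the timing idiom used here: subtracting two Sys.time() values yields a difftime object whose unit is chosen automatically, which is why some results in this post print in minutes and others in seconds. The sketch below shows one way to force a single unit. The two timestamps are made-up values for illustration (85.42944 seconds is just 1.423824 minutes converted), not a new measurement.

```r
# Hypothetical timestamps, chosen only to illustrate unit handling.
start <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")
end <- start + 85.42944  # adding a number to POSIXct adds seconds

d <- end - start
print(units(d))  # the unit is picked automatically from the size of the gap

# Convert to a plain number of seconds so runs can be compared directly.
elapsed_sec <- as.numeric(d, units = "secs")
print(elapsed_sec)

# Alternatively, request the unit up front.
d_secs <- difftime(end, start, units = "secs")
print(d_secs)
```

Reporting every run in seconds makes the three methods easier to compare at a glance.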
start <- Sys.time()
dtc <- read.csv("C:/Users/10530/Desktop/DTc/DtcDrugTargetInteractions.csv", sep = ",", stringsAsFactors = FALSE, na.strings = "")
end <- Sys.time()
print(end-start)
Elapsed time: 2.003878 minutes
3. Some people say that once the data is saved as Rdata, loading it again is much faster. Let's try it
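As an alternative to bracketing the call with Sys.time(), base R's system.time() can wrap the expression directly. This is only a minimal sketch: the small temporary CSV and the names csv_path and demo are placeholders for illustration, not the 2.2 GB file used above, so the number it prints says nothing about the real benchmark.

```r
# Hypothetical small CSV written to a temp file; stands in for the real data.
csv_path <- tempfile(fileext = ".csv")
demo <- data.frame(id = 1:1000, value = rnorm(1000))
write.csv(demo, csv_path, row.names = FALSE)

# system.time() evaluates the expression and reports user, system
# and elapsed times in seconds.
timing <- system.time(
  df <- read.csv(csv_path, stringsAsFactors = FALSE, na.strings = "")
)

elapsed_sec <- timing[["elapsed"]]
print(elapsed_sec)
```

The elapsed component is the wall-clock time, which is the figure comparable to the Sys.time() differences reported in this post.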
save(dtc, file = "dtc.Rdata")
start <- Sys.time()
load("dtc.Rdata")
end <- Sys.time()
print(end-start)
Elapsed time: 6.042526 seconds
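One detail worth knowing about this approach: load() restores objects under the names they had when saved, so no assignment is needed. The sketch below checks the round trip on a tiny made-up data frame (demo and rdata_path are hypothetical names used only for illustration).

```r
# Hypothetical small data frame; stands in for the large dtc object.
rdata_path <- file.path(tempdir(), "demo.Rdata")
demo <- data.frame(id = 1:5,
                   name = c("a", NA, "c", "d", "e"),
                   stringsAsFactors = FALSE)

save(demo, file = rdata_path)

# Keep a copy for comparison, then remove the original from the workspace.
original <- demo
rm(demo)

# load() returns (invisibly) the names of the objects it restored.
loaded_names <- load(rdata_path)
print(loaded_names)

# The restored object matches what was saved, including the NA.
print(identical(demo, original))
```

Note that the timing in this section covers only load(); the one-time cost of the initial read and of save() is not included.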