R手册(Import)--for csv, xlsx and json


R语言导入常用文件的包

  • readr: 可读取文本型文件
  • readxl: 读取excel文件
  • openxlsx: 可读取、写入和编辑excel文件
  • jsonlite: 读取json文件

readr: Read Rectangular Text Data

Save Data

保存类型函数
逗号分隔文件
base::write.csv()
write_csv(x, path, na = “NA”, append = FALSE,col_names = !append)
任意符号分隔write_delim(x, path, delim = " ", na = “NA”,append = FALSE, col_names = !append)
CSV for excelwrite_excel_csv(x, path, na = “NA”, append =FALSE, col_names = !append)
String to filewrite_file(x, path, append = FALSE)
String vector to file, one element per linewrite_lines(x,path, na = “NA”, append = FALSE)
RDS文件write_rds(x, path, compress = c(“none”, “gz”,“bz2”,“xz”), …)
Tab分隔write_tsv(x, path, na = “NA”, append = FALSE,col_names = !append)

Read Tabular Data

read_*(file, col_names = TRUE, col_types = NULL, locale = default_locale(), na = c("", "NA"),
  quoted_na = TRUE, comment = "", trim_ws = TRUE, skip = 0, n_max = Inf, guess_max = min(1000,n_max), 
  progress = interactive())
# read_* 函数共享这些参数

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-EwdSKWKc-1571388720022)(/images/readr.png)]

规整类型函数(示例)
Comma Delimited Files(逗号分隔)read_csv(“file.csv”)
Semi-colon Delimited Files(分号分隔)read_csv2(“file2.csv”)
Files with Any Delimiter(任意分隔符)read_delim(“file.txt”, delim = “|”)
Fixed Width Files(固定宽度)read_fwf(“file.fwf”, col_positions = c(1, 3, 5))
Tab Delimited Files(Tab分隔符\t)read_tsv(“file.tsv”) Also read_table().

常用参数
col_names: logic or string vector, 列名
skip: int, 跳过第n行
na: 指定用于表示文件中缺失值的值
comment = “#”: 删除以#开头的所有行
progress = show_progress(): 进度条:(“none”/“text”/“tk”/“win”)

write_file("a,b,c\n1,2,3\n4,5,NA","file.csv")
f <- "file.csv"           #Example file

read_csv(f, col_names = FALSE)     #No header
read_csv(f, col_names = c("x", "y", "z"))   #Provide header
read_csv(f, skip = 1)      #Skip lines
read_csv(f, n_max = 1)     #Read in a subset
read_csv(f, na = c("1", "."))  #Missing Values

Read Non-Tabular Data

非规整类型函数
read_file(file, locale = default_locale())读入单独的字符串中
read_lines(file, skip = 0, n_max = -1L, na = character(),
locale = default_locale(), progress = interactive())
每行读入一个字符串
read_file_raw(file)读入原始向量中
read_lines_raw(file, skip = 0, n_max = -1L,progress = interactive())每行读入一个原始向量
read_log(file, col_names = FALSE, col_types = NULL,
skip = 0, n_max = -1, progress = interactive())
读取Apache日志文件

Data types

readr 函数猜测每一列适合的数据类型并且转化,但不会将字符串类型转化为factor
col_types参数对列数据类型的解析(示例):

col_types=cols(
  age = col_integer(),
  sex = col_character(),
  earn = col_double()
)
  1. problems()函数去判断
    x <- read_csv("file.csv"); problems(x)

  2. 可以col_ function函数引导解析,也可以字符格式读入,然后用parse_ function 函数解析

col_ functionparse_ function
col_guess() the defaultparse_guess()
col_character()parse_character()
col_double(), col_euro_double()parse_double()
col_datetime(format = “”)
col_date(format = “”), col_time(format = “”)
parse_datetime()
parse_date() and parse_time()
col_factor(levels, ordered = FALSE)parse_factor()
col_integer()parse_integer()
col_logical()parse_logical()
col_number(), col_numeric()parse_number()
col_skip()不解析
x <- read_csv("file.csv", col_types = cols(
  A = col_double(),
  B = col_logical(),
  C = col_factor()))

x$A <- parse_number(x$A)

readxl: for excel

# 读取excel文件
read_excel(path, sheet = NULL, range = NULL, col_names = TRUE,
  col_types = NULL, na = "", trim_ws = TRUE, skip = 0, n_max = Inf,
  guess_max = min(1000, n_max))
read_xls()
read_xlsx()

# 列出excel文件中的所有sheet
excel_sheets(path) 

读取单元格的指定范围:以A1:D10样式

cellranger::cell_limits()
cellranger::cell_rows()
cellranger::cell_cols()
cellranger::anchored()

read_excel(path, range = "B3:D6")
read_excel(path, range = cell_rows(3:6))
read_excel(path, range = cell_cols("C:D"))
read_excel(path, range = cell_cols(2))
read_excel(path, range = cell_limits(c(4, 3), c(NA, NA)))

openxlsx: xlsx reading, writing and editing

openxlsx包括读写及编辑excel文件,集成极为丰富的函数。

read.xlsx(xlsxFile, sheet = 1, startRow = 1, colNames = TRUE,
  rowNames = FALSE, detectDates = FALSE, skipEmptyRows = TRUE,
  skipEmptyCols = TRUE, rows = NULL, cols = NULL, check.names = FALSE,
  namedRegion = NULL, na.strings = "NA", fillMergedCells = FALSE)`

write.xlsx(x, file, asTable = FALSE, ...)

jsonlite: for json

  1. 读入/写出JSON
    read_json(path, simplifyVector = FALSE, ...)
    write_json(x, path, ...)
  2. JSON和R对象之间转换
    fromJSON(txt)
    toJSON(x)
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值