R read_html: reading plain-text files in R with the read.table() and scan() functions

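read.table is the R function for reading files in table format; it creates a data frame from the file. The header, sep, quote, dec and other arguments customize how the file is read. The read_html function, by contrast, parses HTML content. Together, the two functions offer flexible input options for text data in different formats.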

read.table {utils}

R Documentation

Data Input

Description

Reads a file in table format and creates a data frame from it, with cases corresponding to lines and variables to fields in the file.

Usage

read.table(file, header = FALSE, sep = "", quote = "\"'",
           dec = ".", row.names, col.names,
           as.is = !stringsAsFactors,
           na.strings = "NA", colClasses = NA, nrows = -1,
           skip = 0, check.names = TRUE, fill = !blank.lines.skip,
           strip.white = FALSE, blank.lines.skip = TRUE,
           comment.char = "#",
           allowEscapes = FALSE, flush = FALSE,
           stringsAsFactors = default.stringsAsFactors(),
           fileEncoding = "", encoding = "unknown")

read.csv(file, header = TRUE, sep = ",", quote = "\"", dec = ".",
         fill = TRUE, comment.char = "", ...)

read.csv2(file, header = TRUE, sep = ";", quote = "\"", dec = ",",
          fill = TRUE, comment.char = "", ...)

read.delim(file, header = TRUE, sep = "\t", quote = "\"", dec = ".",
           fill = TRUE, comment.char = "", ...)

read.delim2(file, header = TRUE, sep = "\t", quote = "\"", dec = ",",
            fill = TRUE, comment.char = "", ...)
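The wrappers in the usage above differ from read.table only in their defaults. A minimal sketch of that relationship follows; the temporary file and its contents are made up for illustration.

```r
# Write a small comma-separated file to a temporary location (illustrative data).
path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,3.5", "2,4.25"), path)

# read.csv() presets header = TRUE, sep = ",", and so on.
a <- read.csv(path)

# The same call with read.table(), spelling out the read.csv() defaults.
b <- read.table(path, header = TRUE, sep = ",", quote = "\"",
                dec = ".", fill = TRUE, comment.char = "")

same <- identical(a, b)
print(same)

unlink(path)
```

If the two data frames compare as identical, choosing between read.table and a wrapper is purely a matter of which defaults suit the file.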

Arguments

file

the name of the file which the data are to be read from. Each row of the table appears as one line of the file. If it does not contain an absolute path, the file name is relative to the current working directory, getwd(). As from R 2.10.0 this can be a compressed file (see file).

Alternatively, file can be a readable text-mode connection. (If stdin() is used, terminate input with a blank line or an EOF signal, Ctrl-D on Unix and Ctrl-Z on Windows. Any pushback on stdin() will be cleared before return.)

file can also be a complete URL.
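As a small illustration of passing a connection instead of a file name, the sketch below reads from a textConnection. The in-memory lines are invented for the example; a file path or URL would be passed in the same position.

```r
# An in-memory text-mode connection standing in for a file (illustrative data).
con <- textConnection(c("x y", "1 2", "3 4"))

# read.table() accepts the connection where a file name would go.
df <- read.table(con, header = TRUE)

# This connection was already open when passed in, so close it ourselves.
close(con)

print(df)
```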

header

a logical value indicating whether the file contains the names of the variables as its first line. If missing, the value is determined from the file format: header is set to TRUE if and only if the first row contains one fewer field than the number of columns.
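The automatic header rule can be seen with a file whose first line has one fewer field than the data lines. This is a sketch with made-up data; note that header is deliberately not supplied.

```r
# First line has 2 fields, data lines have 3 (illustrative data).
path <- tempfile()
writeLines(c("height weight",
             "a 1.70 65",
             "b 1.82 80"), path)

# header is left missing, so read.table() infers it from the field counts.
df <- read.table(path)

print(names(df))     # variable names taken from the first line
print(rownames(df))  # first column of the data used as row names

unlink(path)
```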

sep

the field separator character. Values on each line of the file are separated by this character. If sep = "" (the default for read.table) the separator is ‘white space’, that is one or more spaces, tabs, newlines or carriage returns.
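To make the difference between the white-space default and an explicit separator concrete, here is a short sketch. Both inputs are invented for the example.

```r
path <- tempfile()

# Default sep = "": any run of spaces or tabs separates fields.
writeLines(c("1   2\t3", "4 5    6"), path)
ws <- read.table(path)
print(dim(ws))

# Explicit sep = ",": every comma counts, so an empty field is kept as missing.
writeLines(c("1,,3", "4,5,6"), path)
cs <- read.table(path, sep = ",")
print(cs)

unlink(path)
```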

quote

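the set of quoting characters. To disable quoting altogether, use quote = "". See scan for the behaviour on quotes embedded in quotes. Quoting is only considered for columns read as character, which is all of them unless colClasses is specified.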
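Two small cases show why the quote argument matters: a quoted field that contains a space, and a name containing an apostrophe that should not be treated as a quote. The data are illustrative.

```r
path <- tempfile()

# With the default quote set, 'Ann Lee' is read as a single field.
writeLines(c("name note", "'Ann Lee' ok", "Bob fine"), path)
quoted <- read.table(path, header = TRUE)
print(as.character(quoted$name))

# An apostrophe inside a value: disable quoting so it is read literally.
writeLines(c("name score", "O'Brien 3", "Smith 4"), path)
literal <- read.table(path, header = TRUE, quote = "")
print(as.character(literal$name))

unlink(path)
```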

dec

the character used in the file for decimal points.
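For instance, a file that uses a comma as the decimal mark and a semicolon as the separator (the read.csv2 convention) could be read as sketched below. The values are made up.

```r
# Semicolon-separated fields with decimal commas (illustrative data).
path <- tempfile()
writeLines(c("x;y", "1,5;2,25", "3,0;4,75"), path)

# dec = "," tells read.table() how to interpret the numbers.
df <- read.table(path, header = TRUE, sep = ";", dec = ",")
print(df)
print(sapply(df, class))

unlink(path)
```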

row.names

a vector of row names. This can be a vector giving the actual row names, or a single number giving the column of the table which contains the row names, or character string giving the name of the table column containing the row names.

If there is a header and the first row contains one fewer field than the number of columns, the first column in the input is used for the row names. Otherwise if row.names is missing, the rows are numbered.

Using row.names = NULL forces row numbering. Missing or NULL row.names generate row names that are considered to be ‘automatic’ (and not preserved by as.matrix).
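The three ways of specifying row.names described above can be compared side by side. This is a sketch on an invented two-row file.

```r
path <- tempfile()
writeLines(c("id,value", "a,10", "b,20"), path)

# Row names taken from a column given by position ...
by_pos  <- read.table(path, header = TRUE, sep = ",", row.names = 1)
# ... or by name.
by_name <- read.table(path, header = TRUE, sep = ",", row.names = "id")
# row.names = NULL forces automatic numbering and keeps "id" as a data column.
auto    <- read.table(path, header = TRUE, sep = ",", row.names = NULL)

print(rownames(by_pos))
print(rownames(auto))
print(names(auto))

unlink(path)
```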

col.names

a vector of optional names for the variables. The default is to use "V" followed by the column number.
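A quick sketch of the default naming and of overriding it with col.names; the file has no header line and the names a, b, c are arbitrary.

```r
path <- tempfile()
writeLines(c("1 2 3", "4 5 6"), path)

# Without a header or col.names, columns are named V1, V2, ...
default_names <- names(read.table(path))

# Supplying col.names replaces the defaults.
custom_names <- names(read.table(path, col.names = c("a", "b", "c")))

print(default_names)
print(custom_names)

unlink(path)
```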

as.is

the default behavior of read.table is to convert character variables (which are not converted to logical, numeric or complex) to factors. The variable as.is controls the conversion of columns not otherwise specified by colClasses. Its value is either a vector of logicals (values are recycled if necessary), or a vector of numeric or character indices which specify which columns should not be converted to factors.

Note: to suppress all conversions including those of numeric columns, set colClasses = "character".

Note that as.is is specified per column (not per variable) and so includes the column of row names (if any) and any columns to be skipped.
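The effect of as.is on individual columns can be sketched as follows. The data are invented, and as.is is given explicitly so that the result does not depend on the session's stringsAsFactors default (which is FALSE from R 4.0.0 onward).

```r
path <- tempfile()
writeLines(c("name grp n", "ann x 1", "bob y 2"), path)

# Keep "name" as character; other character columns are converted to factors.
df <- read.table(path, header = TRUE, as.is = "name")
print(sapply(df, class))

# as.is = TRUE keeps every character column as character.
plain <- read.table(path, header = TRUE, as.is = TRUE)
print(sapply(plain, class))

unlink(path)
```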

na.strings

a character vector of strings which are to be interpreted as NA values.
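As an illustration, a file that marks missing values with "-" or the sentinel -999 (both placeholders chosen for this example) could be read like this:

```r
path <- tempfile()
writeLines(c("x y", "1 -", "NA 3", "-999 4"), path)

# Each listed string is read as a missing value.
df <- read.table(path, header = TRUE, na.strings = c("NA", "-", "-999"))
print(df)
print(is.na(df))

unlink(path)
```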

colClasses

character. A vector of classes to be assumed for the columns. Recycled as necessary, or if the character vector is named, unspecified values are taken to be NA.
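A common reason to set colClasses is to stop identifiers with leading zeros from being read as numbers. The sketch below uses an invented zip-code column to compare the guessed types with an unnamed, a named, and a recycled colClasses vector.

```r
path <- tempfile()
writeLines(c("zip n", "02134 1", "10001 2"), path)

# Left to guess, read.table() reads zip as a number and drops the leading zero.
guess <- read.table(path, header = TRUE)

# One class per column, in order.
typed <- read.table(path, header = TRUE,
                    colClasses = c("character", "integer"))

# Named vector: only zip is fixed, n is still determined automatically.
named <- read.table(path, header = TRUE, colClasses = c(zip = "character"))

# A single value is recycled: every column is kept as character.
all_chr <- read.table(path, header = TRUE, colClasses = "character")

print(sapply(guess, class))
print(typed$zip)
print(sapply(named, class))
print(sapply(all_chr, class))

unlink(path)
```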
