Datacamp: Introduction to Importing Data in R

Miyuki酱

Last modified 2022-08-23 12:29:14

First published 2022-08-17 23:12:53

Article link: https://blog.csdn.net/weixin_51825567/article/details/126380986

Chapter 1

1. Introduction & read.csv

Defaults: (1) header = TRUE

(2) sep = ","

eg. (1) Use read.csv() to import "swimming_pools.csv" as a data frame named pools. (2) Print the structure of pools using str().

# Import swimming_pools.csv: pools
pools <- read.csv("swimming_pools.csv")

# Print the structure of pools
str(pools)

stringsAsFactors: If stringsAsFactors is set to FALSE, the data frame columns that correspond to strings in your text file are imported as character vectors. If it is set to TRUE, strings are imported as factors. (The default was TRUE before R 4.0.0 and has been FALSE since.)

# Import swimming_pools.csv correctly: pools
pools <- read.csv("swimming_pools.csv", stringsAsFactors = FALSE)

# Check the structure of pools
str(pools)
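
To see the effect of stringsAsFactors without the course file, here is a minimal sketch that writes a small CSV to a temporary file and imports it both ways. The file contents and the column names name and capacity are made up for illustration and are not taken from swimming_pools.csv.

```r
# Hypothetical sample data written to a temporary file (not the course's file)
tmp <- tempfile(fileext = ".csv")
writeLines(c("name,capacity",
             "Acacia Ridge,25",
             "Bellbowrie,30"), tmp)

# Strings stay character vectors
pools_chr <- read.csv(tmp, stringsAsFactors = FALSE)
class(pools_chr$name)   # "character"

# Strings are converted to factors
pools_fct <- read.csv(tmp, stringsAsFactors = TRUE)
class(pools_fct$name)   # "factor"

# Remove the temporary file
unlink(tmp)
```

Numeric columns such as capacity are not affected by this argument; only string columns change type.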

2. read.delim & read.table

(1) read.delim (reads .txt files)

Default: (1) header = TRUE (the first row contains the field names).

(2) sep = "\t" (fields in a record are delimited by tabs)

eg1. (1) Import the data in "hotdogs.txt" with read.delim() and call the resulting data frame hotdogs. The variable names are not on the first line, so set header = FALSE. (2) Call summary() on hotdogs to print summary statistics for every variable in the data frame.

# Import hotdogs.txt: hotdogs
hotdogs <- read.delim("hotdogs.txt", header = FALSE)

# Summarize hotdogs
summary(hotdogs)
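
When header = FALSE, R names the columns V1, V2, and so on. The sketch below shows this on a small tab-delimited file created on the fly; the rows and the names passed to col.names are hypothetical and only stand in for hotdogs.txt.

```r
# Hypothetical tab-delimited data with no header row
tmp <- tempfile(fileext = ".txt")
writeLines(c("Beef\t186\t495",
             "Poultry\t129\t430"), tmp)

# Without a header row, columns get the default names V1, V2, V3
no_header <- read.delim(tmp, header = FALSE)
names(no_header)

# col.names (passed on to read.table) supplies names at import time
named <- read.delim(tmp, header = FALSE,
                    col.names = c("type", "calories", "sodium"))
names(named)

# Remove the temporary file
unlink(tmp)
```

Leaving header at its default of TRUE here would silently use the first record as column names and drop it from the data.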

(2) read.table (deal with more exotic flat file formats)

By default, read.table() uses header = FALSE and sep = "" (fields are separated by any whitespace: one or more spaces, tabs, or newlines).
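
As a rough illustration of how these defaults relate to read.delim(), the sketch below reads the same small tab-delimited file with both functions and then reads whitespace-separated text with the default sep = "". The data is invented for the example; for files that contain quotes or # characters the two functions can behave differently, because read.delim() also changes other defaults.

```r
# Hypothetical tab-delimited data with a header row
tmp <- tempfile(fileext = ".txt")
writeLines(c("type\tcalories",
             "Beef\t186",
             "Poultry\t129"), tmp)

# read.delim() sets header = TRUE and sep = "\t" for you
via_delim <- read.delim(tmp)

# read.table() needs both arguments spelled out
via_table <- read.table(tmp, header = TRUE, sep = "\t")

# For a simple file like this one the results should match
all.equal(via_delim, via_table)

# With the default sep = "", any run of whitespace separates fields
ws <- read.table(text = "Beef 186\nPoultry   129")
dim(ws)

# Remove the temporary file
unlink(tmp)
```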

eg2. (1) The data is still hotdogs.txt. It has no column names in the first row, and the field separators are tabs. This time, though, the file is in the data folder inside your current working directory. A variable path with the location of this file is already coded for you. (2) Call head() on hotdogs; this will print the first 6 observations in the data frame.