R - Data structure

Study notes (part 1) on Advanced R by Hadley Wickham, the well-known R developer

Original book: https://adv-r.hadley.nz/index.html

 
Data structure
 
 

Vectors

Vectors come in two flavours: atomic vectors and lists.
All elements of an atomic vector must be the same type, whereas the elements of a list can have different types.
 
Three common properties:
  • Type, typeof(), what it is.
  • Length, length(), how many elements it contains.
  • Attributes, attributes(), additional arbitrary metadata.
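
A small sketch of the three properties on a single vector. The variable name v and the attribute name "note" are made up for illustration.

```r
# A double vector with one custom attribute (the name "note" is arbitrary).
v <- c(1.5, 2.5, 3.5)
attr(v, "note") <- "example metadata"

v_type   <- typeof(v)      # storage type of the vector
v_length <- length(v)      # number of elements
v_attrs  <- attributes(v)  # all attributes, returned as a named list

print(v_type)
print(v_length)
print(v_attrs)
```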
 
Atomic vectors
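Four common types: logical, integer, double (often called numeric), and character.
Test types: is.logical(), is.integer(), is.double(), is.character(), is.atomic(); is.numeric() is TRUE for both integer and double vectors.
Coercion: combining a character and an integer yields a character vector.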
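
The coercion rule can be checked directly. This is a minimal sketch; the variable names are illustrative only.

```r
# Combining types coerces to the most flexible one:
# logical < integer < double < character.
mixed_chr <- c("a", 1L)     # the integer becomes a character string
mixed_int <- c(TRUE, 2L)    # the logical becomes an integer
mixed_dbl <- c(1L, 2.5)     # the integer becomes a double

print(typeof(mixed_chr))
print(typeof(mixed_int))
print(typeof(mixed_dbl))

# is.numeric() accepts both integer and double vectors,
# while is.integer() is strict about the storage type.
print(is.numeric(1L))
print(is.numeric(1.0))
print(is.integer(1.0))
```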
 
Lists
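c() combines several lists into one; given a mix of lists and atomic vectors, it coerces the vectors to lists before combining them. list() keeps each argument as a separate element, so lists become nested.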
> c(list(1, 2), c(3:5))
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
[[4]]
[1] 4
[[5]]
[1] 5
 
> b = list(list(1, 2), c(3:5))
> b
[[1]]
[[1]][[1]]
[1] 1
[[1]][[2]]
[1] 2
[[2]]
[1] 3 4 5
> str(b)
List of 2
$ :List of 2
  ..$ : num 1
  ..$ : num 2
$ : int [1:3] 3 4 5
 
Attributes
For atomic vectors:
Attributes are used to store metadata about the object. Attributes can be thought of as a named list (with unique names). Attributes can be accessed individually with attr() or all at once (as a list) with attributes().
 
By default, most attributes are lost when modifying a vector.
The only attributes not lost are the three most important: 
  • Names, a character vector giving each element a name, described in names.
  • Dimensions, used to turn vectors into matrices and arrays, described in matrices and arrays.
  • Class, used to implement the S3 object system, described in S3.
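
A short sketch of a custom attribute being dropped while names survive subsetting. The attribute name my_attribute is hypothetical.

```r
# Attach a custom attribute; "my_attribute" is an arbitrary name.
x <- structure(1:3, my_attribute = "This is a vector")
print(attr(x, "my_attribute"))

# Subsetting and summarising both return objects without the custom attribute.
print(attributes(x[1]))
print(attributes(sum(x)))

# Names, by contrast, are kept when subsetting.
y <- c(a = 1, b = 2, c = 3)
print(names(y[1:2]))
```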
 
You can name a vector in three ways:
  • When creating it: x <- c(a = 1, b = 2, c = 3).
  • By modifying an existing vector in place: x <- 1:3; names(x) <- c("a", "b", "c").
Or: x <- 1:3; names(x)[[1]] <- c("a").
  • By creating a modified copy of a vector: x <- setNames(1:3, c("a", "b", "c")).
 
You can create a new vector without names using unname(x), or remove names in place with names(x) <- NULL.
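
The difference between unname() and names(x) <- NULL can be sketched as follows, using only base R.

```r
x <- setNames(1:3, c("a", "b", "c"))

# unname() returns a stripped copy; the original keeps its names.
y <- unname(x)
names_after_unname <- names(x)
print(names_after_unname)
print(names(y))

# Assigning NULL removes the names from x itself.
names(x) <- NULL
print(names(x))
```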
 
Factors
Factors are built on top of integer vectors using two attributes: the class ("factor") and the levels.
> x <- factor(c("a", "b", "b", "a"))
> typeof(x)
[1] "integer"
> class(x)
[1] "factor"
Although a factor is stored as an integer vector, is.integer() and is.numeric() return FALSE for it: both functions deliberately exclude factors, whereas is.atomic() only looks at the underlying type.
> is.integer(x)
[1] FALSE
> is.atomic(x)
[1] TRUE
factor = integer + 2 attributes.
A factor has integer type but no longer behaves like a plain integer vector. Factors are still atomic vectors.
 
> library(magrittr)  # provides the %>% and %<>% operators used below
> f1 <- factor(c("a","b","c","d"))
> f1
[1] a b c d
Levels: a b c d
> f1 %>% as.integer
[1] 1 2 3 4
 
# When we only change levels, the integers behind factors remain unchanged!
> levels(f1) %<>% rev()
> f1
[1] d c b a
Levels: d c b a
> f1 %>% as.integer()
[1] 1 2 3 4
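
For comparison, here is a base R sketch (no magrittr needed) contrasting relabelling with reordering. Assigning to levels() renames the labels and leaves the integer codes alone, while rebuilding the factor with factor(f, levels = ...) keeps the labels and changes the codes. The variable names are illustrative.

```r
f <- factor(c("a", "b", "c", "d"))

# Relabelling: the codes stay 1 2 3 4, but each code now maps to a new label.
relabelled <- f
levels(relabelled) <- rev(levels(relabelled))
print(as.character(relabelled))
print(as.integer(relabelled))

# Reordering: the labels stay a b c d, but the codes follow the new level order.
reordered <- factor(f, levels = rev(levels(f)))
print(as.character(reordered))
print(as.integer(reordered))
```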
 
Many data loading functions convert character vectors to factors automatically (this was the default before R 4.0.0), so pass the argument stringsAsFactors = FALSE to suppress the conversion.
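
A minimal sketch of the effect with read.csv(), reading from an inline string so that no file is needed. The column names id and label are made up.

```r
csv_text <- "id,label\n1,a\n2,b\n3,a"

# Setting the argument explicitly gives the same result on any R version.
df_chr <- read.csv(text = csv_text, stringsAsFactors = FALSE)
df_fct <- read.csv(text = csv_text, stringsAsFactors = TRUE)

print(class(df_chr$label))
print(class(df_fct$label))
print(levels(df_fct$label))
```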
 
Matrices and arrays
atomic vector + dim attribute = multi-dimensional array.
An array with 2 dimensions is a matrix.
 
 
Vectors     Matrices                     Arrays
length()    nrow(), ncol()               dim()
names()     rownames(), colnames()       dimnames()
c()         cbind(), rbind()             abind::abind()
            t()                          aperm()
            is.matrix(), as.matrix()     is.array(), as.array()
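
A brief sketch of how setting dim turns a vector into a matrix, plus a few of the base R matrix and array functions (t() for matrices, aperm() for arrays). abind is left out because it lives in a separate package.

```r
# Setting the dim attribute turns a plain vector into a matrix.
m <- 1:6
dim(m) <- c(2, 3)            # 2 rows, 3 columns, filled column by column
print(is.matrix(m))
print(c(nrow(m), ncol(m)))

rownames(m) <- c("r1", "r2")
colnames(m) <- c("A", "B", "C")
print(dim(t(m)))             # transposing swaps the two dimensions

# A 3-dimensional array; aperm() generalises t() and by default
# reverses the order of the dimensions.
a <- array(1:24, dim = c(2, 3, 4))
print(dim(aperm(a)))
print(dim(aperm(a, c(2, 1, 3))))
```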
 
Data frames
A data frame is a list of equal-length vectors.
typeof() of a data frame is "list".
class() of a data frame is "data.frame".
 
Creating a data frame:
df <- data.frame(  ## Before R 4.0.0, this function converted strings to factors by default.
x = 1:3,
y = c("a", "b", "c"),
stringsAsFactors = FALSE)
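
Because a data frame is a list of columns, the usual list tools work on it. The following sketch rebuilds the same data frame so that it runs on its own.

```r
df <- data.frame(
  x = 1:3,
  y = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

print(typeof(df))   # the underlying type
print(class(df))    # the S3 class
print(length(df))   # length of the underlying list, i.e. the number of columns
print(names(df))    # names of the list elements, i.e. the column names
print(df[["y"]])    # list-style extraction of one column
```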
 
When combining column-wise with cbind(), the number of rows must match, but row names are ignored.
When combining row-wise with rbind(), the number and names of columns must match.
Use plyr::rbind.fill() to combine data frames that don't have the same columns.
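
A small sketch of both rules using base R only. The data frames and the column name w are invented for illustration, and the mismatched rbind() call is wrapped in tryCatch() so the script keeps running.

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

# Column-wise: the new columns must have the same number of rows.
wide <- cbind(df, data.frame(z = 3:1))
print(dim(wide))

# Row-wise: the new rows must have the same column names.
tall <- rbind(df, data.frame(x = 10L, y = "z", stringsAsFactors = FALSE))
print(dim(tall))

# Mismatched column names make rbind() fail; the error is caught here.
mismatch <- tryCatch(
  rbind(df, data.frame(x = 1L, w = "q", stringsAsFactors = FALSE)),
  error = function(e) "error"
)
print(mismatch)
```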
 
I() makes data.frame() treat a list (or matrix, etc.) as a single column instead of splitting it into several columns.
> df1 = data.frame("x" = 1:3,
+                  "y" = list(4:6,7:9,10:12))
> df1
  x y.4.6 y.7.9 y.10.12
1 1     4     7      10
2 2     5     8      11
3 3     6     9      12
 
> df1 = data.frame("x" = 1:3,
+                  "y" = I(list(4:6,7:9,10:12)))
> df1
  x          y
1 1    4, 5, 6
2 2    7, 8, 9
3 3 10, 11, 12
 

Reposted from: https://www.cnblogs.com/JMJM-Li/p/8482029.html
