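Getting Started with R in 3 Days

Preface: this post summarizes R's basic data types and syntax. As the title says, getting started with R in 3 days is no pipe dream; all you need is some background in Python and Java.
 
# The reference material was in English, so these notes are in English too. Sorry if that makes them harder to read!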
 
 
Data Type:
  • scalars: one value
  • vectors: a set of scalars arranged in a one-dimensional array; all data values share the same mode
    • c() // enter all values at once
    • scan() // enter values one by one; stop by leaving an entry blank
    • Vectors of the same length can be added together, element-wise
    • Warning! When a short vector is added to a longer one, the short vector is recycled: its first element is reused after its last, then its second, and so on, until the two lengths match
    • combine two vectors: c(v1, v2)
    • vector of identical entries: rep(entry, times)
rep(3,7)
[1] 3 3 3 3 3 3 3
  • factors: a special type of character vector; 
    • create a char vector first then convert into factor
settings <- c("high", "medium", "low")
settings <- factor(settings)
# Levels: high low medium (the distinct values of the categorical variable)
  • matrices: collections of data values in two dimensions;
    • declare: matrix() takes a data vector plus specification parameters: the number of rows and cols
    • e.g. matrix(c(2,3,4,5), nrow = 2, ncol = 2), or a matrix of all 1s: matrix(1, nrow = 2, ncol = 3)
    • if the data vector is too short to fill the matrix, its values are recycled from the beginning
  • arrays: like a matrix, but with more than two dimensions;
  • lists: can contain data of different modes and can hold other data objects;
    • list()
    • e.g. list(5, 6, "seven", T)
    • List values are indexed with [[ ]]
  • data frames: like a spreadsheet;
    • each col is a vector; within each col, all data elements must be the same mode;
    • different cols can have different modes;
    • all vectors must be the same length
    • To create a data frame object, first create the vectors that make it up, e.g. v1, v2, v3 (equal length)
    • data.frame(col_name = v1, ...)
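A minimal sketch of the data objects listed above. All values and variable names are made up for illustration; they are not from the original notes.

```r
# Vectors: element-wise arithmetic and recycling of the shorter vector
v1 <- c(1, 2, 3, 4)
v2 <- c(10, 20)
v_sum <- v1 + v2                  # v2 is reused as 10, 20, 10, 20

# Factor: build a character vector first, then convert it
settings <- factor(c("high", "medium", "low"))
setting_levels <- levels(settings)   # levels are sorted alphabetically

# Matrix: a short data vector is recycled until every cell is filled
m <- matrix(c(1, 2), nrow = 2, ncol = 3)

# List: elements of different modes, indexed with [[ ]]
lst <- list(5, 6, "seven", TRUE)
third <- lst[[3]]

# Data frame: equal-length vectors, each column keeps its own mode
df <- data.frame(id = c(1, 2, 3),
                 ok = c(TRUE, FALSE, TRUE))
```

Printing v_sum or m in the console shows the recycling pattern directly.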
Properties of a data object: type, value & data mode, variable name
Data Modes:
  • character: single or double quotation marks;
  • numeric
  • logical: TRUE/FALSE (T/F for short)
R-Syntax:
ctrl + shift + c: comment or uncomment the selected lines
headers: 
  • # level 1, ## level 2, etc.
  • a comment line followed by at least 4 #, = or - characters becomes a section header
 
Command | Syntax | Notes
--- | --- | ---
Run script in console | ctrl + return |
Print numbers | a:b | either increasing or decreasing
Print numbers (using a function) | seq(10) | 1 to 10
 | seq(30, 0, -3) | decreasing, with a step
Assignment | a <- 2 | a gets 2
Multiple assignment | a <- b <- 2 |
Assign multiple values | x <- c(1,2,3,4) | c = combine/concatenate
Auto print | (x <- 3) | no need for an extra line with x
log | log(4) | base e
 | log10(100) | base 10
Numeric | | default type is double (double precision)
Character | "c", "a long char" | R has no separate string type
Logical | TRUE / FALSE | T / F can also be used in assignments
Vector | c(2,4,5,6), c("s","d","e"), c(T, F, T) | single dimension
Check vector | is.vector(var_name) | even a single value is a vector
Matrix | matrix(c(2,3,4,5,6,7), nrow = 2) | rows are 2 4 6 and 3 5 7; by default the values fill column by column
Matrix, byrow = T | matrix(c(2,3,4,5,6,7), nrow = 2, byrow = T) | rows are 2 3 4 and 5 6 7
Array | array(c(1:24), c(4, 3, 2)) | arguments: the data, then the dimensions (rows, cols, tables)
Data frame | cbind() | columns are named after the variables; the result is a matrix, so doubles and even T/F are coerced to character
 | as.data.frame(cbind()) | gives a data frame, but the columns are still character because cbind() already coerced them; data.frame(v1, v2, ...) keeps the different data types
List | list() | can hold mixed data types; printing shows the index of each element
Check type | typeof(var_name), mode(var_name) | mode() gives a similar answer, but reports "numeric" for both integer and double
Coerce (force a type change) | s <- c(1, "b", T) | all become character
 | o <- as.integer(5) | typeof(o) --> integer
 | s1 <- as.numeric(c("1", "2")) | typeof(s1) --> double
Coerce matrix to data frame | as.data.frame(matrix(1:9, nrow = 3)) |
Clear environment | rm(list = ls()) |
Clear packages | p_unload(all) | from the pacman package
 | detach("package:datasets", unload = T) | for base packages
Clear plots | dev.off() | only if there is a plot
Clear console | cat("\014") | same as ctrl + L
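A short sketch of the type-checking, coercion, and matrix entries from the command table, using only base R. The variable names are illustrative.

```r
# Mixing modes in c() coerces everything to character
mixed <- c(1, "b", TRUE)

# Explicit coercion with the as.* functions
o  <- as.integer(5)
s1 <- as.numeric(c("1", "2"))

# Sequences: increasing by default, or decreasing with a negative step
countdown <- seq(30, 0, by = -3)

# Matrices fill column by column unless byrow = TRUE
m       <- matrix(c(2, 3, 4, 5, 6, 7), nrow = 2)
m_byrow <- matrix(c(2, 3, 4, 5, 6, 7), nrow = 2, byrow = TRUE)

# Coerce a matrix to a data frame; columns get default names
df <- as.data.frame(matrix(1:9, nrow = 3))
```

Calling typeof() on mixed, o, and s1 shows the resulting types; mode() reports "numeric" for both o and s1.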
Load packages:
pacman::p_load(pacman, xx, xx, xx)
 
Pipes (%>%): break complex nested syntax into a sequence of steps that is easier to follow
 

 

 
Import data from a CSV or Excel file: see Importing Files below.
Graph:
barplot() 
barplot(x, col = "color_name")
  • color name (e.g. "red3", "grey7")
  • index number into colors()
barplot(x, col = colors()[507])
  • RGB, on a 0-1 or a 0-255 scale
  • hex code
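The four ways of specifying a color can be sketched as follows, assuming the base function barplot() is the one meant. The data vector x is made up.

```r
x <- c(5, 9, 3)   # illustrative data

# rgb() turns channel values into a hex code
red_hex  <- rgb(1, 0, 0)                          # channels on a 0-1 scale
blue_hex <- rgb(0, 0, 255, maxColorValue = 255)   # channels on a 0-255 scale

barplot(x, col = "red3")           # by color name
barplot(x, col = colors()[507])    # by index into colors()
barplot(x, col = red_hex)          # by RGB value
barplot(x, col = "#0000FF")        # by hex code
```

Running colors() on its own lists every color name that R recognizes.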
 

Working with Vectors:
  1. Address specific element: variable_name[index]
  2. Select a range: variable_name[start:end], inclusive on both ends
  3. Overwrite: e.g. z[3] <- 7
  4. Sort data from small to large: sort()
  5. Order of elements: order() returns, for each position in the sorted vector, the index that element had in the original vector
z <- c(2,3,6,7,3,4)
order(z)
[1] 1 2 5 6 3 4 # sort(z) --> 2, 3, 3, 4, 6, 7
# order() generates the indices that would sort the vector
 
  6. Extract subsets of data from vectors: 
  • directly identify specific elements and assign them to a new variable
z3 <- z[c(2, 3)] # elements with index 2 and 3
  • create a logical criterion to select certain elements
z3 <- z[z>100]
  7. Get length of vector: length()
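The vector operations above, run end to end on the same z as in the notes. The extra variable names are illustrative.

```r
z <- c(2, 3, 6, 7, 3, 4)

second   <- z[2]          # a single element
middle   <- z[2:4]        # a range, inclusive on both ends
sorted_z <- sort(z)       # ascending copy; z itself is unchanged
idx      <- order(z)      # indices that would sort z

# Indexing z by order(z) reproduces sort(z)
same <- identical(z[idx], sorted_z)

z3 <- z[c(2, 3)]          # elements with index 2 and 3
n  <- length(z)
```

Because order() returns indices rather than values, it can also be used to reorder a second vector in step with z.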
 

Working with Data Frames:
  1. Address a specific col/vector: data_frame_name$vector_name
  2. Address a specific element of a data frame: data_frame_name$vector_name[index]
           e.g. xy$x[2]
  3. Add a col or row to a data frame:
    • add row: rbind(), e.g. df <- rbind(xy, w) // works for matrices as well
    • add col: cbind(), e.g. xyz <- cbind(xy, z)
  4. Checking and Changing Types:
    • "is.what" functions check
    • Check data object: is.vector(), is.data.frame()
    • Check data mode: is.character(), is.numeric()
    • "as.what" functions convert; assign the new data object to a variable
    • Change data object: e.g. y <- as.matrix(x)
    • Change data mode: e.g. numeric to character, z <- as.character(x)
    • !!! Illogical conversions, such as non-numeric characters to numeric, produce NA values.
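A small sketch of these data frame operations. The data frame xy and the values in w and z are invented for the example.

```r
xy <- data.frame(x = c(1, 2, 3), y = c(10, 20, 30))

second_x <- xy$x[2]               # one element of one column

z   <- c("a", "b", "c")
xyz <- cbind(xy, z)               # add a column

w   <- data.frame(x = 4, y = 40)
xy2 <- rbind(xy, w)               # add a row with matching column names

# Checking and changing types
is_df   <- is.data.frame(xy)
as_char <- as.character(xy$x)

# A conversion that makes no sense yields NA (R also emits a warning)
bad <- suppressWarnings(as.numeric(c("1", "two")))
```

suppressWarnings() is used here only to keep the output quiet; without it R prints a coercion warning.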
               
 

Missing Data
  • Missing values are represented by NA
  • Computations performed on NA carry NA through to the result, e.g. NA * 2 is NA
  • Check whether a value is NA: is.na()
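For illustration, here is how NA propagates through arithmetic and how is.na() finds it. The na.rm argument is an addition not mentioned in the notes; it is a standard argument of sum().

```r
v <- c(4, NA, 6)

doubled    <- v * 2            # the NA carries through to the result
is_missing <- is.na(v)         # logical vector marking the NA positions

total_all   <- sum(v)                 # NA, because one input is NA
total_known <- sum(v, na.rm = TRUE)   # drop NA values before summing
```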
 

Listing and Deleting objects in Memory:
 
  • List current objects in current workspace memory
    • ls()
    • objects()
  • Remove specific object
    • rm(object1, object2,...)
 
Data Edit:
  • data.entry() ---> pops up a spreadsheet-like table
 
Save Work:
  • save to file: saves everything (commands, output, etc.)
  • savehistory: saves the commands that were entered
    • format: *.Rhistory file
    • savehistory(file = "fileName.Rhistory")
    • the saved history includes the "savehistory()" command itself
  • save image: saves objects only
    • save.image() ---> creates a .RData file 
    • save.image(filename) ---> creates a file with the given name
    • the result is an R workspace file
    • Load a previously saved workspace: load("path_to_file")
 

Importing Files:
 
Read functions
  • read.table();
    • reads a flat data file in ASCII text format.
    • Arguments:
      • file name, header = T,
      • fileEncoding = "xxx" (optional)
      • row.names = xx (optional) or col.names = "xxx"
    • returns a data frame object
    • separator:
      • sep = " " for a space-delimited file, sep = "\t" for a tab-delimited file
  • read.csv()
    • reads comma-delimited spreadsheet data
  • scan()
    • used with a file name as its argument
    • can import files of different types
  • read.xlsx()
    • requires library(xlsx)
    • arguments: file_name, sheetName = "xx", header = T
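A self-contained sketch of read.csv() and read.table(). It first writes two tiny temporary files so there is something to read; the file contents and variable names are invented for the example.

```r
# Create a small comma-delimited file to read back
csv_file <- tempfile(fileext = ".csv")
writeLines(c("name,score", "ann,90", "bob,85"), csv_file)
scores_csv <- read.csv(csv_file)          # header = TRUE is the default here

# Same data, tab-delimited, read with read.table()
tab_file <- tempfile(fileext = ".txt")
writeLines(c("name\tscore", "ann\t90", "bob\t85"), tab_file)
scores_tab <- read.table(tab_file, header = TRUE, sep = "\t")
```

Both calls return a data frame, so scores_csv$score and scores_tab$score can be used like any other column.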
 

Get Help:
  • help(xx)
  • ?xxx
  • apropos(): for when you don't know the exact name of what you're looking for
 

R - Programming:
Syntax
  • Semicolons separate statements: x <- 5; y <- 7
  • Comment: use #
  • Case sensitive
Arithmetic Operators:
  • + - * / ^
Logical & Relational Operators:
  • used for flow control
    • order (sequence)
    • selection
    • repetition
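The notes do not list the operators themselves, so here is a brief sketch of R's standard relational and logical operators. The values of x and y are arbitrary.

```r
x <- 5
y <- 7

# Relational operators return TRUE or FALSE
is_less   <- x < y
is_equal  <- x == y
not_equal <- x != y

# Logical operators combine conditions: & (and), | (or), ! (not)
both    <- (x < y) & (y > 10)
either  <- (x < y) | (y > 10)
negated <- !(x < y)

# Applied to a vector, a comparison works element-wise
flags <- c(1, 5, 9) > 4
```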
 
Conditional Operators:
  • if (condition is true)
                  then do this
  • can be written on one line 
if (x<=y) z <- x+y
if (q<t) {w <- q+t} else w <- q-t # {} is not mandatory but good to have! Don't wrap the assignment in (): that would print w immediately.
  • {} curly brackets are frequently used to group sections of code into a block; 
  • they also let the code continue on the next line
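A multi-line version of the same if/else pattern, with illustrative values. Note that at the top level of a script, else has to sit on the same line as the closing brace of the if block.

```r
x <- 3
y <- 8

if (x <= y) {
  z <- x + y
} else {
  z <- x - y
}

# if/else is an expression, so its value can be assigned directly
label <- if (z > 10) "big" else "small"
```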
        
 
Looping:
  • While
  • For
  • For loops can be nested.
  • when filling a matrix inside a loop, the data mode does not need to be declared
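A minimal sketch of the loop forms mentioned above: a for loop, a while loop, and a nested for loop that fills a matrix. The names and values are made up.

```r
# for: iterate over a sequence
total <- 0
for (i in 1:5) {
  total <- total + i
}

# while: repeat until the condition becomes FALSE
n <- 1
while (n < 100) {
  n <- n * 2
}

# Nested for loops filling a matrix cell by cell
m <- matrix(0, nrow = 2, ncol = 3)
for (r in 1:2) {
  for (k in 1:3) {
    m[r, k] <- r * k
  }
}
```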

Subsetting with Logical Operators:
  • use the outcome of a logical vector statement to subset a vector
  • Only elements where the outcome is TRUE are selected
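As a sketch, a compound condition produces a logical vector, and that vector picks out the matching elements. The thresholds are arbitrary.

```r
z <- c(2, 3, 6, 7, 3, 4)

keep     <- z > 2 & z < 7    # one TRUE/FALSE per element
selected <- z[keep]          # only the elements where keep is TRUE
where    <- which(keep)      # positions of the TRUE values
```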
 

Functions:
 
ceiling(5.4) --> 6
 
Writing Functions:
  • Address a col from a data.frame: data_frame_name$col_name
# e.g.
add_two_num <- function(num1, num2) {
    # the value of the last expression is what the function returns
    num1 + num2
}
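One way the function and the $ column syntax might be combined. The data frame scores and its columns are hypothetical.

```r
add_two_num <- function(num1, num2) {
  num1 + num2   # the last expression is the return value
}

# Called with scalars
five <- add_two_num(2, 3)

# Called with two columns of a (hypothetical) data frame; works element-wise
scores <- data.frame(math = c(90, 80), art = c(70, 60))
scores$total <- add_two_num(scores$math, scores$art)
```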

Data Summary Functions in R:
  • summary() function: 
    • presents descriptive statistics: min, 1st quartile, median, mean, 3rd quartile, max
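For example, on a small made-up numeric vector:

```r
x <- c(1, 2, 3, 4, 100)   # illustrative data with one large value

s <- summary(x)           # named values: Min., 1st Qu., Median, Mean, 3rd Qu., Max.

med <- s[["Median"]]
avg <- s[["Mean"]]
```

Comparing med and avg shows how a single large value pulls the mean away from the median.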
 
Graphics:
  1. High level plotting functions
    • plot()
      • with a single argument, like plot(x)
      • the values go on the y-axis; the indices of the values go on the x-axis
    • xlim = range(a:b) (x-axis limits from a to b)
    • main = "name" (main title)
  2. Low-level plotting functions
    • Add additional information to an existing plot (such as lines)
    • title(main = "xxx") OR title("xxxx"), similar to the arguments of high-level plotting functions
    • text(x, y, label = "xxxx")
    • lines(x, y)
    • !! when working with multiple plots, low-level plotting functions apply to the most recently added plot
  3. Graphical parameter functions
    1. control the graphics window
    2. fine-tune the appearance of graphics with color, text and fonts
    3. par():
      1. split the graphics screen to display more than one plot on the graphics device at one time
      2. use the mfrow or mfcol parameters of the par function
      3. mfrow: draw plots in row order (row 1, col 1; row 1, col 2), horizontal display
      4. mfcol: draw plots in col order (row 1, col 1; row 2, col 1), vertical display
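The three kinds of graphics functions can be combined as in this sketch. The data and titles are invented, and saving and restoring the old par() settings is a common habit rather than something the notes require.

```r
x <- c(3, 8, 5, 12, 9)
y <- c(1, 4, 2, 6, 5)

# Graphical parameter: 1 row, 2 columns of plots, filled in row order
old_par <- par(mfrow = c(1, 2))

# High-level: a single argument plots values against their indices
plot(x, main = "Index plot")

# High-level with x and y, then low-level additions to that same plot
plot(x, y, xlim = c(0, 15), main = "Scatter")
lines(x, y)
text(8, 5, labels = "a note")

# Restore the previous layout
par(old_par)
```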
 
  • Histogram
    • hist()
    • argument las = 1: rotates the y-axis labels 90 degrees clockwise so they read horizontally
    • las can take the values 0, 1, 2, 3
      • 0: labels parallel to the axis (default)
      • 1: always horizontal
      • 2: always perpendicular to the axis
      • 3: always vertical
    • breaks = c(num1, num2, ...)
    • breaks = seq(begin_num, end_num, step)
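A short histogram sketch with explicit breaks and horizontal axis labels. The data vector and the title are made up.

```r
values <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 9)   # illustrative data

# Bins are (0,2], (2,4], (4,6], (6,8], (8,10]; the lowest value is included
h <- hist(values,
          breaks = seq(0, 10, by = 2),
          las = 1,                       # horizontal axis labels
          main = "Histogram of values")

bin_counts <- h$counts                   # number of observations per bin
```

hist() returns its computed bins invisibly, which is why h$counts is available after plotting.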
                     
 
 
 
 
 