1. Good coding practice (formatting and commenting code)
a. Basic conventions for writing R code:
Topic | Convention | Good example | Bad example |
---|---|---|---|
Names | (1) Names should be meaningful and concise. (2) File names should always end with .R | format-data.R ; plot-data.R | script01.r ; data.r |
Object names | (1) Use underscores (_) instead of spaces. (2) Use lowercase for variable and function names. (3) Avoid the names of existing R functions. | data_raw ; data_summary | DataRaw ; Data2 |
Line length | (1) Try to limit each line of code to 80 characters. (2) This facilitates reading and printing. | | |
Indentation | (1) Always indent code for readability. (2) Align function arguments with spaces. (3) RStudio can indent automatically: press Ctrl + I on the selected lines (Mac: Cmd + I). | | |
Assignment | (1) Use <- to create a new object (not = or -> ). (2) Use = for function arguments. | x <- 2 ; sqrt(x = 2) | 2 -> x ; sqrt(x <- 2) |
Spacing | (1) Always use a space after a comma (just like in English). (2) Use spaces around all infix operators (= , + , - , <- , & , etc.). (3) The exceptions are : , :: and ::: , which take no surrounding spaces. | x <- mean(x = c(1, 2, 3)) ; graphics::plot(1:10) | x<-mean(x=c(1,2,3)) ; graphics :: plot(1 : 10) |
Curly braces {} | (1) The opening curly brace { never goes on its own line. (2) The closing curly brace } always goes on its own line, except when followed by else . | | |
Quotation marks | (1) Strings can be created with either " " or ' ' . (2) The two can be combined: 'He said "Hello"' . | | |
Inline comments # | (1) Use them to clarify WHY the code was written this way. (2) Place them after a short line of code or before a long one. | A lot of code with no comments # | |
Sections | (1) Used to organize the code. (2) Can be added in RStudio from the menu Code > Insert Section. (3) The small arrow at the left edge folds/unfolds the section. | | |
Header | (1) File names should be concise. (2) Any additional information can go in a header at the top of the file. | | |
set.seed | Always set a seed (set.seed) before sampling or running simulations. | | |
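
Several of the conventions in the table can be seen together in one short script. This is only a sketch: the object names, the function summarise_values and the seed value are made up for illustration.

```r
# Simulate raw data; setting the seed makes the random draws reproducible.
set.seed(2024)
data_raw <- rnorm(n = 20, mean = 10, sd = 2)

# Lowercase, underscore-separated name that does not mask an existing R function.
summarise_values <- function(values, digits = 2) {
  if (length(values) == 0) {
    stop("values must not be empty")
  } else {
    # Spaces after commas and around infix operators; = only for arguments.
    round(c(mean = mean(values), sd = sd(values)), digits = digits)
  }
}

data_summary <- summarise_values(values = data_raw)
data_summary
```

The opening brace stays on the line of the statement it belongs to, and the closing brace sits on its own line except before else.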
NB:
(1) By convention, use <- to assign values to variables and = to pass arguments to functions.
(2) Always use relative file paths. Using relative paths to import/export data and plots helps portability: other users only need to change the working directory.
(3) Version control
– Lets you go back to an old version and see the history of changes
– Dropbox: easy, but has privacy issues and limited storage
– Git: steep learning curve, but widely used in computer science and integrated in RStudio
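
Point (2) can be sketched as follows. The folder names data and figures and the file names are hypothetical; the idea is only that every path is written relative to the project's working directory.

```r
# Paths are built relative to the working directory (the project root).
# "data" and "figures" are assumed folder names, not part of the original notes.
data_path <- file.path("data", "data_raw.csv")
plot_path <- file.path("figures", "plot_data.png")

# Read the file only if it exists, so this sketch runs in any directory.
if (file.exists(data_path)) {
  data_raw <- read.csv(data_path)
}

# Avoid absolute paths such as "C:/Users/me/project/data/data_raw.csv":
# they break as soon as the project is moved to another computer.
data_path
plot_path
```

Someone else who receives the project folder only has to set the working directory to that folder (for example with setwd(), or by opening the RStudio project) and the same code runs unchanged.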
b. Getting help in R
First, enable RStudio's helper features as shown in the figure below (the R and RStudio versions shown here are fairly old):
You can then access RStudio's help:
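
Apart from the RStudio interface, help can also be requested from the console. The following is a small sketch using base R functions only; mean is just an example topic.

```r
# Help pages open in a viewer or pager, so only request them interactively.
if (interactive()) {
  help("mean")                       # full documentation for mean()
  ?mean                              # shorthand for the line above
  help.search("standard deviation")  # search all installed help pages
  example("mean")                    # run the examples from the help page
}

# apropos() returns matching object names and also works inside scripts.
mean_matches <- apropos("^mean")  # objects whose names start with "mean"
mean_matches
```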
2. Rmarkdown
a. Create & export a report
Create
Export a report
b. Structure of Rmarkdown
(1) overall structure
(2) YAML header (helps the document building process)
Example:
Notes for YAML header:
- Uses the format key: value
- Determines
– What type of document to create: output: html_document
– How to create it:
Add a table of contents: toc: true
Use a theme: theme: cerulean
- Adds other information to the file: title, subtitle, date, and author
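
Putting the notes above together, a YAML header could look like the sketch below. The title, subtitle, author and date are placeholder values; only output, toc and theme come from the notes.

```yaml
---
title: "My report"              # placeholder
subtitle: "A short subtitle"    # placeholder
author: "Author name"           # placeholder
date: "2024-01-01"              # placeholder
output:
  html_document:
    toc: true                   # add a table of contents
    theme: cerulean             # use the cerulean theme
---
```

The header sits at the very top of the .Rmd file, between two lines of three dashes. Options that describe how to build the document (toc, theme) are nested under the output format they apply to.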
(3) Code chunk (e.g. R, bash, python)
Example:
Notes:
- Run when the report is being created unless told otherwise (eval=FALSE)
– Can be run independently of the document
– Code can be anything (data management, plot, table, etc.)
- To insert chunks
– Ctrl + Alt + I (OS X: Cmd + Alt + I)
– Or click the insert-chunk button in the RStudio toolbar
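
As a minimal sketch of what chunks look like in the .Rmd source (the chunk contents are illustrative and use the built-in cars data set), the first chunk below runs when the report is built, while the second is displayed but not run because of eval=FALSE:

````markdown
```{r}
# Runs when the report is knitted; the output appears below the code
summary(cars)
```

```{r, eval=FALSE}
# Shown in the report but never executed
install.packages("knitr")
```
````

The language name inside the braces tells knitr which engine to use, so {bash} or {python} chunks follow the same pattern.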
(4) Code chunk options
[1] Setting:
- Control the behavior of the chunk and its output
- Many chunk options available
- Basic options can also be set by clicking on the “gear” button in the chunk
Example:
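A sketch of a chunk header with options set by hand rather than through the gear button. The label speed-plot and the option values are only illustrative.

````markdown
```{r speed-plot, echo=FALSE, fig.width=6, fig.height=4}
# "speed-plot" is a hypothetical chunk label; options follow it, separated by commas
plot(cars)
```
````

The optional label comes first after the engine name, followed by option = value pairs.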
[2] eval, echo & include
The examples shown in the table in this part ([2]) are based on the same Rmd file:
Comparison of eval, echo & include
Option | Function | Example |
---|---|---|
eval = TRUE/FALSE | Should the code in the chunk be run? | Important when giving an example of code that would return an error if run |
echo = TRUE/FALSE | Should the code appear in the report? | Can be used to make the report look nicer, at the cost of reproducibility (code hidden from the reader) |
include = TRUE/FALSE | Should the code AND any output be included in the report? | Usually used for setting global options |
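
The three options can be compared side by side in a sketch like the one below. The chunk contents are illustrative; knitr::opts_chunk$set() is the knitr function for setting global chunk options.

````markdown
```{r setup, include=FALSE}
# Runs, but neither the code nor its output appears in the report;
# a typical place to set global chunk options
knitr::opts_chunk$set(echo = TRUE)
```

```{r, echo=FALSE}
# Code hidden, output shown
mean(cars$speed)
```

```{r, eval=FALSE}
# Code shown but not run: this call would stop with an error if evaluated
stop("example of an error")
```
````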
[3] results = 'markup'/'asis'/'hide'/'hold': controls how the output of the code is displayed
Example:
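A sketch of two of the four values, with illustrative chunk contents:

````markdown
```{r, results='hide'}
# Code is run and shown, but its printed output is dropped
summary(cars)
```

```{r, results='asis'}
# Output is written into the document as raw markdown instead of a code block
cat("**Number of rows:**", nrow(cars), "\n")
```
````

Of the remaining values, 'markup' is the default (output is shown in an output block after the code that produced it) and 'hold' collects all output and shows it at the end of the chunk.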
[4] message = TRUE/FALSE & warning = TRUE/FALSE
Option | Function | Example |
---|---|---|
message = TRUE/FALSE | Should output messages be included in the report? | |
warning = TRUE/FALSE | Should output warnings be included in the report? | |
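
A sketch of a chunk that silences both kinds of output. The message text is made up for illustration.

````markdown
```{r, message=FALSE, warning=FALSE}
# The calls still run, but their messages and warnings are left out of the report
message("Loading data ...")
as.numeric("not a number")  # would normally emit a coercion warning
```
````

This is commonly applied to chunks that load packages, since library() calls often print start-up messages.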
[5] Figure
- fig.height and fig.width control the plot size in inches
- fig.cap adds a caption to a figure
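A sketch of a figure chunk using these options, with the built-in cars data set. The size and caption text are arbitrary choices.

````markdown
```{r, fig.width=6, fig.height=4, fig.cap="Stopping distance against speed (cars data set)"}
plot(cars$speed, cars$dist, xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
```
````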
[6] Table
- R data can be output as formatted tables
- Multiple packages can be used
– e.g. knitr, xtable or stargazer
– knitr::kable() is recommended
Example of knitr::kable():
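A minimal sketch, assuming the knitr package is installed. The built-in mtcars data set and the caption are chosen only for illustration.

```r
# Assumes the knitr package is installed.
library(knitr)

# Keep a few rows and columns so the table stays small.
cars_head <- head(mtcars[, c("mpg", "cyl", "hp")], n = 5)

# kable() turns the data frame into a formatted table; inside an Rmd chunk
# the result is rendered as a proper table in the report.
cars_table <- kable(cars_head, digits = 1, caption = "First five rows of mtcars")
cars_table
```

Run at the console, this prints the table as plain markdown text; in a knitted report it appears as a styled table.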
(5) Text (e.g. Markdown, HTML, LaTeX)
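a. Fonts
b. Headings
c. Lists and tables
The version used in the original course may be a bit old; in any case, I never got this table to work.
d. Formulas and others
e. Code > inline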
- R code can also be inserted within the text
- Format:
– code: code not evaluated + monospaced font
– r code: code evaluated + regular font
– r code: code evaluated + monospaced font
Example:
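A sketch of the first two forms in the .Rmd source, using the built-in mtcars data set. Inline R code is written as a backtick, the letter r, a space, the expression, and a closing backtick.

````markdown
The data set has `r nrow(mtcars)` rows, and the mean of `mpg` is
`r round(mean(mtcars$mpg), 1)`.
````

When knitted, the two expressions are replaced by their values in regular font (32 and 20.1), while `mpg` is not evaluated and is shown in monospaced font.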