Basic R

System Settings

Show system error messages in English:

Sys.setenv(LANGUAGE = "en")

Prevent strings from being converted to factors (this has been the default since R 4.0.0):

options(stringsAsFactors = FALSE)

windowsFonts() lists the font mappings available to graphics devices on Windows.

Data Wrangling

Counting

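table() is roughly equivalent to dplyr::count(), but the return types differ. The former returns an object of class table, which behaves like a vector with a names attribute; the latter returns a data frame.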

base::table(iris$Species)


    setosa versicolor  virginica 
        50         50         50

iris |> dplyr::count(Species)

     Species  n
1     setosa 50
2 versicolor 50
3  virginica 50
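To move between the two return types, here is a minimal base R sketch. The names counts and counts_df are illustrative only.

```r
# table() returns an object of class "table" whose names are the categories.
counts <- table(iris$Species)
class(counts)
counts["setosa"]   # index by name, like a named vector

# Convert to a data frame shaped like the dplyr::count() result.
counts_df <- as.data.frame(counts, stringsAsFactors = FALSE)
names(counts_df) <- c("Species", "n")   # the default names are Var1 and Freq
counts_df
```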

Convert row names to a column

df |> tibble::rownames_to_column(var = 'xxx')

Given a data.frame, this returns a data.frame; the result is not automatically converted to a tibble.

Convert a column to row names

tbl |> tibble::column_to_rownames(var = 'xxx')

The input can be either a data.frame or a tibble; both return a data.frame, because a tibble cannot have row names.
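For comparison, the same round trip can be sketched in base R without the tibble helpers. This is an illustrative alternative, not what the tibble functions do internally; the data and the column name "name" are made up.

```r
# A small data.frame with meaningful row names (hypothetical data).
df <- data.frame(x = 1:3, row.names = c("a", "b", "c"))

# Row names -> column: prepend them, then reset the row names.
with_col <- cbind(name = rownames(df), df)
rownames(with_col) <- NULL

# Column -> row names: assign the column as row names, then drop it.
back <- with_col
rownames(back) <- back$name
back$name <- NULL
back
```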

Remove Zeros / NAs

Remove all rows containing zeros or NAs.

data(mtcars)
head(mtcars)

                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

mtcars[mtcars == 0] <- NA
head(mtcars)

                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46 NA  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02 NA  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1 NA    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02 NA NA    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1 NA    3    1

mtcars_clean <- tidyr::drop_na(mtcars)
mtcars_clean

                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Datsun 710     22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Fiat 128       32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
Honda Civic    30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Toyota Corolla 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
Fiat X1-9      27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
Lotus Europa   30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
Volvo 142E     21.4   4 121.0 109 4.11 2.780 18.60  1  1    4    2
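If overwriting zeros with NA is undesirable, a base R sketch can filter the rows directly. The names cars_df, bad_cell, and cars_clean are illustrative; with the unmodified mtcars data this should keep the same seven rows shown above.

```r
# Work on a fresh copy so the built-in dataset is left untouched.
cars_df <- datasets::mtcars

# TRUE for every cell that is NA or exactly zero.
bad_cell <- is.na(cars_df) | cars_df == 0

# Keep only the rows that contain no such cell.
cars_clean <- cars_df[rowSums(bad_cell) == 0, ]
nrow(cars_clean)
```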

`na.omit()`

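na.omit() is equivalent to tidyr::drop_na(), but it stores an na.action attribute on the result, which can be inspected with attributes(df)$na.action. Running df <- data.frame(df) removes the attribute. Using na.omit() is not recommended.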
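A small sketch of the attribute in question, using made-up data. The complete.cases() subset is shown as one base R way to drop the same rows without attaching na.action.

```r
# Hypothetical data: rows 2 and 3 each contain an NA.
df <- data.frame(x = c(1, NA, 3), y = c("a", "b", NA))

# na.omit() drops the incomplete rows and records which ones it dropped.
clean <- na.omit(df)
attributes(clean)$na.action

# Subsetting with complete.cases() keeps the same rows without the attribute.
clean2 <- df[complete.cases(df), ]
attributes(clean2)$na.action
```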

Cumulative Calculation

Apply a binary function cumulatively to the elements of a vector or list, reducing them to a single result.

The following code takes the intersection of three vectors a, b, and c.

myList <- list(a = c(1:5),
               b = c(2:6),
               c = c(3:7))
myList

$a
[1] 1 2 3 4 5

$b
[1] 2 3 4 5 6

$c
[1] 3 4 5 6 7

purrr::reduce(myList, dplyr::intersect)

[1] 3 4 5
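Base R offers Reduce() for the same pattern. The sketch below repeats the intersection and also keeps the intermediate results; the names common and steps are illustrative.

```r
myList <- list(a = 1:5, b = 2:6, c = 3:7)

# Fold the list from left to right: intersect(intersect(a, b), c).
common <- Reduce(intersect, myList)
common

# accumulate = TRUE returns every intermediate result as a list.
steps <- Reduce(intersect, myList, accumulate = TRUE)
steps
```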

Floating-Point Arithmetic

Floating-point arithmetic can produce odd-looking tiny values such as 1.183081e-13; use round(digits = ) to keep only the decimal places you need.
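A classic illustration, assuming standard double-precision arithmetic. all.equal() is included as a tolerance-based alternative to rounding when the goal is comparison rather than display.

```r
x <- 0.1 + 0.2

x == 0.3                      # FALSE: the sum is not exactly 0.3
isTRUE(all.equal(x, 0.3))     # tolerance-based comparison
round(x - 0.3, digits = 10)   # rounding removes the tiny residue
```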

Split Strings

base::strsplit() is equivalent to stringr::str_split(); both return a list with one element per input string.

base::strsplit()[[1]] is equivalent to stringr::str_split_1(); both return a character vector.
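A short base R sketch of both forms, with made-up input strings. Note that strsplit() treats the separator as a regular expression by default, so fixed = TRUE is used when splitting on a literal dot.

```r
# One list element per input string.
parts_list <- strsplit(c("a,b", "c,d,e"), ",")
parts_list

# For a single string, [[1]] extracts the character vector.
parts <- strsplit("a,b,c", ",")[[1]]
parts

# "." is a regex metacharacter; fixed = TRUE matches it literally.
dotted <- strsplit("a.b.c", ".", fixed = TRUE)[[1]]
dotted
```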

`apply()` function family

do.call(function, list) calls function once, passing all elements of list as its arguments. For example, given a list of several data.frames, the following combines them into a single data.frame.

do.call("rbind", myList)
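A self-contained sketch of that use case, with two small made-up data frames (df_list and combined are illustrative names):

```r
# A list of data.frames with identical columns (hypothetical data).
df_list <- list(
  data.frame(id = 1:2, value = c("a", "b")),
  data.frame(id = 3:4, value = c("c", "d"))
)

# Equivalent to rbind(df_list[[1]], df_list[[2]]).
combined <- do.call(rbind, df_list)
combined
```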

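apply(x, margin, function): x is a matrix or array. Applies function to each row or each column of x; margin = 1 operates over rows, margin = 2 over columns.
lapply(list, function): applies function to each element of list, so function is called once per element. Returns a list.
sapply(list, function): like lapply, but simplifies the result to a vector, matrix, or array instead of a list when possible. Roughly equivalent to do.call(cbind, lapply(list, function)).
vapply(): like sapply, but the expected length and type of each return value must be declared up front; not commonly used.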
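The differences in return type are easiest to see side by side. A minimal sketch with made-up inputs (m and x are illustrative):

```r
m <- matrix(1:6, nrow = 2)   # rows: (1, 3, 5) and (2, 4, 6)

row_sums <- apply(m, 1, sum)   # one value per row
col_sums <- apply(m, 2, sum)   # one value per column

x <- list(a = 1:3, b = 4:6)

as_list   <- lapply(x, mean)              # list with elements a and b
as_vector <- sapply(x, mean)              # simplified to a named numeric vector
checked   <- vapply(x, mean, numeric(1))  # errors unless each result is one numeric
```

vapply() is the safer choice inside functions, because a result of unexpected shape raises an error instead of silently changing the return type.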

R Packages

ggpubr is not recommended: its significance-test results may be unreliable. For significance tests, ggsignif is recommended instead.

Read and Save Data

# read csv
readr::read_csv(file.path(dir, 'file_name.csv'), col_names = TRUE)
# read xlsx
readxl::read_xlsx(file.path(dir, 'file_name.xlsx'))
# read csv with data.table
data.table::fread(file.path(dir, 'file_name.csv'))
# write csv
readr::write_csv(myTibble, file.path(dir, 'file_name.csv'))

# read rds
readr::read_rds(file)
# write rds
readr::write_rds(x = mtcars, file)
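For a runnable round trip that needs no extra packages, the sketch below uses the base R counterparts (write.csv/read.csv and saveRDS/readRDS) rather than readr, writing to temporary files. The path variables are illustrative.

```r
# Temporary paths so nothing is written into the working directory.
csv_path <- tempfile(fileext = ".csv")
rds_path <- tempfile(fileext = ".rds")

# CSV round trip: row names are written as the first column and restored on read.
write.csv(datasets::mtcars, csv_path, row.names = TRUE)
from_csv <- read.csv(csv_path, row.names = 1)

# RDS round trip: the object is serialized with its types and attributes.
saveRDS(datasets::mtcars, rds_path)
from_rds <- readRDS(rds_path)
identical(from_rds, datasets::mtcars)

# Clean up the temporary files.
unlink(c(csv_path, rds_path))
```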