R
马志远的生信笔记
兰州大学博士后,李发弟教授团队
展开
-
非线性拟合之干物质降解率
评估饲料降解的公式是0rskov and McDonald (1979)提出来的,公式是dp= a + b (1- e ^(-kd*t))where dp,a, b,kd werethe ruminal disappearance at time t, therapidly soluble fraction, the potentiallydegradable fraction, the constant rate ofpotentially degradable fraction本人用原创 2021-12-22 16:27:57 · 357 阅读 · 0 评论 -
做一个类似热图的图
在论文中发现了这么一张图,我们准备用ggplot来做下这个图首先,数据准备好,做长型的用以下代码就可以出图了mydata<-read.table("clipboard",sep="\t",header=T)mydata$Module<-factor(mydata$Module,levels = c("M23","M22","M21","M20","M19","M18","M17","M16","M15","M14","M13","M12","M11","M10","M..原创 2021-12-21 21:41:36 · 594 阅读 · 0 评论 -
OTU表或微生物组成表数据归一化到100万
mydata<-read.table("clipboard",sep="\t",header=T,row.names = 1)trans<-t(t(mydata)/colSums(mydata))*1E6write.table(trans,file="out.xls",sep="\t") #导出数据行名会错一个原创 2021-08-18 14:47:55 · 739 阅读 · 0 评论 -
模型筛选代码
read.table(file="clipboard",sep="\t",header=T)->mydata#colnames(mydata)[c(2,4:14)]->v1 #sf#colnames(mydata)[3]->r1 #sfcolnames(mydata)[c(2,5:14)]->v1colnames(mydata)[4]->r1length(v1)->afml=paste(r1,"~",paste(v1,collapse="+"))fm.原创 2021-04-22 04:59:26 · 217 阅读 · 0 评论 -
获得module丰度信息
#获得微生物代谢相关module的丰度信息#DNA水平modulelibrary(data.table)library(dplyr)fread("/mnt/mzy/dairycow/72samples/geneset/annotation.emapper.annotations",header=F,sep="\t",fill=T)->alist=c("M00004", "M00006", "M00007", "M00008", "M00033", "M00165", "M00166", "M原创 2021-03-04 21:44:00 · 226 阅读 · 0 评论 -
根据samtool 统计的reads数计算tpm值
library(data.table)library(dplyr)#DNAtpm计算SMP<-c(123,98,126,4,128,32,134,8,130,29,131,42,119,19,139,24,125,37,144,14,117,5,142,13,127,22,129,48,138,47,140,20,132,18,149,73,120,105,148,44,122,30,124,26,143,34,133,12,145,21,152,10,118,102,147,80,121,1,原创 2021-03-02 10:31:26 · 687 阅读 · 1 评论 -
R之cast公式使用
cast用来替代透视表简直完美。我们有一个33列的数据表,数据保存在Dmean和Rmean中> colnames(tmp)[1] "ORF" "Dmean" "Rmean"[4] "seed_eggNOG_ortholog" "seed_ortholog_evalue" "seed_ortholog_score"[7] "best_tax_level" "Preferred_name" ...原创 2021-01-30 14:53:40 · 1764 阅读 · 0 评论 -
合并kraken结果
setwd("/mnt/mzy/dairycow/24samples/kraken")DNAsample<-c("127","134","133","143","140","122","125","117","149","132","130","142","145","152","118","147","121","151","135","136","150","146","141","137")system("awk -F '\t' '{print $1}' *DNA|sort|uniq >原创 2020-08-27 20:47:09 · 414 阅读 · 0 评论 -
beta多样性计算
file=c("salmonRNA.csv","salmonDNA.csv")for (j in file){ c<-read.csv(j,row.names=1,stringsAsFactors=FALSE) vegan::vegdist(t(c),method="bray") %>% ape::pcoa()->pcoa tmp<-if (is.null( pcoa$values$Rel_corr_eig)) pcoa$values$Relative_eig...原创 2020-08-26 19:53:25 · 4412 阅读 · 1 评论 -
alpha多样性计算
library(picante) alpha <- function(x, tree = NULL, base = exp(1)) { est <- estimateR(x) Richness <- est[1, ] Chao1 <- est[2, ] ACE <- est[4, ] Shannon <- diversity(x, index = 'shannon', base = base) Simpson <- diversity(...原创 2020-08-26 19:51:44 · 5812 阅读 · 0 评论 -
salmon合成丰度表
cd /mnt/mzy/dairycow/24samples/profilefor i in 127 134 133 143 140 122 125 117 149 132 130 142 145 152 118 147 121 151 135 136 150 146 141 137dosalmon quant -i /mnt/mzy/dairycow/profile/24index -p 20 --libType A -1 /mnt/mzy/dairycow/sub24/DNA/$i.1.fq -2原创 2020-08-26 19:47:38 · 353 阅读 · 0 评论 -
R操作合并数据等
library(data.table)setwd("/mnt/10t/mzy/48samples/16s/salmon/")fread("DNAabd138.csv")->DNAas.data.frame(DNA)->DNArow.names(DNA)<-DNA[,1]unique(DNA[,1])->namelistx<-c("")for (i in namelist){t(data.frame(colSums(DNA[which(DNA$Name==i),3原创 2020-06-09 20:03:25 · 205 阅读 · 0 评论 -
R查看列属性
很多时候一转,列成了文本,不能计算了,我们查看每列的类型,用mode或者class命令,千万不用用typeof,后者只会告诉你那是个double或者integer,这个命令会误导你。查看每一列的属性用sapply(x,mode)...原创 2020-06-09 17:26:07 · 2260 阅读 · 0 评论 -
RNA序列转DNA序列
silva数据库中用的是RNA序列,我们将其转换成DNA序列。因为大部分软件都支持正反链比对,我们没必要一定要负链序列,正链序列只要把U转换成T即可,利用R实现如下:read.table("SILVA_132_LSURef_tax_silva.fasta",header=F,sep="&")->rna #sep随便搞个文件里没有的符号,只要不让它变成2列就成for (i in 1:dim(rna)[1]){if (!(">" %in% rna[i,])){gsub("U","T"原创 2020-06-01 22:21:25 · 2841 阅读 · 0 评论 -
合并kraken的report文件,R
system("for i in *RNA*; do cat $i|awk '{print $1}'>>RNAnames; done")system("cat RNAnames|sort|uniq > uniq.names")RNAsample<-c("127","22","129","81","138","99","140","20","132","82","149","73","120","105","148","44","122","101","124","100",...原创 2020-06-01 18:02:14 · 536 阅读 · 0 评论 -
R循环中的赋值
先把主要函数放上,内容最后再补assign()get()原创 2020-05-24 00:30:57 · 863 阅读 · 0 评论 -
R和pandas实现透视表(pivot; cast/dcast/acast)和逆透视表(melt)过程
直接放代码R代码gc()library('magrittr')setwd("~/Documents/48sample/mag/")#合成丰度文件data.table::fread("DNA.geneabundance.txt(1)", sep="\t", header=T, stringsAsFactors=F) %>% data.frame()->profile...原创 2020-04-27 17:37:24 · 857 阅读 · 0 评论 -
bioconductor安装软件
install.package()经常有软件出现版本问题而不能安装,不如试下bioconductorsudo R #有时候会出现权限不够不能写入的问题,最后用sudosource("http://bioconductor.org/biocLite.R") biocLite("readr")...原创 2020-02-11 18:10:49 · 899 阅读 · 0 评论