Original Unsupervised learning notes
Unsupervised Learning
Dimensionality Reduction
PCA
Find the directions (principal components) that capture most of the variance in the data. (https://www.youtube.com/watch?v=HMOI_lkzW08)
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.d
2022-04-24 16:07:13 790
Original Evaluate ML Models Notes
Evaluate Models
Dummy Models
A dummy model assigns values to the target while ignoring every feature. There are several strategies: 1. Predict the most frequent class 2. Always predict a chosen (preferred) result 3. Stratified prediction
It serves as the baseline to be compared w
2022-04-21 13:30:33 1730
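To make the "most frequent" dummy strategy from the entry above concrete, here is a minimal standard-library sketch. The function names and the toy labels are assumptions for illustration, not taken from the post; scikit-learn's `DummyClassifier(strategy="most_frequent")` provides the same kind of baseline.

```python
from collections import Counter


def most_frequent_baseline(y_train):
    """Build a predictor that ignores features and always returns the majority label."""
    majority_label = Counter(y_train).most_common(1)[0][0]

    def predict(X):
        # One prediction per row of X; the feature values are never inspected.
        return [majority_label for _ in X]

    return predict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (hypothetical): class 0 is the majority class in the training labels.
y_train = [0, 0, 0, 1, 0, 1]
X_test = [[5.1], [4.8], [6.0], [5.5]]
y_test = [0, 1, 0, 0]

predict = most_frequent_baseline(y_train)
baseline_accuracy = accuracy(y_test, predict(X_test))
print(baseline_accuracy)
```

A real model is only useful if it scores clearly above this number on the same test set.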
Original Applied Machine Learning Notes
Applied Machine Learning
Introduction
Terms
A dataset intended for analysis by a machine learning method is expected to have features (X) and target values/labels (y).
Training and test sets are needed for a given dataset.
Model fitting will produce a ‘
2022-04-17 15:56:14 1143
Original Simple machine learning workflow
Taken from assignment 1 of Coursera's Applied Machine Learning: a simple exercise that uses KNN to classify cancer.
import numpy as np
import pandas as pd
from sklearn.datasets import load_breast_cancer
# Load cancer, a dict-like object
cancer = load_breast_cancer()
# Split into X (input) and y (output)
X = cancerdf[cancer['feature_names']]
2022-04-08 14:20:11 1298
Original A simple Matplotlib workflow
Typical Matplotlib workflow
import matplotlib.pyplot as plt
x = [1,2,3,4]
y = [5,6,7,8]
# Set the figure size
plt.figure(figsize=(4,3),facecolor=None)
# subplot
plt.subplot(111)
plt.plot(x,y,
2022-04-06 11:45:33 1765
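Original Plotting heatmaps and volcano plots of an expression matrix in R
The volcano plot only needs pval and log2FoldChange.
# Significant points are red (up) or blue (down) by the sign of the fold change; the rest are gray
v_circ$color <- ifelse(v_circ$pval < 0.05 & abs(v_circ$log2FoldChange) >= 1, ifelse(v_circ$log2FoldChange > 0, 'red', 'blue'), 'gray')
color <- c(red = "red", gray = "gray", blue = "blue")
circ_vo <- ggplot(v_circ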
2021-09-08 22:49:41 1279
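The colour rule for the volcano plot in the entry above can be sketched as a small function. This is a sketch under assumptions: the function name, the default cutoffs as parameters, and the sample rows are illustrative, and the direction is decided by the sign of the fold change once the magnitude cutoff has been passed.

```python
def volcano_color(pval, log2_fold_change, p_cutoff=0.05, lfc_cutoff=1.0):
    """Classify one point: red = significant up, blue = significant down, gray = not significant."""
    if pval < p_cutoff and abs(log2_fold_change) >= lfc_cutoff:
        return "red" if log2_fold_change > 0 else "blue"
    return "gray"


# Hypothetical (pval, log2FoldChange) pairs.
rows = [(0.001, 2.3), (0.01, -1.5), (0.20, 3.0), (0.03, 0.4)]
colors = [volcano_color(p, lfc) for p, lfc in rows]
print(colors)
```

Only points that pass both the p-value and the fold-change cutoff are coloured; a large fold change with a weak p-value stays gray.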
Original Octave basic operation
~= % not equal
&& % and
|| % or
a = 3; % semicolon suppresses output
b = 'hi'
disp(a) % display the value of a
disp(sprintf('2 decimals: %0.2f', a)) % display with 2 decimal places
A = [1 2; 3 4; 5 6]
v = 1:0.1:2
v = 1:6
ones(2,3)
C = 2*ones(2,3
2021-09-06 15:04:49 97
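Original Preparing the expression matrix required by the GSEA software
ex <- rownames_to_column(ex)
ex$DESCRIPTION <- 'na'
ex <- relocate(ex, DESCRIPTION, .after = rowname)
write.table(ex, 'ex.txt', sep = '\t', quote = FALSE, row.names = F)
The GSEA software requires the input to have a DESCRIPTION column. It can be filled with 'NA'.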
2021-08-05 12:15:44 419
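The table layout produced by the R snippet in the entry above (identifier column, then a constant DESCRIPTION column, then the samples, tab-separated and unquoted) can be sketched with Python's standard library. The function name, gene names, and values are assumptions for illustration; the `rowname` header simply mirrors what `rownames_to_column` produces.

```python
import csv
import io


def to_gsea_table(expression, sample_names, description="na"):
    """Render an expression dict as tab-separated text with a DESCRIPTION column after the IDs."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["rowname", "DESCRIPTION", *sample_names])
    for gene, values in expression.items():
        writer.writerow([gene, description, *values])
    return buffer.getvalue()


# Hypothetical expression values for two genes across two samples.
expression = {"geneA": [5.2, 6.1], "geneB": [3.3, 2.9]}
table = to_gsea_table(expression, ["sample1", "sample2"])
print(table)
```

Writing `table` to a `.txt` file gives the same shape as the `write.table` call with `sep = '\t'` and `quote = FALSE`.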
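Original How to compute the correlation coefficient of expression matrices
x_t <- t(x)
y_t <- t(y)
# cor() correlates columns, so pass the transposed matrices
cor_result <- as.data.frame(cor(x_t, y_t))
cor_result <- rownames_to_column(cor_result)
cor_result <- pivot_longer(cor_result, -rowname)
library(ggcorrplot)
This package can be used to visualize the resulting correlation matrix.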
2021-08-05 11:54:51 619
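For the correlation entry above, the following sketch shows what a pairwise Pearson correlation between the rows of two expression matrices looks like in long format, similar in spirit to `cor` followed by `pivot_longer`. The helper names and the toy matrices are assumptions; real data would normally go through a numerical library.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlate_rows(x, y):
    """Return (row_in_x, row_in_y, r) for every pair of rows, one tuple per pair."""
    return [(gx, gy, pearson(vx, vy)) for gx, vx in x.items() for gy, vy in y.items()]


# Hypothetical matrices: each key is a gene, each list holds one value per sample.
x = {"geneA": [1, 2, 3, 4]}
y = {"geneB": [2, 4, 6, 8], "geneC": [4, 3, 2, 1]}
long_result = correlate_rows(x, y)
print(long_result)
```

Both matrices must share the same samples in the same order, since each coefficient pairs values position by position.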
Original Enrichment analysis with clusterProfiler
library(readr)
library(topGO)
library(KEGGREST)
library(clusterProfiler)
library(org.Hs.eg.db)
library(tidyverse)
test1 = bitr(gene, fromType="SYMBOL", toType=c("ENSEMBL", "ENTREZID"), OrgDb="org.Hs.eg.db")
ego_ALL <- enrichGO(gene = test1$ENTREZID,
2021-07-19 14:38:17 862
Original How to download and use miRanda
miRanda
wget http://cbio.mskcc.org/microrna_data/miRanda-aug2010.tar.gz
tar zxvf miRanda-aug2010.tar.gz
cd miRanda-3.3a/
./configure
make install
Installing on Linux is recommended.
miRanda takes FASTA files as input.
miranda 1.fasta 2.fasta -out results.txt
grep '>>' results &
2021-02-02 13:13:53 1724 3
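Original Using BioMart to convert gene IDs and retrieve gene information
BioMart
https://m.ensembl.org/biomart/martview/
Under Filters on the left, choose Gene Names and enter the ID list under "Input external references ID list".
Under Attributes on the left, choose the information to export. Different name types, GO terms, and so on can be selected.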
2021-02-02 12:58:09 750
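Original R: extracting target sequences from a multi-sequence FASTA file with seqinr
R seqinr
From a FASTA file that contains multiple sequences, select the required sequences in one batch.
library(seqinr)
all_fasta <- read.fasta('fasta.fasta')
# Rename the sequences to the desired form (drop everything from the first ":")
names(all_fasta) <- gsub(":.*", "", names(all_fasta))
# Select the target sequences
sub_fasta <- all_fasta[names(all_fasta) %in% target_list$name]
# Write the output file
wri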
2021-02-02 10:49:09 3709 1
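The three steps in the entry above (read, rename, select) can be sketched without any package. This is a simplified illustration, not a replacement for seqinr: the function names and the toy records are assumptions, and the parser only handles plain `>` headers followed by sequence lines.

```python
def parse_fasta(text):
    """Parse FASTA text into {name: sequence}, keeping only the header part before the first ':'."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split(":", 1)[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def subset_fasta(records, targets):
    """Keep only the records whose name appears in targets."""
    wanted = set(targets)
    return {n: s for n, s in records.items() if n in wanted}


def format_fasta(records):
    """Render records back to FASTA text, one sequence line per record."""
    return "".join(f">{n}\n{s}\n" for n, s in records.items())


# Hypothetical multi-sequence FASTA content and target list.
fasta_text = ">lnc1:chr1:100-200\nACGT\nTTGA\n>lnc2:chr2:5-50\nGGCC\n>lnc3:chr3:1-9\nAAAA\n"
selected = subset_fasta(parse_fasta(fasta_text), ["lnc1", "lnc3"])
print(format_fasta(selected))
```

Renaming before selecting matters here: the target list holds short names, so the headers must be trimmed first or nothing will match.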
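Original R gsub: renaming values in a column by deleting everything after a symbol
R gsub
a <- gsub("\\..*", "", a)
\\ escapes the special character "."; ".*" matches everything after it, so the dot and everything that follows are replaced by "".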
2021-02-02 09:38:14 3452
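Why the escape matters in the gsub entry above is easiest to see by running the pattern with and without it. The sketch below uses Python's `re` module, where the same regular expression applies; the function name and sample IDs are assumptions for illustration.

```python
import re


def strip_after_dot(value):
    """Remove the first literal dot and everything after it."""
    return re.sub(r"\..*", "", value)


# Hypothetical identifiers with version-like suffixes.
ids = ["geneA.1", "geneB.12.x", "geneC"]
stripped = [strip_after_dot(i) for i in ids]
print(stripped)

# Without the escape, "." matches any character, so the whole string is removed.
unescaped = re.sub(r"..*", "", "geneA.1")
print(repr(unescaped))
```

In R the backslash itself must be doubled inside the string literal, which is why the pattern is written `"\\..*"` there.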
Original R: converting gene IDs with mygene
mygene ID conversion
converter <- queryMany(gene_list, scope = "symbol", fields= c('ensembl.gene'), species = 'human')
gene_id_after_converter <- unlist(converter$ensembl)
unlist is used to flatten the returned ID values.
2021-02-02 09:21:03 586
Original R package installation failed
When an R package fails to install, try installing the binary version.
BiocManager::install("rtracklayer", type = 'binary')
2021-02-02 08:50:17 1242