Building a Gene Expression Atlas

小杜的生信筆記 (Xiao Du's Bioinformatics Notes)

Last modified: 2023-05-15 19:49:58

Tags: R, learning, information visualization, data mining, Linux

First published: 2023-05-15 19:45:09

Article link: https://blog.csdn.net/kanghua_du/article/details/130691363

Preface

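Hi everyone, I'm Xiao Du! I have had a lot going on lately, so tutorials on my official account have appeared at irregular intervals. As I said in earlier posts, sharing is only one part of my own learning. I do not do this full time (PS: I could not keep it up if I did), and my posts are simply a record of what I learn. So I only write up and share tutorials when I have free time (PS: always outside working hours). After all, doing my day job well is what keeps me "able" to go on sharing, right?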

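For the same reason, questions sent through the account backend mostly get answered only after a long wait. I can't help that: one person has limited energy and can't attend to everything. Still, a small number of readers have sent private messages or comments that go rather too far... There is nothing I can do about it! I can't make everyone happy and I'm not going to try, because my abilities are limited. If you find my tutorials useful, keep following; if not, simply don't follow.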

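**Bloggers who run official accounts have a strong spirit of sharing.** But not everyone is that selfless, so please treat them kindly. For small accounts like mine, the only income comes from promotions and readers' tips (PS: so please don't be too annoyed by the promotions; they are chosen to fit this account's focus, and if you need the service you can contact the staff involved).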

Species gene expression atlases

Paper:

https://academic.oup.com/bioinformatics/article/33/15/2397/3096436?login=false

GitHub repository:

https://github.com/solgenomics/Tea/tree/master

I came across this by chance while searching for papers today, although I had already used the website a long time ago.

http://tea.solgenomics.net/

What the expression atlas website is for

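As I understand it, the site visualizes the expression of each gene across the tissues of a crop, which makes it easy for researchers in this area to assess the expression level of a gene of interest.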

Click Expression Viewer to open the main viewer interface.

There are four datasets, most of them covering fruit development through ripening.

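The 2018 paper, published in Nature Communications (NC), is one I presented at a group meeting when I was a student. The 2022 paper is one I have looked up for my current research. The dataset highlighted in green in the middle belongs to a paper I have not found yet, and I have been searching for it all day.
Below the dataset list there are further selectors for fruit developmental stage, Organ, Tissues and Treatment. The coverage is thorough, and you can also upload your own research data for a species such as tomato.
To select genes, the top of the page lets you enter a gene ID, run a BLAST search, or supply a gene set.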
Visualization

Gene expression levels

Visualization across tissues

Scatter Plot

Heatmap

Different datasets give different results.

The drought stress data include a stress treatment dataset and a leaf dataset.

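Pretty cool, right? So how do you build a dataset like this?
The authors share a tutorial on GitHub, but following it takes strong programming, plotting and bioinformatics skills. It is also really a team effort; I think it would be very hard for one person working alone.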

Title:

The Tomato Expression Atlas

How to build it: the tutorial on GitHub

URL:

https://github.com/solgenomics/Tea/tree/master

Platform requirements

Install Catalyst, Perl and R dependencies
This web tool was developed using the Perl framework Catalyst (http://www.catalystframework.org), so running the application requires installing Perl, Catalyst and its dependencies.

Check this link in case of doubts installing Catalyst (http://www.catalystframework.org/#install).

To install Catalyst using cpanm, just execute: cpanm Catalyst::Devel

Also, if you are installing it on a new machine, you may need to install cpanminus, gcc and make, and then some Perl dependencies such as Catalyst, Lucy and Mason:

sudo aptitude install cpanminus
sudo aptitude install make
sudo aptitude install gcc
sudo aptitude install r-base
sudo aptitude install r-base-dev
sudo aptitude install postgresql
sudo aptitude install postgresql-server-dev-11    
cpanm -L ~/local-lib/ Catalyst::Devel
cpanm -L ~/local-lib/ Catalyst::Runtime
cpanm -L ~/local-lib/ Mason
cpanm -L ~/local-lib/ Statistics::R
cpanm -L ~/local-lib/ Catalyst::ScriptRunner
cpanm -L ~/local-lib/ Catalyst::Controller::REST
cpanm -L ~/local-lib/ Catalyst::View::HTML::Mason
cpanm -L ~/local-lib/ Lucy::Simple
cpanm -L ~/local-lib/ Array::Utils
cpanm -L ~/local-lib/ DBIx::Class
cpanm -L ~/local-lib/ Bio::Perl
cpanm -L ~/local-lib/ Bio::BLAST::Database
cpanm -L ~/local-lib/ DBD::Pg

If you are having trouble installing cpanm, there may be an issue with your system’s dependencies. Visit (https://library.linode.com/linux-tools/utilities/cpanm) for help with installing dependencies.

In case local-lib is not in the path, you have to add the following line in the .bashrc file (for a local-lib in your home)

 export PERL5LIB=/home/username/local-lib/lib/perl5:$PERL5LIB

You might also need to add the next line to your .bashrc

export PERL5LIB=$PERL5LIB:/home/username/path_to_tea/Tea/

Do not forget to source .bashrc to be sure these changes take effect.
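
Once the PERL5LIB lines are in place, it can help to confirm that Perl actually finds the installed modules. The following is a minimal Python sketch, not part of the TEA repository. It assumes the modules were installed with `cpanm -L ~/local-lib/` as above, and it checks only a subset of them by running `perl -MModule -e 1`. The helper names `perl5lib_with` and `module_loads` are hypothetical.

```python
import os
import shutil
import subprocess

# A subset of the modules from the cpanm commands above; extend as needed.
MODULES = [
    "Catalyst::Runtime",
    "Catalyst::Controller::REST",
    "Mason",
    "Statistics::R",
    "Lucy::Simple",
    "DBIx::Class",
    "DBD::Pg",
]


def perl5lib_with(local_lib, current=""):
    """Return a PERL5LIB value with <local_lib>/lib/perl5 first and no duplicate of it."""
    entry = os.path.join(local_lib, "lib", "perl5")
    parts = [p for p in current.split(os.pathsep) if p and p != entry]
    return os.pathsep.join([entry] + parts)


def module_loads(module, perl5lib):
    """Return True if `perl -M<module> -e 1` exits with status 0."""
    env = dict(os.environ, PERL5LIB=perl5lib)
    result = subprocess.run(
        ["perl", "-M" + module, "-e", "1"], env=env, capture_output=True
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Assumption: same local-lib location as in the cpanm commands.
    local_lib = os.path.expanduser("~/local-lib")
    perl5lib = perl5lib_with(local_lib, os.environ.get("PERL5LIB", ""))
    if shutil.which("perl") is None:
        print("perl not found on PATH")
    else:
        for name in MODULES:
            print(name, "ok" if module_loads(name, perl5lib) else "MISSING")
```

Any module reported as MISSING can be reinstalled with the matching cpanm command before starting the server.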

R v3 must be installed for the interactive heatmap. The R libraries ‘d3heatmap’, ‘NOISeq’ and ‘htmlwidgets’ should also be installed.

Clone Github repository
Go to the TEA repository at GitHub (https://github.com/solgenomics/Tea) and copy the link to clone this repository.

Go to your terminal, to the folder where you want to clone this repository and use the next command (using the link copied from the web):

git clone git@github.com:solgenomics/Tea.git

git clone https://github.com/solgenomics/Tea.git

You can run the local server to check Catalyst is running fine. If you are running it on a server, you should also check that the Apache or Nginx configuration is correct and the ports are open on the firewall.

Go to the folder Tea, created when you cloned the repository, and run the server to check that all the dependencies are installed.

cd Tea/
script/tea_server.pl -r -d --fork

If you get an error, you will probably need to go back to step one and install some dependencies.
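
To confirm from another terminal that the development server is answering, a small probe like the one below may be useful. It is only a sketch: the URL assumes the Catalyst development server is listening on its default port 3000 on localhost, which may differ in your setup, and `server_responds` is a hypothetical helper name.

```python
import urllib.error
import urllib.request


def server_responds(url, timeout=5.0):
    """Return True if an HTTP server answers at `url`, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with a 4xx/5xx status.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


if __name__ == "__main__":
    # Assumption: default Catalyst development port on the local machine.
    url = "http://localhost:3000/"
    print(url, "is up" if server_responds(url, timeout=3.0) else "is not reachable")
```

An answer with an error status still counts as "up" here, because the aim is only to see whether the process is listening; application errors show up in the server's debug output.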

Configuration file

dbhost localhost
dbname my_db
dbuser web_usr
dbpass password

expression_indexes_path /home/user/index_files/expression
correlation_indexes_path /home/user/index_files/correlation
loci_and_description_index_path /home/user/index_files/description

#path to mason folder to overwrite default front-end
<View::Mason>
  add_comp_root /home/user/path_to_new_mason_dir
</View::Mason>

nt_blastdb_path /home/user/blastdbs/cdna_file.fasta
prot_blastdb_path /home/user/blastdbs/prots_file.fasta
tmp_path /home/user/tea_tmp_files

default_gene gene_name
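
A configuration like this is easy to get wrong, for example by pointing to an index directory that has not been created yet. The sketch below is a deliberately simplified reader for the `key value` and `<Section>` layout shown above. It is not the parser Catalyst itself uses, and it ignores features such as quoting or line continuation. The function names and the list of checked keys are assumptions made for illustration.

```python
import os

# Keys from the example above that are expected to point at directories.
DIRECTORY_KEYS = [
    "expression_indexes_path",
    "correlation_indexes_path",
    "loci_and_description_index_path",
    "tmp_path",
]


def parse_tea_conf(text):
    """Parse 'key value' lines; <Section> ... </Section> blocks become nested dicts."""
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("</") and line.endswith(">"):
            if len(stack) > 1:
                stack.pop()  # leave the current section
            continue
        if line.startswith("<") and line.endswith(">"):
            section = {}
            stack[-1][line[1:-1].strip()] = section
            stack.append(section)
            continue
        parts = line.split(None, 1)
        stack[-1][parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return root


def missing_directories(conf):
    """Return the directory keys that are unset or point to a non-existent directory."""
    return [k for k in DIRECTORY_KEYS if not os.path.isdir(conf.get(k, ""))]


if __name__ == "__main__":
    sample = """
dbhost localhost
dbname my_db
tmp_path /home/user/tea_tmp_files
<View::Mason>
  add_comp_root /home/user/path_to_new_mason_dir
</View::Mason>
"""
    conf = parse_tea_conf(sample)
    print(conf["dbname"], conf["View::Mason"]["add_comp_root"])
    print("missing:", missing_directories(conf))
```

Running `missing_directories` on your real configuration before starting the server gives a quick list of paths that still need to be created.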

Create database
Install PostgreSQL, create a database to store your project metadata and import the schema to the database:

On postgres terminal:

CREATE DATABASE my_db;

On the Linux terminal, create the database schema by importing the file create_tea_schema.sql from the import_project folder:

psql -U postgres -d my_db -h localhost -a -f create_tea_schema.sql

Use TEA_project_template.txt and TEA_project_template_example.txt from import_project to create your project import file:

# Please use one line per field and one file per project. Do not edit or remove any line starting with #

#organism
organism_species: Solanum lycopersicum
organism_variety: M82
organism_description: Tomato M82
# organism - end

#project
project_name: S. lycopersicum M82 Fruit Development
project_contact: Jocelyn Rose
project_description: Fruit development from anthesis to red ripe for whole fruit and for the cell types from the pericarp obtained by Laser Capture Microdissection (LCM)
expr_unit: RPM
index_dir_name: tomato_index
# project - end


# figure --- All info needed for a cluster of images (usually includes a stage and all its tissues). Copy this block as many times as you need (including as many tissue layer blocks as you need).
figure_name: 10DPA Total Pericarp
conditions: condition 1, condition 2
# write figure metadata

#stage layer
layer_name: 10DPA
layer_description: Ten days post anthesis
layer_type: stage
bg_color:
layer_image: slm82_fruit_10dpa_bg.png
image_width: 250
image_height: 500
cube_ordinal: 10
img_ordinal: 10
organ: fruit
# layer - end

#tissue layer
layer_name: Total_Pericarp
layer_description:
layer_type: tissue
bg_color:
layer_image: cassava_leaf.png
image_width: 250
image_height: 500
cube_ordinal: 100
img_ordinal: 100
organ: fruit
# layer - end

# figure - end

The template continues beyond this point; see the GitHub repository for the rest.
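
Before importing a project file, it can be worth checking that its figures and layers are grouped the way you intended. The sketch below reads the `key: value` lines of the template into a nested structure. It is not the importer from the TEA repository. It assumes a new figure starts at each `figure_name` line and a new layer at each `layer_name` line, and it ignores the `#` comment lines instead of using them as block boundaries. The function name `parse_project_template` is hypothetical.

```python
def parse_project_template(text):
    """Group template fields into project-level fields, figures, and layers."""
    project = {"fields": {}, "figures": []}
    figure = None
    layer = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # comments and blank lines carry no data
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'key: value' line
        key, value = key.strip(), value.strip()
        if key == "figure_name":
            figure = {"figure_name": value, "layers": []}
            project["figures"].append(figure)
            layer = None
        elif key == "layer_name":
            if figure is None:
                raise ValueError("layer_name found before any figure_name")
            layer = {"layer_name": value}
            figure["layers"].append(layer)
        elif layer is not None:
            layer[key] = value
        elif figure is not None:
            figure[key] = value
        else:
            project["fields"][key] = value
    return project


if __name__ == "__main__":
    sample = """
#project
project_name: S. lycopersicum M82 Fruit Development
expr_unit: RPM
# figure
figure_name: 10DPA Total Pericarp
conditions: condition 1, condition 2
#stage layer
layer_name: 10DPA
layer_type: stage
#tissue layer
layer_name: Total_Pericarp
layer_type: tissue
"""
    parsed = parse_project_template(sample)
    for fig in parsed["figures"]:
        kinds = [lay.get("layer_type", "") for lay in fig["layers"]]
        print(fig["figure_name"], kinds)
```

Printing the layer types per figure makes it easy to spot a figure that is missing its stage layer or a tissue layer that ended up under the wrong figure.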

https://tea.solgenomics.net/ really is a treasure trove of a site, with a lot of content that is worth exploring for yourself.