mcmctree

baseml.ctl

Input tree file: only the root node is labeled with a time, using @

   10   1
((((Ccry,Tpse),(Ptri,((Fcyl,Pmul),(nitz4,Nlae)))),(Ngad,Esil)),Igal)'@ 1.9';
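
The first line of the tree file declares the number of species and the number of trees, so it should match the tips in the Newick string. A minimal sketch of that sanity check follows, assuming the tree has no branch lengths and tip names start with a letter; the helper name `check_tree_file` is made up for illustration.

```python
import re

def check_tree_file(text):
    """Return (declared number of taxa, list of tip names) for a PAML-style tree file."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_taxa, n_trees = (int(x) for x in lines[0].split()[:2])
    newick = "".join(lines[1:])
    # Drop quoted calibration labels such as '@ 1.9' before extracting names.
    stripped = re.sub(r"'[^']*'", "", newick)
    # Assumption: no branch lengths, and tip names start with a letter or underscore.
    names = re.findall(r"[A-Za-z_][A-Za-z0-9_.]*", stripped)
    return n_taxa, names

tree_text = """   10   1
((((Ccry,Tpse),(Ptri,((Fcyl,Pmul),(nitz4,Nlae)))),(Ngad,Esil)),Igal)'@ 1.9';
"""
declared, names = check_tree_file(tree_text)
print(declared, len(names), names)
```

If the declared count and the number of names differ, the header line or the tree is probably out of date.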

      seqfile = ../single-copy.cds.ok.phy.trim * change!!! sequence data file name
      outfile = mlb        * change!!! main result file
     treefile = intree  * change!!! tree structure file name

        noisy = 3   * 0,1,2,3: how much rubbish on the screen
      verbose = 1   * 1: detailed output, 0: concise output
      runmode = 0   * 0: user tree;  1: semi-automatic;  2: automatic
                    * 3: StepwiseAddition; (4,5):PerturbationNNI 

        model = 7   * 0:JC69, 1:K80, 2:F81, 3:F84, 4:HKY85, 5:T92, 6:TN93, 7:REV (GTR), 8:UNREST
        Mgene = 0   * 0:rates, 1:separate; 2:diff pi, 3:diff kappa, 4:all diff

    fix_kappa = 0
        kappa = 2   * initial or given kappa

    fix_alpha = 0 
        alpha = 0.5  * initial or given alpha, 0:infinity (constant rate)
       Malpha = 0   * 1: different alpha's for genes, 0: one alpha
        ncatG = 5   * # of categories in the dG, AdG, or nparK models of rates

      fix_rho = 1  
          rho = 0.  * initial or given rho,   0:no correlation
        nparK = 0   * rate-class models. 1:rK, 2:rK&fK, 3:rK&MK(1/K), 4:rK&MK 
      TipDate = 1 10000000 * change!!!    time unit (years)
        clock = 1   * 0: no clock, unrooted tree, 1: clock, rooted tree
        nhomo = 1   * 0 & 1: homogeneous, 2: kappa's, 3: N1, 4: N2
        getSE = 1   * 0: don't want them, 1: want S.E.s of estimates
 RateAncestor = 0   * (1/0): rates (alpha>0) or ancestral states (alpha=0)
    cleandata = 0  * remove sites with ambiguity data (1:yes, 0:no)?

grep -A 1 "Substitution " mlb
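rgene_gamma = a/a a/a^2   (i.e. 1 and 1/a, where a is the substitution rate reported above; this gamma prior has shape 1 and mean a)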
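
The rule above can be sketched in a few lines of Python. A gamma prior written as `rgene_gamma = a b` has mean a/b, so fixing the shape and dividing it by the rate gives the second parameter. The function name and the example rate of 0.8 are illustrative assumptions; 0.8 is chosen only because it reproduces the `1 1.25` used in the control files in these notes.

```python
def rgene_gamma_from_rate(rate, shape=1.0):
    """Return (a, b) for a gamma prior whose mean a/b equals `rate`."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return shape, shape / rate

# Illustrative rate per time unit; use the value grep'ed from mlb instead.
a, b = rgene_gamma_from_rate(0.8)
print(f"rgene_gamma = {a:g} {b:g}")
```

The rate must be expressed in the same time unit as the calibrations in the tree file, otherwise the prior mean is off by the unit conversion factor.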

mcmctree0.ctl: first mcmctree run, for the ML analysis under the substitution model

The key setting here is the usedata parameter
          seed = -100
       seqfile = ../single-copy.cds.ok.phy.trim
      treefile = ../intree
       outfile = out

         ndata = 1
       seqtype = 0  * 0: nucleotides; 1:codons; 2:AAs
       usedata = 3    * change!!!  0: no data; 1:seq like; 2:use in.BV; 3: out.BV
         clock = 2    * 1: global clock; 2: independent rates; 3: correlated rates
       RootAge = <1.9  * safe constraint on root age, used if no fossil for root.

         model = 0    * 0:JC69, 1:K80, 2:F81, 3:F84, 4:HKY85
         alpha = 0    * alpha for gamma rates at sites
         ncatG = 5    * No. categories in discrete gamma

     cleandata = 0    * remove sites with ambiguity data (1:yes, 0:no)?

       BDparas = 1 1 0    * birth, death, sampling
   kappa_gamma = 6 2      * gamma prior for kappa
   alpha_gamma = 1 1      * gamma prior for alpha
   rgene_gamma = 1 1.25   * change!!!  gamma prior for overall rates for genes
  sigma2_gamma = 1 4.5    * gamma prior for sigma^2     (for clock=2 or 3)

      finetune = 1: 0.1  0.1  0.1  0.01 .5  * auto (0 or 1) : times, musigma2, rates, mixing, paras, FossilErr

         print = 1
        burnin = 2000
      sampfreq = 2
       nsample = 20000

*** Note: Make your window wider (100 columns) before running the program.
mv out.BV in.BV

mcmctree_1.ctl: run this twice

          seed = -1
       seqfile = ../single-copy.cds.ok.phy.trim
      treefile = ../intree 
       outfile = out

         ndata = 1
       seqtype = 0  * 0: nucleotides; 1:codons; 2:AAs
       usedata = 2    * 0: no data; 1:seq like; 2:use in.BV; 3: out.BV
         clock = 3    * 1: global clock; 2: independent rates (recommended); 3: correlated rates
       RootAge = <1.9  * safe constraint on root age, used if no fossil for root.

         model = 0    * 0:JC69, 1:K80, 2:F81, 3:F84, 4:HKY85
         alpha = 0    * alpha for gamma rates at sites
         ncatG = 5    * No. categories in discrete gamma

     cleandata = 0    * remove sites with ambiguity data (1:yes, 0:no)?

       BDparas = 10 10 0    * birth, death, sampling
   kappa_gamma = 6 2      * gamma prior for kappa
   alpha_gamma = 1 1      * gamma prior for alpha

   rgene_gamma = 1 1.25   * gamma prior for overall rates for genes
  sigma2_gamma = 1 4.5    * gamma prior for sigma^2     (for clock=2 or 3)

      finetune = 1: 0.1  0.1  0.1  0.01 .5  * auto (0 or 1) : times, musigma2, rates, mixing, paras, FossilErr

         print = 1
        burnin = 8000
      sampfreq = 2
       nsample = 30000

*** Note: Make your window wider (100 columns) before running the program.

Inspect the out file, plot a fitted curve of the posterior times, and check the results
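
One way to check the two repeated runs is to compare their posterior mean node ages before plotting them against each other. The sketch below assumes each run wrote a whitespace-delimited sample table with a header row in which node-age columns start with `t_` (the layout assumed here for mcmctree's `mcmc.txt`); the function names and the tiny inline tables are made up for illustration.

```python
import statistics

def posterior_means(lines, burnin_rows=0):
    """Mean of every node-age column (header starting with 't_') in a sample table."""
    rows = [ln.split() for ln in lines if ln.strip()]
    header, data = rows[0], rows[1 + burnin_rows:]
    cols = list(zip(*data))
    return {name: statistics.fmean(map(float, col))
            for name, col in zip(header, cols)
            if name.startswith("t_")}

def max_relative_difference(run1, run2):
    """Largest relative gap between the two runs' posterior mean node ages."""
    return max(abs(run1[k] - run2[k]) / run1[k] for k in run1)

# Tiny made-up tables standing in for the sample files of run 1 and run 2.
run1 = ["Gen\tt_n11\tt_n12\tmu\tlnL",
        "0\t1.50\t1.00\t0.8\t-10",
        "2\t1.70\t1.20\t0.9\t-11"]
run2 = ["Gen\tt_n11\tt_n12\tmu\tlnL",
        "0\t1.60\t1.10\t0.8\t-10",
        "2\t1.76\t1.10\t0.9\t-11"]

m1, m2 = posterior_means(run1), posterior_means(run2)
print(m1, m2, max_relative_difference(m1, m2))
```

With real output, pass open file handles instead of the inline lists, for example `posterior_means(open("run1/mcmc.txt"))`, where the path is a placeholder. Large gaps between the runs suggest a longer burnin or more samples are needed.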

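### OrthoFinder: features and usage

#### 1. Main features of OrthoFinder

OrthoFinder is a comparative genomics tool. Its core purpose is to identify orthogroups across species and to build phylogenetic trees from them, which helps in studying how gene families evolve and how species are related[^1].

Specifically, OrthoFinder provides the following:

- **Orthogroup identification**: clusters the protein sequences of multiple species into orthogroups.
- **Gene family expansion and contraction**: the per-species gene counts and inferred gene duplication events show which families have grown or shrunk; formal statistical tests of expansion and contraction are usually run with a separate tool on these counts.
- **Phylogenetic tree inference**: reconstructs the species-level tree and also builds a separate gene tree for each gene family.

If you prefer not to configure a local environment, the program can also be run through an online platform such as Galaxy, which avoids installing any dependencies locally.

#### 2. Basic workflow

##### Data preparation

To start a complete run, prepare the following inputs:

1. A FASTA file of all predicted protein sequences for each species under study;
2. Optional parameter settings.

##### Typical command-line (CLI) invocation after installation

Assuming the current working directory contains a subdirectory named `proteins/` that holds the FASTA files described above, the following bash commands perform the initial analysis:

```bash
mkdir results && cd results
orthofinder -f ../proteins/
```

This command runs the sequence similarity search, then the graph-based clustering, and finally writes the intermediate files and summary reports[^3].

When the run finishes, a new `Results_<date>` directory is created by default under the specified root directory to hold all output. It includes, among others, these important parts:

- `Orthogroups/*`: tables describing every inferred orthogroup;
- `Gene_Trees/*`: the phylogenetic tree of the members of each orthogroup;
- `Species_Tree/*`: the overall tree describing the relationships among the species.

If you want the single-copy genes shared by all species as universal markers, for example for later divergence time calibration, look up the dedicated list in the `Orthogroups` directory: `Orthogroups_SingleCopyOrthologues.txt`.

#### 3. Advanced example

Some projects need extra customization. The shell script below sketches how to collect the single-copy orthologue files before packaging the sequences into the input format needed by a downstream PAML mcmctree pipeline:

```bash
#!/bin/bash
set -euo pipefail

# Placeholder paths; adjust them to your project
mypath=~/your_project_directory_here
OrthofinderPATH="${mypath}/OrthoFinder/Results_Jun12"

# Create a new working directory and enter it
mkdir -p mcmctreeAnalysis && cd mcmctreeAnalysis

# Copy the files needed downstream
cp "${OrthofinderPATH}/Orthogroups/Orthogroups_SingleCopyOrthologues.txt" .
cp "${OrthofinderPATH}/Orthogroups/Orthogroups.txt" .
cp "${OrthofinderPATH}/Species_Tree/SpeciesTree_rooted.txt" .

# (Details of parsing these files and locating the corresponding original .faa records are omitted here...)
```

This is only a skeleton for reference; adapt it to the needs of your own project.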
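
The parsing step omitted from the script above could look roughly like the following sketch. It assumes `Orthogroups.txt` uses one line per orthogroup in the form `OG0000000: gene1 gene2 ...` and that `Orthogroups_SingleCopyOrthologues.txt` lists one orthogroup ID per line; the function name and the gene IDs in the example are hypothetical.

```python
def single_copy_members(orthogroups_lines, single_copy_lines):
    """Map each single-copy orthogroup ID to the list of its gene IDs."""
    wanted = {ln.strip() for ln in single_copy_lines if ln.strip()}
    members = {}
    for ln in orthogroups_lines:
        og, sep, genes = ln.partition(":")
        if sep and og.strip() in wanted:
            members[og.strip()] = genes.split()
    return members

# Hypothetical file contents, for illustration only.
orthogroups = ["OG0000000: Ccry_g1 Ccry_g2 Tpse_g7",
               "OG0000001: Ccry_g3 Tpse_g9",
               "OG0000002: Ccry_g4 Tpse_g5"]
single_copy = ["OG0000001", "OG0000002"]

members = single_copy_members(orthogroups, single_copy)
print(members)
```

The resulting gene IDs can then be used to pull the matching records out of each species' sequence file before alignment; with real output, pass open file handles for the two text files instead of the inline lists.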