Single-cell proteomics | Control false-positive peptide identifications (FDR) with DART-ID

王元启的生信记录 (Wang Yuanqi's Bioinformatics Notes)

Last modified: 2024-06-18 10:52:26

First published: 2024-06-18 10:16:13

Article link: https://blog.csdn.net/wcy1995427/article/details/139765619

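Another productive day. I have recently been working through single-cell proteomics data processing. Yesterday I covered how to use DIA-NN; today's topic is DART-ID, a tool for controlling false-positive peptide identifications.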

Paper: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007082

Source code: https://github.com/SlavovLab/DART-ID

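DART-ID comes from the same group that developed the SCoPE2 method. Building on that work, they wrote DART-ID to control false-positive peptide identifications and make peptide identification more accurate. DART-ID implements a principled Bayesian framework for global retention time (RT) alignment and for incorporating the RT estimates into the confidence estimates of peptide-spectrum matches (PSMs). According to the paper, when applied to bulk or single-cell samples, DART-ID increased the number of data points at 1% FDR by 30-50%, which reduces missing data.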
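To make the idea of folding RT evidence into PSM confidence more concrete, here is a minimal sketch of a Bayesian update of a posterior error probability (PEP). It is a simplified illustration, not DART-ID's actual model: the function names, the Laplace and normal densities, and all numeric values (in minutes) are assumptions chosen for the example.

```python
import math


def laplace_pdf(x, loc, scale):
    """Density of a Laplace distribution centred at loc."""
    return math.exp(-abs(x - loc) / scale) / (2.0 * scale)


def normal_pdf(x, loc, scale):
    """Density of a normal distribution with mean loc and sd scale."""
    z = (x - loc) / scale
    return math.exp(-0.5 * z * z) / (scale * math.sqrt(2.0 * math.pi))


def update_pep(pep, rt_observed, rt_expected, rt_scale, null_mean, null_sd):
    """Combine a spectral PEP with RT evidence using Bayes' theorem.

    Assumption: a correct PSM elutes close to the aligned (expected) RT,
    while an incorrect PSM can elute anywhere in the run (null density).
    """
    like_correct = laplace_pdf(rt_observed, rt_expected, rt_scale)
    like_incorrect = normal_pdf(rt_observed, null_mean, null_sd)
    numerator = pep * like_incorrect
    return numerator / (numerator + (1.0 - pep) * like_correct)


# Hypothetical PSM with spectral PEP 0.05 and an expected RT of 30.0 min.
pep_close = update_pep(0.05, 30.2, 30.0, 0.5, 40.0, 15.0)  # elutes near expectation
pep_far = update_pep(0.05, 55.0, 30.0, 0.5, 40.0, 15.0)    # elutes far from expectation
print(pep_close, pep_far)
```

In this sketch, a PSM observed near its expected RT ends up with a lower error probability than the spectrum alone gave it, while one far from the expected RT is penalized. That is how more PSMs can pass a fixed FDR threshold.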

Usage:

1. Install in an environment with Python >= 3.7

pip install dart-id

2. If your Python version is newer than 3.8, you may run into the following problems:

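Error: cannot import name 'gcd' from 'fractions'
### fractions.gcd was removed in Python 3.9; change the offending import to
from math import gcd
Error: 'DataFrame' object has no attribute 'append'
### DataFrame.append was removed in pandas 2.0; downgrading to pandas==1.5.3 made the run succeed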
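The import fix can also be written so that it works on both old and new interpreters. This is a small sketch of that pattern, not a patch taken from the DART-ID repository; which file needs the change depends on where the traceback points.

```python
import sys

# math.gcd exists since Python 3.5; fractions.gcd was removed in Python 3.9.
try:
    from math import gcd
except ImportError:  # only reached on very old interpreters
    from fractions import gcd

print("Python", sys.version_info[:2], "gcd(12, 18) =", gcd(12, 18))
```

For the pandas error, pinning the version as described above is the least invasive fix. Code you control can instead switch from `DataFrame.append` to `pandas.concat`, which is the replacement in pandas 2.x.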

3. Run the DART-ID analysis from the command line

dart_id -c config_files/example_data.yaml -o ./DART_ID/output

Run dart_id -h to see the available options, for example: -i ./evidence.txt; -o ./output; -c ./config_example.yaml.

usage: dart_id [-h] [-i INPUT [INPUT ...]] [-o OUTPUT] [-v] [--version] -c
                 CONFIG_FILE
optional arguments:
  -h, --help            show this help message and exit
  -i INPUT [INPUT ...], --input INPUT [INPUT ...]
                        Input file(s) from search engine output (e.g.,
                        MaxQuant evidence.txt). Not required if input files
                        are specified in the config file
  -o OUTPUT, --output OUTPUT
                        Path to output folder
  -v, --verbose
  --version             Display the program's version
  -c CONFIG_FILE, --config-file CONFIG_FILE
                        Path to config file (required). See example/config_example.
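If you want to launch DART-ID from a Python script, for example to loop over several config files, the command can be assembled from the flags in the help text above. The helper names `build_dart_id_command` and `run_dart_id` are hypothetical and not part of DART-ID; only the `-c`, `-o`, `-i`, and `-v` flags come from the usage output.

```python
import shutil
import subprocess


def build_dart_id_command(config_file, output_dir, input_files=None, verbose=False):
    """Build the argument list for the dart_id CLI (hypothetical helper)."""
    cmd = ["dart_id", "-c", config_file, "-o", output_dir]
    if input_files:
        # -i accepts one or more search-engine output files
        cmd += ["-i", *input_files]
    if verbose:
        cmd.append("-v")
    return cmd


def run_dart_id(cmd):
    """Run the command, failing early if dart_id is not on PATH."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("dart_id not found; try: pip install dart-id")
    subprocess.run(cmd, check=True)


cmd = build_dart_id_command("config_files/example_data.yaml", "./DART_ID/output")
print(" ".join(cmd))
# run_dart_id(cmd)  # uncomment once dart_id is installed and the paths exist
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.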