Abstract:
Transcription factors (TFs) are the core sentinels of gene regulation, functioning by binding to highly specific DNA sequences to activate or repress the recruitment of RNA polymerase. The ability to identify transcription factor binding sites (TFBSs) is necessary to understand gene regulation and infer regulatory networks. Although bioinformatics tools for the computational identification of TFBSs have been under development for years, accurate prediction remains challenging because the DNA motifs recognized by TFs are typically short and often lack obvious patterns. In this study we introduce a new attribute, motif distribution pattern (MDP), to assist in TFBS prediction. MDP was developed using a TF distribution pattern curve generated by analyzing 25 yeast TFs and 37 of their experimentally validated binding motifs, followed by calculating a scoring value to quantify the reliability of each motif prediction. MDP was then validated in silico using another set of 7 TFs with known binding sites. The method was further tested in a non-yeast system using the transcription factor MoCRZ1 from the filamentous fungus Magnaporthe oryzae. We demonstrate superior prediction reranking results using MDP over the commonly used program MEME and four other predictors. The data show significant improvements in the ranking of validated TFBSs, and MDP provides a more sensitive, statistics-based approach for motif discovery.