Reposted from the original article:
(http://www.longwoodgenomics.org/2014/10/11/levels-of-bioinformatics-research/#comments)
Abridged. Copyright belongs to the original author; for study and discussion purposes only.
Level 0
is “modeling for modeling’s sake”.
This is totally OK if the scientists only consider themselves mathematicians, statisticians, computer scientists, or physicists since there are indeed many good theoretical modeling problems in their respective fields, but not OK if they are serious about bioinformatics or computational biology research.
Level 1
is analyzing unpublished data from their own lab or collaborators and trying to make novel biological findings.
A Level 1 study can be evaluated on the following criteria:
- how complicated the data are (e.g. the total data volume and data types);
- whether the bioinformatician needed to create new algorithms or only used other people’s tools to analyze the data (check the methods section);
- how essential the bioinformatics analysis is to the overall project (e.g. how many figures were generated by the bioinformatician, and whether the main hypothesis came from an informatics analysis);
- whether the experimental and computational work had real, fruitful interactions. In a published paper, more cycles of experimental / computational result description suggest that experiments and computational analyses informed each other for the next step of experiments / analyses. This contrasts with studies where all the data were generated first and bioinformatics analysis then summarized and integrated them, which sometimes produces no real findings, only descriptive results with no experimental validation;
- whether the study contains real and significant biological findings (judged from reading the abstract).
Level 2
is developing
- methods to solve general quantitative problems in big data studies that are especially relevant to biomedical research (e.g. Qvalue for FDR),
- computational algorithms for analyzing data from a new high throughput technique (e.g. RMA or Bowtie), or
- databases or resources for integrating many other public data (e.g. Oncomine).
I consider this a higher level of bioinformatics research, since in a Level 1 project the bioinformatician only helps their own collaborators, while a good Level 2 project can help thousands of other biologists. Usually these algorithms or resources should address an important and timely biological problem or technical challenge. They don’t have to be published in a high-profile venue, and only time can tell their real significance, based on usage and citations. The method may or may not be extremely novel (a previously developed statistical or computational method applied to a new biological problem is sufficiently novel), but it really has to work and be user friendly. The developers often need to put in a lot of additional effort after the initial publication to maintain and update the algorithms / resources, even without further publications. The developers don’t necessarily get sufficient credit from the publication directly, but will do well (when their papers or grants get reviewed) by doing good for the community. Also, to do well in Level 2 research, bioinformaticians should stay focused on their biological domain, so that they have a good understanding of the new computational methods or experimental techniques that are the most relevant or useful in that domain.
Level 3
is integrating public high throughput data in a smart way to make good biological findings, so the study often starts from public data and ends in experimental validation. This requires the bioinformatician to have solid biological knowledge and to be able to come up with their own interesting biological questions. The bioinformatician can lead a biological project in which experimental collaborators trust the correctness and significance of the predictions enough to be willing to conduct experimental validation. Some well-designed Level 3 findings can even be validated in silico, although unfortunately experimental biologists sometimes do not accept even a solid in silico validation. With more and more public data on resources like GEO, there will be increasing opportunities for Level 3 research. These studies should be evaluated by whether the biological question is interesting, whether the integration is smart and sound, and often by the level of the journal where the study is published (as compared to pure experimental studies).
A bioinformatician in training should probably first learn the basic bioinformatics skills and start on a Level 1 project, then move towards Level 2 and Level 3 projects as his / her biological understanding and computational techniques improve. As the bioinformatician matures and gains experience over time, s/he should preferably have a balance of Level 1, 2, and 3 projects, with the option of doing some Level X studies. In fact, if resources allow, it is probably healthier for an established bioinformatics PI to conduct research at all levels from 1 to X than at just one level. There are also many bioinformaticians, including myself, who are starting to conduct experimental research and generate experimental data themselves. The experimental component of their research should be compared with that of other experimental biologists, and the informatics component of their research could still be evaluated according to the above 5 categories.
**Next time, when you read genomics and bioinformatics papers, ask “what is the level of their bioinformatics work?” Try to evaluate the bioinformatics work objectively, instead of by the impact factor of the journal in which the study is published.**