Java Code for Checking DNA Base Sequences

Latest recommended article published on 2024-05-08 10:19:57

earthquake_aaa



Category: Algorithms. Tags: DNA, base pairing, java

Article link: https://blog.csdn.net/u013123021/article/details/50858797


This assignment focuses on arrays and file/text processing. Turn in a file named DNA.java. You will also need the two input files dna.txt and ecoli.txt from the course web site. Save these files in the same folder as your program. The assignment involves processing data from genome files. Your program should work with the two given input files. If you are curious (this is not required), the National Center for Biotechnology Information publishes many other bacteria genome files. The last page tells you how to use your program to process other published genome files.

Background Information About DNA:

Note: This section explains some information from the field of biology that is related to this assignment. It is for your information only; you do not need to fully understand it to complete the assignment.

Deoxyribonucleic acid (DNA) is a complex biochemical macromolecule that carries genetic information for cellular life forms and some viruses. DNA is also the mechanism through which genetic information from parents is passed on during reproduction. DNA consists of long chains of chemical compounds called nucleotides. Four nucleotides are present in DNA: Adenine (A), Cytosine (C), Guanine (G), and Thymine (T). DNA has a double-helix structure (see diagram below) containing complementary chains of these four nucleotides connected by hydrogen bonds. Certain regions of the DNA are called genes. Most genes encode instructions for building proteins (they're called "protein-coding" genes). These proteins are responsible for carrying out most of the life processes of the organism. Nucleotides in a gene are organized into codons. Codons are groups of three nucleotides and are written as the first letters of their nucleotides (e.g., TAC or GGA). Each codon uniquely encodes a single amino acid, a building block of proteins. The process of building proteins from DNA has two major phases called transcription and translation: a gene is first transcribed into an intermediate form called mRNA, which is then processed by a structure called a ribosome to build the chain of amino acids encoded by the codons of the gene.

The sequences of DNA that encode proteins occur between a start codon (which we will assume to be ATG) and a stop codon (which is any of TAA, TAG, or TGA). Not all regions of DNA are genes; large portions that do not lie between a valid start and stop codon are called intergenic DNA and have other (possibly unknown) functions. Computational biologists examine large DNA data files to find patterns and important information, such as which regions are genes. Sometimes they are interested in the percentages of mass accounted for by each of the four nucleotide types. Often high percentages of Cytosine (C) and Guanine (G) are indicators of important genetic data. For more information, see the Wikipedia page about DNA.
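To make the start/stop idea concrete, here is a minimal Java sketch that looks for the first ATG in a string and then reads codons in that frame until it meets TAA, TAG, or TGA. The class name `OrfScanSketch` and the method `firstCodingRegion` are hypothetical names chosen for illustration. The assignment itself does not ask for this scan, since its input sequences are already delimited.

```java
import java.util.Arrays;
import java.util.List;

class OrfScanSketch {
    // Start and stop codons as stated in the text above.
    static final String START_CODON = "ATG";
    static final List<String> STOP_CODONS = Arrays.asList("TAA", "TAG", "TGA");

    /**
     * Returns the region from the first ATG through the first stop codon
     * found in the same reading frame, or null if there is no such region.
     * (Illustrative helper, not part of the assignment.)
     */
    static String firstCodingRegion(String dna) {
        String s = dna.toUpperCase();
        int start = s.indexOf(START_CODON);
        if (start < 0) {
            return null; // no start codon at all
        }
        // Step three nucleotides at a time, beginning right after the start codon.
        for (int i = start + 3; i + 3 <= s.length(); i += 3) {
            String codon = s.substring(i, i + 3);
            if (STOP_CODONS.contains(codon)) {
                return s.substring(start, i + 3);
            }
        }
        return null; // start codon found, but no in-frame stop codon
    }

    public static void main(String[] args) {
        System.out.println(firstCodingRegion("CCATGGGATTCTAACC")); // prints ATGGGATTCTAA
    }
}
```

Anything outside the returned region would count as intergenic DNA under the simplified definition used here.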

In this assignment you read an input file containing named sequences of nucleotides and produce information about them. For each nucleotide sequence, your program counts the occurrences of each of the four nucleotides (A, C, G, and T). The program also computes the mass percentage occupied by each nucleotide type, rounded to one digit past the decimal point. Next the program reports the codons (trios of nucleotides) present in each sequence and predicts whether or not the sequence is a protein-coding gene. For us, a protein-coding gene is a string that matches all of the following constraints*:

• begins with a valid start codon (ATG)

• ends with a valid stop codon (one of the following: TAA, TAG, or TGA)

• contains at least 5 total codons (including its initial start codon and final stop codon)

• Cytosine (C) and Guanine (G) combined account for at l
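The counting, codon-splitting, and first three checks described above could be sketched in Java roughly as follows. This is an illustration, not the reference solution: the class name `DnaSketch` and its method names are assumptions, file reading and the mass-percentage step are omitted because this excerpt does not give per-nucleotide masses, and the fourth constraint is left out because its threshold is cut off in the text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class DnaSketch {
    static final String NUCLEOTIDES = "ACGT"; // index order used for the counts array
    static final List<String> STOP_CODONS = Arrays.asList("TAA", "TAG", "TGA");
    static final int MIN_CODONS = 5;          // includes the start and stop codons

    /** Counts occurrences of A, C, G, T (in that order); other characters are ignored. */
    static int[] countNucleotides(String seq) {
        int[] counts = new int[NUCLEOTIDES.length()];
        for (char c : seq.toUpperCase().toCharArray()) {
            int idx = NUCLEOTIDES.indexOf(c);
            if (idx >= 0) {
                counts[idx]++;
            }
        }
        return counts;
    }

    /** Splits the sequence into trios; leftover characters at the end are dropped. */
    static List<String> codons(String seq) {
        String s = seq.toUpperCase();
        List<String> result = new ArrayList<>();
        for (int i = 0; i + 3 <= s.length(); i += 3) {
            result.add(s.substring(i, i + 3));
        }
        return result;
    }

    /** Checks only the first three constraints from the list above. */
    static boolean passesStructuralChecks(String seq) {
        List<String> c = codons(seq);
        return c.size() >= MIN_CODONS
                && c.get(0).equals("ATG")
                && STOP_CODONS.contains(c.get(c.size() - 1));
    }

    public static void main(String[] args) {
        String seq = "ATGCCACTATGGTAG";
        System.out.println(Arrays.toString(countNucleotides(seq))); // [4, 3, 4, 4]
        System.out.println(codons(seq));                 // [ATG, CCA, CTA, TGG, TAG]
        System.out.println(passesStructuralChecks(seq)); // true
    }
}
```

Keeping the counts in an array indexed by position in `"ACGT"` fits the assignment's focus on arrays and avoids four separate counter variables.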
