Problem:
DNA Sorting
Time Limit: 1000MS | Memory Limit: 10000K
Total Submissions: 24201 | Accepted: 9326
Description
One measure of "unsortedness" in a sequence is the number of pairs of entries that are out of order with respect to each other. For instance, in the letter sequence "DAABEC", this measure is 5, since D is greater than four letters to its right and E is greater than one letter to its right. This measure is called the number of inversions in the sequence. The sequence "AACEDGG" has only one inversion (E and D), so it is nearly sorted, while the sequence "ZWQM" has 6 inversions (it is as unsorted as can be: exactly the reverse of sorted). You are responsible for cataloguing a sequence of DNA strings (sequences containing only the four letters A, C, G, and T). However, you want to catalog them, not in alphabetical order, but rather in order of "sortedness", from "most sorted" to "least sorted". All the strings are of the same length.
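To make the definition concrete, here is a minimal sketch that counts inversions by checking every pair of positions. The function name countInversionsBruteForce is an assumption made for illustration and is not part of the original solution; it simply restates the definition above in code.

```cpp
#include <cstddef>
#include <string>

// Count pairs (i, j) with i < j and s[i] > s[j] by testing every pair.
// This is O(n^2), which is more than fast enough for strings of length <= 50.
int countInversionsBruteForce(const std::string& s)
{
    int inversions = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        for (std::size_t j = i + 1; j < s.size(); ++j)
            if (s[i] > s[j])
                ++inversions;
    return inversions;
}
```

Applied to the three examples in the description, this gives 5 for "DAABEC", 1 for "AACEDGG" and 6 for "ZWQM".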
Input
The first line contains two integers: a positive integer n (0 < n <= 50) giving the length of the strings; and a positive integer m (0 < m <= 100) giving the number of strings. These are followed by m lines, each containing a string of length n.
Output
Output the list of input strings, arranged from "most sorted" to "least sorted". If two strings are equally sorted, output them in their original order.
Sample Input
10 6
AACATGAAGG
TTTTGGCCAA
TTTGGCCAAA
GATCAGATTT
CCCGGGGGGA
ATCGATGCAT
Sample Output
CCCGGGGGGA
AACATGAAGG
GATCAGATTT
ATCGATGCAT
TTTTGGCCAA
TTTGGCCAAA
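The approach is to compute the number of inversions of each string while it is being read, and then sort the strings by inversion count in ascending order. To count the inversions of one string, keep a running count of how many times each of the characters A, C, G and T has appeared so far. Each time a new character is read, add to the inversion total the number of occurrences so far of every character greater than it. When the whole string has been read, the total is its inversion count. The sort is a counting sort: since a string is at most 50 characters long, its inversion count cannot exceed 50*(1+50)/2 = 1275, so an array of 1276 counters is enough.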
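The counting step described above can be sketched on its own as follows. The names rankOf and countInversionsDna are assumptions made for this illustration, and the sketch uses std::string rather than the fixed arrays of the full program; characters outside A, C, G, T are skipped here, which is also an assumption, since the problem guarantees they do not occur.

```cpp
#include <string>

// Hypothetical helper: rank of a DNA letter in alphabetical order, or -1.
int rankOf(char c)
{
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return -1;
    }
}

// Count inversions in one left-to-right pass over the string.
// seen[k] holds how many letters of rank k have appeared so far.
int countInversionsDna(const std::string& s)
{
    int seen[4] = {0, 0, 0, 0};
    int inversions = 0;
    for (char c : s) {
        int r = rankOf(c);
        if (r < 0)
            continue;                 // assumption: ignore non-DNA characters
        // Every earlier letter greater than c forms one inversion with c.
        for (int k = r + 1; k < 4; ++k)
            inversions += seen[k];
        ++seen[r];
    }
    return inversions;
}
```

On the sample input this yields 10 for AACATGAAGG, 36 for TTTTGGCCAA, 37 for TTTGGCCAAA, 11 for GATCAGATTT, 9 for CCCGGGGGGA and 17 for ATCGATGCAT, which matches the order of the sample output.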
The source code is as follows:
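#include <iostream>
#include <cstring>   // memset
using namespace std;

#define MAXN 50
#define MAXM 100
#define MAXSORTEDNESS 1276   // 50*(1+50)/2 + 1 possible inversion counts

struct hElem
{
    int sortVal;   // number of inversions in the string
    int index;     // position of the string in the input
};

static char T[MAXM][MAXN];        // input strings
static hElem H[MAXM];             // inversion count of each string
static int temp[MAXSORTEDNESS];   // counting-sort buckets
static hElem R[MAXM];             // strings in sorted order
static char L[]={'A','C','G','T'};

// Return the rank of c in L, or -1 if c is not one of the four letters.
inline int pos(char c)
{
    for(int i=0;i<(int)(sizeof(L)/sizeof(L[0]));i++)
        if(L[i] == c)
            return i;
    return -1;
}

int main(int argc,char **argv)
{
    int n,m;
    cin>>n>>m;
    int i,j,k;
    int cnt[4];   // occurrences of A, C, G, T so far in the current string
    for(i=0;i<m;i++)
    {
        H[i].sortVal = 0;
        H[i].index = i;
        memset(cnt,0,sizeof(cnt));
        for(j=0;j<n;j++)
        {
            cin>>T[i][j];
            int p = pos(T[i][j]);
            cnt[p]++;
            // every earlier, greater letter forms an inversion with this one
            for(k=p+1;k<(int)(sizeof(L)/sizeof(L[0]));k++)
                H[i].sortVal += cnt[k];
        }
        //cout<<"sortedness of "<<i<<": "<<H[i].sortVal<<endl;
    }

    // Counting sort on sortVal: histogram, then prefix sums.
    for(j = 0; j < m; j++) temp[H[j].sortVal]++;
    for(j = 1; j < MAXSORTEDNESS; j++) temp[j] += temp[j - 1];
    // Place elements from last to first so that equal keys keep input order.
    for(j = m - 1; j >= 0; j--)
    {
        R[temp[H[j].sortVal] - 1] = H[j];
        temp[H[j].sortVal]--;
    }

    for(i=0;i<m;i++)
    {
        for(j=0;j<n;j++)
            cout<<T[R[i].index][j];
        cout<<endl;
    }
    return 0;
}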
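
The counting-sort step of the program can also be looked at in isolation. The following is a minimal sketch, assuming the keys are already computed and lie in the range 0 to maxKey; the function name stableCountingOrder and the use of std::vector are choices made for this illustration, not part of the program above. Filling the result from the last element to the first is what keeps strings with equal inversion counts in their original order, as the problem requires.

```cpp
#include <cstddef>
#include <vector>

// Return the indices 0..keys.size()-1 ordered by ascending key.
// Indices with equal keys keep their input order (the sort is stable).
// Assumes every key satisfies 0 <= key <= maxKey.
std::vector<int> stableCountingOrder(const std::vector<int>& keys, int maxKey)
{
    // count[k] = number of keys equal to k
    std::vector<int> count(maxKey + 1, 0);
    for (std::size_t i = 0; i < keys.size(); ++i)
        ++count[keys[i]];

    // After the prefix sum, count[k] = number of keys <= k,
    // i.e. one past the last output slot for key k.
    for (int k = 1; k <= maxKey; ++k)
        count[k] += count[k - 1];

    // Walk the input backwards so that, among equal keys,
    // the later element lands in the later slot.
    std::vector<int> order(keys.size());
    for (int i = static_cast<int>(keys.size()) - 1; i >= 0; --i)
        order[--count[keys[i]]] = i;
    return order;
}
```

With the inversion counts of the sample input in input order, {10, 36, 37, 11, 9, 17}, and maxKey set to 1275, the returned index order is {4, 0, 3, 5, 1, 2}, which corresponds to the sample output.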