Problem 62: DNA Sorting
Time Limit:1 Ms| Memory Limit:7 MB
Difficulty:1
Description
One measure of ``unsortedness'' in a sequence is the number of pairs of entries that are out of order with respect to each other. For instance, in the letter sequence ``DAABEC'', this measure is 5, since D is greater than four letters to its right and E is greater than one letter to its right. This measure is called the number of inversions in the sequence. The sequence ``AACEDGG'' has only one inversion (E and D)---it is nearly sorted---while the sequence ``ZWQM'' has 6 inversions (it is as unsorted as can be---exactly the reverse of sorted).
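The definition above can be checked directly by testing every pair of positions. The following is only a minimal brute-force sketch; the function name `count_inversions` is an assumption made for illustration and does not appear in the original problem or solution.

```cpp
#include <cstddef>
#include <string>

// Count pairs (i, j) with i < j and s[i] > s[j] by checking every pair.
// Quadratic in the string length, which is acceptable for n <= 50.
// The name count_inversions is hypothetical, chosen for this sketch.
int count_inversions(const std::string& s)
{
    int inversions = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        for (std::size_t j = i + 1; j < s.size(); ++j)
        {
            if (s[i] > s[j])
            {
                ++inversions;
            }
        }
    }
    return inversions;
}
```

Called on the three strings from the description, this returns 5 for "DAABEC", 1 for "AACEDGG", and 6 for "ZWQM".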
You are responsible for cataloguing a sequence of DNA strings (sequences containing only the four letters A, C, G, and T). However, you want to catalog them, not in alphabetical order, but rather in order of ``sortedness'', from ``most sorted'' to ``least sorted''. All the strings are of the same length.
Input
The first line contains two integers: a positive integer n (0 < n <= 50) giving the length of the strings; and a positive integer m (0 < m <= 100) giving the number of strings. These are followed by m lines, each containing a string of length n.
Output
Output the list of input strings, arranged from ``most sorted'' to ``least sorted''. If two strings are equally sorted, output them in their original order.
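The tie-breaking rule amounts to a stable sort keyed on the inversion count. The sketch below shows one possible way to express that with `std::stable_sort`; it assumes a brute-force counter, and the names `count_inversions` and `order_by_sortedness` are hypothetical, introduced only for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Brute-force inversion count (hypothetical helper for this sketch).
int count_inversions(const std::string& s)
{
    int inversions = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        for (std::size_t j = i + 1; j < s.size(); ++j)
            if (s[i] > s[j])
                ++inversions;
    return inversions;
}

// Return the strings ordered from fewest to most inversions.
// std::stable_sort keeps equal keys in their input order, which is
// exactly the tie-breaking rule the problem asks for.
std::vector<std::string> order_by_sortedness(const std::vector<std::string>& input)
{
    std::vector<std::pair<int, std::string>> keyed;
    keyed.reserve(input.size());
    for (const std::string& s : input)
    {
        keyed.emplace_back(count_inversions(s), s);
    }
    std::stable_sort(keyed.begin(), keyed.end(),
                     [](const std::pair<int, std::string>& a,
                        const std::pair<int, std::string>& b)
                     {
                         return a.first < b.first; // compare counts only
                     });
    std::vector<std::string> result;
    result.reserve(keyed.size());
    for (const auto& entry : keyed)
    {
        result.push_back(entry.second);
    }
    return result;
}
```

An unstable sort would need the input position as an explicit secondary key instead; comparing only the count is safe here because the sort is stable.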
Sample Input
10 6
AACATGAAGG
TTTTGGCCAA
TTTGGCCAAA
GATCAGATTT
CCCGGGGGGA
ATCGATGCAT
Sample Output
CCCGGGGGGA
AACATGAAGG
GATCAGATTT
ATCGATGCAT
TTTTGGCCAA
TTTGGCCAAA
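Problem summary: sort the strings (DNA sequences) in ascending order of inversion count; strings with the same inversion count are output in their input order.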
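Approach: this is one application of merge sort (counting inversions). Sort first by inversion count, and by input order when the counts are equal. While counting, the two halves being merged are each already sorted, so whenever an element of the right half is placed ahead of the remaining elements of the left half, the inversion count grows by (mid - i + 1) (see the code).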
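To see where the (mid - i + 1) term comes from, it may help to isolate a single merge step. The sketch below counts only the inversions that cross two halves, assuming each half is already sorted in ascending order; the function name `count_cross_inversions` is hypothetical, and `left.size() - i` plays the role of (mid - i + 1) in the index-based version.

```cpp
#include <cstddef>
#include <string>

// Count pairs (x from left, y from right) with x > y.
// Precondition (assumed, not checked): both halves are sorted ascending.
long long count_cross_inversions(const std::string& left, const std::string& right)
{
    long long count = 0;
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < left.size() && j < right.size())
    {
        if (left[i] <= right[j])
        {
            // left[i] is not greater than right[j] or anything after it,
            // so it forms no further inversions; move on.
            ++i;
        }
        else
        {
            // left[i] > right[j], and left is sorted, so every element
            // from left[i] to the end of left is also greater than right[j].
            count += static_cast<long long>(left.size() - i);
            ++j;
        }
    }
    return count;
}
```

For example, merging the sorted halves "CT" and "AG" yields 3 crossing inversions: A is smaller than both C and T (2 pairs), and G is smaller than T (1 pair).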
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <algorithm>
#define MAX 102
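using std::sort; // import only sort: a blanket using-directive makes the global `data` array ambiguous with std::data in C++17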
typedef struct Elem
{
char tstr[MAX];
int num; // input position (0-based)
int nx; // inversion count
}elem;
elem data[MAX];
char str[MAX], temp[MAX];
int cnt = 0;
bool cmp(elem a, elem b)
{
if(a.nx != b.nx) // sort by inversion count first, then by input order
{
return a.nx < b.nx;
}
else
{
return a.num < b.num;
}
}
void merge(int left, int right) // merge sort over str[left..right] that accumulates the inversion count in cnt
{
if(left >= right)
{
return;
}
int mid = (left + right)/2, i, j, k;
merge(left, mid);
merge(mid+1, right);
i = left, j = mid+1, k = left;
while(i <= mid && j <= right)
{
if(str[i] <= str[j])
{
temp[k++] = str[i++];
}
else
{
cnt += (mid - i + 1); // str[i] > str[j], and the left half is sorted, so str[i..mid] are all greater than str[j]; note the +1
temp[k++] = str[j++];
}
}
while(i <= mid)
{
temp[k++] = str[i++];
}
while(j <= right)
{
temp[k++] = str[j++];
}
for(i = left; i <= right; i++)
{
str[i] = temp[i];
}
}
int main()
{
int len, m, i;
scanf("%d%d", &len, &m);
for(i = 0; i < m; i++)
{
scanf("%s", data[i].tstr); // read the next string into data[i]
strcpy(str, data[i].tstr);
cnt = 0;
merge(0, len-1);
data[i].num = i;
data[i].nx = cnt;
// printf("%d\n", cnt);
}
sort(data, data + m, cmp);
for(i = 0; i < m; i++)
{
printf("%s\n", data[i].tstr); // print the result
}
return 0;
}