UVa 10163 Phylogenetic Trees Inherited (bitmask state compression + greedy)

Problem D: Phylogenetic Trees Inherited

Among other things, Computational Molecular Biology deals with processing genetic sequences. Considering the evolutionary relationship of two sequences, we can say that they are closely related if they do not differ very much. We might represent the relationship by a tree, putting sequences from ancestors above sequences from their descendants. Such trees are called phylogenetic trees. 
Whereas one task of phylogenetics is to infer a tree from given sequences, we'll simplify things a bit and provide a tree structure - this will be a complete binary tree. You'll be given the n leaves of the tree. Sure you know, n is always a power of 2. Each leaf is a sequence of amino acids (designated by the one-character-codes you can see in the figure). All sequences will be of equal length l. Your task is to derive the sequence of a common ancestor with minimal costs.

Amino Acid      3-letter  1-letter
Alanine         Ala       A
Arginine        Arg       R
Asparagine      Asn       N
Aspartic Acid   Asp       D
Cysteine        Cys       C
Glutamine       Gln       Q
Glutamic Acid   Glu       E
Glycine         Gly       G
Histidine       His       H
Isoleucine      Ile       I
Leucine         Leu       L
Lysine          Lys       K
Methionine      Met       M
Phenylalanine   Phe       F
Proline         Pro       P
Serine          Ser       S
Threonine       Thr       T
Tryptophan      Trp       W
Tyrosine        Tyr       Y
Valine          Val       V

The costs are determined as follows: every inner node of the tree is marked with a sequence of length l, the cost of an edge of the tree is the number of positions at which the two sequences at the ends of the edge differ, the total cost is the sum of the costs at all edges. The sequence of a common ancestor of all sequences is then found at the root of the tree. An optimal common ancestor is a common ancestor with minimal total costs.
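
To make the cost definition concrete, here is a minimal sketch that scores a tree whose inner nodes have already been labelled. The function names `edgeCost` and `totalCost` and the heap-order array layout are assumptions made for this illustration; the problem statement does not prescribe any representation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Cost of one edge: the number of positions at which the two
// equal-length sequences at its ends differ.
int edgeCost(const std::string& a, const std::string& b) {
    assert(a.size() == b.size());
    int diff = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i]) ++diff;
    }
    return diff;
}

// Total cost of a fully labelled complete binary tree.
// Assumed layout (hypothetical, chosen for this sketch): heap order,
// so nodes[0] is the root and node k has parent (k - 1) / 2.
int totalCost(const std::vector<std::string>& nodes) {
    int total = 0;
    for (std::size_t k = 1; k < nodes.size(); ++k) {
        total += edgeCost(nodes[(k - 1) / 2], nodes[k]);
    }
    return total;
}
```

For the first sample, labelling the root AGA, its children AAA and AGA, and the leaves AAG, AAA, GGA, AGA gives a total of 3, which matches the sample output.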

Input Specification

The input file contains several test cases. Each test case starts with two integers n and l, denoting the number of sequences at the leaves and their length, respectively. Input is terminated by n=l=0. Otherwise, 1<=n<=1024 and 1<=l<=1000. Then follow n words of length l over the amino acid alphabet. They represent the leaves of a complete binary tree, from left to right.

Output Specification

For each test case, output a line containing some optimal common ancestor and the minimal total costs.

Sample Input

4 3
AAG
AAA
GGA
AGA

4 3
AAG
AGA
AAA
GGA

4 3
AAG
GGA
AAA
AGA

4 1
A
R
A
R

2 1
W
W

2 1
W
Y

1 1
Q

0 0

Sample Output

AGA 3
AGA 4
AGA 4
R 2
W 0
Y 1
Q 0

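Although each string has length L, the positions are independent of one another, so the problem reduces to L single-character instances like the fourth sample, whose costs are simply added up.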

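When the left and right children agree, greedily give the parent the same character. When they differ, the parent has to pick one of the two; this could be solved with DP, but it is enough to record that both are possible and leave the decision for later. I suspect the choice could also be made from frequency counts, but keeping those counts would be more cumbersome. So each position is compressed into one int whose bits mark the characters that can still be chosen. A bitwise AND tells us whether the two children share a candidate: if they do, the parent keeps the intersection; if not, the parent takes their union and the cost goes up by 1. Each node is stored as a vector of L such ints, and the tree is processed with a DFS.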
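
The merge rule above can be sketched for a single column as follows. This is an illustration only: it walks the tree level by level instead of recursively, and the names `letterMask` and `solveColumn` are hypothetical. When several root letters are optimal it returns the alphabetically smallest one, so its answer may differ from the sample output while still being optimal.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One bit per letter: 'A' -> bit 0, ..., 'Y' -> bit 24.
int letterMask(char c) { return 1 << (c - 'A'); }

// Solve one column. `column` holds the leaf characters from left to right;
// its length is assumed to be a power of two.
// Returns {some optimal root letter, minimal cost for this column}.
std::pair<char, int> solveColumn(const std::string& column) {
    assert(!column.empty() && (column.size() & (column.size() - 1)) == 0);
    std::vector<int> level;
    for (char c : column) level.push_back(letterMask(c));

    int cost = 0;
    while (level.size() > 1) {
        std::vector<int> parent;
        for (std::size_t i = 0; i < level.size(); i += 2) {
            int common = level[i] & level[i + 1];
            if (common != 0) {
                parent.push_back(common);                   // shared candidates: no cost
            } else {
                ++cost;                                     // one child edge must differ
                parent.push_back(level[i] | level[i + 1]);  // keep both options open
            }
        }
        level = parent;
    }

    // Any letter left in the root mask is optimal; take the lowest set bit.
    int bit = 0;
    while (((level[0] >> bit) & 1) == 0) ++bit;
    return {static_cast<char>('A' + bit), cost};
}
```

On the fourth sample, `solveColumn("ARAR")` reports cost 2 with root letter A; the sample prints R, and both are optimal.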

#include<iostream>
#include<string>
#include<vector>
using namespace std;

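int n, l;   // number of leaves and sequence length
int cost;   // total cost accumulated for the current test case

// Encode one amino-acid letter as a single bit ('A' -> bit 0, ..., 'Y' -> bit 24).
int Hash(char s){
    return 1<<(s-'A');
}

// dep is the number of leaves below the current node (not a depth).
// Returns, for each of the l positions, a bitmask of the letters this node may take.
// Leaves are read from the input in left-to-right order as the recursion reaches them.
vector<int> dfs(int dep){
    vector<int> ret, tem1, tem2;
    if(dep == 1){
        string s;
        cin >> s;
        for(int i = 0;i < l;i++){
            ret.push_back(Hash(s[i]));
        }
        return ret;
    }
    tem1 = dfs(dep/2);   // left subtree
    tem2 = dfs(dep/2);   // right subtree
    for(int i = 0;i < l;i++){
        int choose = tem1[i]&tem2[i];
        if(choose==0){
            // No common candidate: pay 1 and keep the union for later.
            cost++;
            ret.push_back(tem1[i]|tem2[i]);
        }
        else{
            // Common candidates exist: keep only the intersection.
            ret.push_back(choose);
        }
    }
    return ret;
}

int main(){
    while(cin >> n >> l){
        if(n == 0 && l == 0)
            break;
        cost = 0;
        vector<int> ans = dfs(n);
        for(int i = 0;i < l;i++){
            // Any set bit is an optimal letter; print the lowest one.
            for(int j = 0;j < 30;j++){
                if(ans[i]&1){
                    cout << (char)('A'+j);
                    break;
                }
                ans[i] /= 2;
            }
        }
        cout << ' ' << cost << endl;
    }
    return 0;
}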

