[Jason's ACM Solution Report] Average

Average

A DNA sequence consists of four letters, A, C, G, and T. The GC-ratio of a DNA sequence is the number of Cs and Gs of the sequence divided by the length of the sequence. GC-ratio is important in gene finding because DNA sequences with relatively high GC-ratios might be good candidates for the starting parts of genes. Given a very long DNA sequence, researchers are usually interested in locating a subsequence whose GC-ratio is maximum over all subsequences of the sequence. Since short subsequences with high GC-ratios are sometimes meaningless in gene finding, a length lower bound is given to ensure that a long subsequence with high GC-ratio could be found. If, in a DNA sequence, a 0 is assigned to every A and T and a 1 to every C and G, the DNA sequence is transformed into a binary sequence of the same length. GC-ratios in the DNA sequence are now equivalent to averages in the binary sequence.


For the binary sequence above, if the length lower bound is 7, the maximum average is 6/8 which happens in the subsequence [7,14]. Its length is 8, which is greater than the length lower bound 7. If the length lower bound is 5, then the subsequence [7,11] gives the maximum average 4/5. The length is 5 which is equal to the length lower bound. For the subsequence [7,11], 7 is its starting index and 11 is its ending index.


Given a binary sequence and a length lower bound L, write a program to find a subsequence of the binary sequence whose length is at least L and whose average is maximum over all subsequences of the binary sequence. If two or more subsequences have the maximum average, then find the shortest one; and if two or more shortest subsequences with the maximum average exist, then find the one with the smallest starting index.
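To make the selection rules above concrete, here is a minimal brute-force sketch. It is not the intended solution and is far too slow for n = 100,000; it is only meant as a reference for checking small cases. The function name `best_interval_bruteforce` and its interface are assumptions made for this illustration. Averages are compared by cross-multiplication to avoid floating point.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the original post): brute-force reference.
// Returns the 1-based {start, end} of the interval of length >= L with the
// largest average; ties go to the shorter interval, then to the smaller start.
std::pair<int, int> best_interval_bruteforce(const std::string& bits, int L) {
    const int n = static_cast<int>(bits.size());
    assert(1 <= L && L <= n);

    // prefix[i] = number of 1s among the first i characters
    std::vector<long long> prefix(n + 1, 0);
    for (int i = 1; i <= n; ++i) prefix[i] = prefix[i - 1] + (bits[i - 1] - '0');

    int bestL = 1, bestR = L;
    for (int s = 1; s <= n; ++s) {              // starts in increasing order
        for (int e = s + L - 1; e <= n; ++e) {  // every end giving length >= L
            long long ones = prefix[e] - prefix[s - 1];
            long long len = e - s + 1;
            long long bestOnes = prefix[bestR] - prefix[bestL - 1];
            long long bestLen = bestR - bestL + 1;
            // ones/len > bestOnes/bestLen  <=>  ones*bestLen > bestOnes*len
            long long cmp = ones * bestLen - bestOnes * len;
            // Only strict improvements replace the answer, so among equal
            // candidates the one with the smallest start is kept.
            if (cmp > 0 || (cmp == 0 && len < bestLen)) {
                bestL = s;
                bestR = e;
            }
        }
    }
    return {bestL, bestR};
}
```

Calling `best_interval_bruteforce("00101011011011010", 5)` on the first sample should give the pair (7, 11), matching the sample output.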


Input 
Your program is to read from standard input. The input consists of T test cases. The number of test cases T is given in the first line of the input. Each test case starts with a line containing two integers n (1<=n<=100,000) and L (1<=L<=1,000) which are the length of a binary sequence and a length lower bound, respectively. In the next line, a string, binary sequence, of length n is given.


Output 
Your program is to write to standard output. Print the starting and ending index of the subsequence.


The following shows sample input and output for two test cases.


Sample Input 

2 
17 5 
00101011011011010 
20 4 
11100111100111110000


Sample Output 
7 11 
6 9



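This problem was hard for me. It was my first encounter with a problem of this geometric flavor, and at first I had no idea where to start. The solution abstracts the problem into finding the maximum slope of a line through two points on a graph, a technique I had not seen before.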
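The slope formulation can be written out as follows. This is a sketch in my own notation: the symbols $a_k$, $S_k$ and $P_k$ are assumptions introduced here, with $S_k$ playing the role of the `sum[]` prefix array in the code.

```latex
% a_k is the k-th bit of the sequence (1-based), and S_k its prefix sum.
\[
  S_0 = 0, \qquad S_k = \sum_{m=1}^{k} a_m, \qquad P_k = (k,\, S_k).
\]
% The average of the interval [i, j] is the slope of the segment P_{i-1} P_j:
\[
  \operatorname{avg}(i, j)
    = \frac{a_i + \dots + a_j}{j - i + 1}
    = \frac{S_j - S_{i-1}}{j - (i-1)}.
\]
% So the task is to maximize this slope over all pairs of points that are
% at least L apart horizontally:
\[
  \max_{\,j - (i-1) \,\ge\, L} \; \frac{S_j - S_{i-1}}{j - (i-1)}.
\]
```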

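This is a classic problem that combines numbers with geometry: the maximum-average problem. Similar problems have appeared in both IOI and ACM contests. After reading Liu's explanation I still did not understand it thoroughly, so I dug out Zhou Yuan's paper 《浅谈数形结合思想在信息学竞赛中的应用》 ("A Brief Discussion of Applying the Idea of Combining Numbers and Shapes in Informatics Competitions") and read it carefully, which helped a great deal. I will not embarrass myself by trying to explain the idea behind this problem or how it is implemented; please refer to the paper just mentioned.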





The code is attached below:
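#include<cstdio> 

using namespace std;

#define MAXN (100000+5)

// sum[i]: number of 1s in str[1..i]; p[]: monotone queue of candidate starting indices
int n,L,sum[MAXN],p[MAXN];
char str[MAXN];

// Compares the averages of [x1,x2] and [x3,x4] by cross-multiplication:
// positive if [x1,x2] has the larger average, zero if equal, negative otherwise.
// The products can reach about 1e10, so they are computed in long long.
long long compare_average(int x1,int x2,int x3,int x4){
	return (long long)(sum[x2]-sum[x1-1])*(x4-(x3-1))-(long long)(sum[x4]-sum[x3-1])*(x2-(x1-1));
}

int main(){
	int T;
	scanf("%d",&T);
	while(T--){
		scanf("%d%d%s",&n,&L,str+1);
		sum[0]=0;
		for(int i=1;i<=n;i++)sum[i]=sum[i-1]+str[i]-'0';
		
		int ansL=1,ansR=L;
		int i=0,j=0; // p[i..j-1] holds the candidate starts on the lower convex hull
		for(int t=L;t<=n;t++){
			// add start t-L+1, popping starts that no longer lie on the lower hull
			while(j-i>1&&compare_average(p[j-2],t-L,p[j-1],t-L)>=0)j--;
			p[j++]=t-L+1;
			// advance the head while the next start gives an average at least as large
			while(j-i>1&&compare_average(p[i],t,p[i+1],t)<=0)i++;
			
			long long comp=compare_average(p[i],t,ansL,ansR);
			// prefer a larger average, then a shorter interval; the earlier answer wins remaining ties
			if(comp>0||(comp==0&&t-p[i]<ansR-ansL)){
				ansL=p[i];ansR=t;
			}
		}
		printf("%d %d\n",ansL,ansR);
	}
	return 0;
}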
