It is said that if you give an infinite number of cows an infinite number of heavy-duty laptops (with very large keys), that they will ultimately produce all the world's great palindromes. Your job will be to detect these bovine beauties.
Ignore punctuation, whitespace, numbers, and case when testing for palindromes, but keep these extra characters around so that you can print them out as the answer; just consider the letters `A-Z' and `a-z'.
Find the largest palindrome in a string no more than 20,000 characters long. The largest palindrome is guaranteed to be at most 2,000 characters long before whitespace and punctuation are removed.
PROGRAM NAME: calfflac
INPUT FORMAT
A file with no more than 20,000 characters. The file has one or more lines which, when taken together, represent one long string. No line is longer than 80 characters (not counting the newline at the end).
SAMPLE INPUT (file calfflac.in)
Confucius say: Madam, I'm Adam.
OUTPUT FORMAT
The first line of the output should be the length of the longest palindrome found. The next line or lines should be the actual text of the palindrome (without any surrounding white space or punctuation but with all other characters) printed on a line (or more than one line if newlines are included in the palindromic text). If there are multiple palindromes of longest length, output the one that appears first.
SAMPLE OUTPUT (file calfflac.out)
11
Madam, I'm Adam
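The problem is fairly simple: find the longest palindromic substring of a string, except that the substring may contain noise (non-letter symbols, digits, spaces, newlines and so on). The output is the length of the palindrome (counted after removing the non-letter characters) followed by the palindrome itself (printed with the noise restored).
For longest palindromic substring, my first thought was brute force... but it is tedious, and its time complexity is roughly O(N^3).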
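The cubic estimate comes from trying every (start, end) pair and running a linear palindrome check on each. A minimal sketch, assuming the input has already been reduced to letters only; the class `BruteForcePalindrome` and its method names are hypothetical and exist only for illustration:

```java
// Hypothetical illustration of the O(N^3) brute force; not part of the original solution.
final class BruteForcePalindrome {
    /** Length of the longest palindromic substring, ignoring case. Assumes letters only. */
    static int longest(String letters) {
        String s = letters.toLowerCase();
        int best = 0;
        for (int start = 0; start < s.length(); start++) {       // O(N) start points
            for (int end = start; end < s.length(); end++) {     // O(N) end points
                int len = end - start + 1;
                if (len > best && isPalindrome(s, start, end)) { // O(N) check
                    best = len;
                }
            }
        }
        return best;
    }

    private static boolean isPalindrome(String s, int lo, int hi) {
        while (lo < hi) {
            if (s.charAt(lo++) != s.charAt(hi--)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The sample input with everything except letters removed.
        System.out.println(longest("ConfuciussayMadamImAdam"));
    }
}
```

With N up to 20,000 this is far too slow for the judge, which is why it only serves as a baseline.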
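So I went with DP instead: a modified longest-common-substring computation between the text and its reverse.
Here is the code: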
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class calfflac {
    /**
     * @param args
     * @throws IOException
     * calculate the longest palindrome substring length and output
     * the original subString.
     */
    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader("calfflac.in"));
        FileWriter fout = new FileWriter("calfflac.out");
        int n = br.read();
        // String lines = br.readLine();
        StringBuffer sb = new StringBuffer();
        while (n != -1) {
            sb.append((char) n);
            n = br.read();
        }
        String sF = sb.toString();
        System.out.println(sF.length());
        String sR = sb.reverse().toString();
        sb.reverse();
        long start = System.currentTimeMillis();
        int[] result = dp_lcss(sF, sR);
        long end = System.currentTimeMillis();
        System.out.println(end - start);
        StringBuffer sb2 = new StringBuffer();
        for (int i = 0; i < result[0] && result[1] > 0;) {
            sb2.append(sR.charAt(result[1] - 1));
            if (isAlpha(sR.charAt(result[1] - 1))) {
                i++;
            }
            result[1]--;
        }
        fout.write(result[0] + "\n");
        fout.write(sb2.toString() + "\n");
        fout.flush();
        fout.close();
        br.close();
        System.exit(0);
    }

    // modified dp longest common substring
    public static int[] dp_lcss(String a, String b) {
        long start = System.currentTimeMillis();
        if (a == null || b == null) {
            return new int[] {0, 0};
        }
        int[] section = new int[2];
        a = " " + a;
        b = " " + b;
        int[][] c = new int[a.length()][b.length()];
        long end = System.currentTimeMillis();
        for (int i = 0; i < a.length(); i++) {
            c[i][0] = 0;
        }
        for (int j = 0; j < b.length(); j++) {
            c[0][j] = 0;
        }
        int max = 0;
        int interval_a = 1;
        int interval_b = 1;
        int cal_tmp = 0;
        for (int i = 1; i < a.length(); i++) {
            if (!isAlpha(a.charAt(i))) {
                interval_a++;
                continue;
            }
            for (int j = 1; j < b.length(); j++) {
                if (!isAlpha(b.charAt(j))) {
                    interval_b++;
                    continue;
                }
                cal_tmp = Math.abs(a.charAt(i) - b.charAt(j));
                if (cal_tmp == 0 || cal_tmp == 32) {
                    c[i][j] = c[i - interval_a][j - interval_b] + 1;
                    if (c[i][j] > max) {
                        max = c[i][j];
                        section[0] = max;
                        section[1] = j;
                    }
                } else {
                    c[i][j] = 0;
                }
                interval_b = 1;
            }
            interval_a = 1;
            interval_b = 1;
        }
        System.out.println(end - start);
        return section;
    }

    private static boolean isAlpha(char c) {
        if ((c <= 'z' && c >= 'a') || (c <= 'Z' && c >= 'A')) {
            return true;
        }
        return false;
    }
}
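The intermediate variables interval_a and interval_b are skip distances: whenever a non-letter character is encountered, the lookup into the DP table jumps over it.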
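An alternative to carrying skip distances through the recurrence is to strip the non-letters once and remember where each letter came from, so that the original text can be restored afterwards. The following is a sketch of that different technique, not what the code above does; `LetterFilter`, `origin` and `restore` are hypothetical names introduced here for illustration:

```java
// Hypothetical helper for illustration; not part of the original solution.
final class LetterFilter {
    final char[] letters; // letters only, lower-cased (only the first `count` slots are used)
    final int[] origin;   // origin[k] = index in the raw text of letters[k]
    final int count;      // number of letters kept

    LetterFilter(String raw) {
        letters = new char[raw.length()];
        origin = new int[raw.length()];
        int k = 0;
        for (int idx = 0; idx < raw.length(); idx++) {
            char c = raw.charAt(idx);
            // Same ASCII-only test as isAlpha in the listing above.
            if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
                letters[k] = Character.toLowerCase(c);
                origin[k] = idx;
                k++;
            }
        }
        count = k;
    }

    /** Raw text spanning letters[first..last], with the punctuation in between restored. */
    String restore(String raw, int first, int last) {
        return raw.substring(origin[first], origin[last] + 1);
    }

    public static void main(String[] args) {
        String raw = "Confucius say: Madam, I'm Adam.";
        LetterFilter f = new LetterFilter(raw);
        System.out.println(f.count);
        System.out.println(f.restore(raw, 12, 22));
    }
}
```

Because the restored span starts and ends on a letter, it carries no surrounding whitespace or punctuation, which is what the output format asks for.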
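All the test results were correct, but the CPU time limit was exceeded. One of the provided test cases is a code fragment of nearly 4,000 bytes: locally it takes a little over 200 ms, but after submission it can reach 4 s and is usually around 1.4 s. It looks like only suffix trees or some other method will do; O(N^2) is no longer good enough for it...
I was stuck in a fixed mindset again. I just looked at someone else's approach, and it uses the clumsy method I relied on before I knew DP: comparing outward in both directions from a center. After my interview I will write the program out by hand and see what its complexity is, and also prove that it is correct for the next loop iteration to start from the boundary reached by the previous comparison.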
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class calfflac {
    /**
     * @param args
     * @throws IOException
     * calculate the longest palindrome substring length and output
     * the original subString.
     */
    public static void main(String[] args) throws IOException {
        long start = System.currentTimeMillis();
        BufferedReader br = new BufferedReader(new FileReader("calfflac.in"));
        FileWriter fout = new FileWriter("calfflac.out");
        int n = br.read();
        // String lines = br.readLine();
        StringBuffer sb = new StringBuffer();
        while (n != -1) {
            sb.append((char) n);
            n = br.read();
        }
        // String sF = sb.toString();
        System.out.println(sb.length());
        // String sR = sb.reverse().toString();
        // sb.reverse();
        // int[] result = dp_lcss(sF, sR);
        int result[] = bd_greedy(sb.toString());
        StringBuffer sb2 = new StringBuffer();
        /*for(int i = 0;i<result[0]&&result[1]>0;) {
            sb2.append(sR.charAt(result[1]-1));
            if(isAlpha(sR.charAt(result[1]-1))) {
                i++;
            }
            result[1]--;
        }*/
        System.out.println(result[0] + " " + result[1]);
        int i = result[0] / 2;
        int j = (result[0] + 1) / 2;
        int tmp = result[1];
        while (i >= 0 && tmp > 0) {
            if (isAlpha(sb.charAt(i)))
                tmp--;
            sb2.insert(0, sb.charAt(i));
            i--;
        }
        if ((result[0] & 1) == 0 && isAlpha(sb.charAt(result[0] / 2))) {
            j++;
            tmp = result[1] - 1;
        } else {
            tmp = result[1];
        }
        while (j < sb.length() && tmp > 0) {
            if (isAlpha(sb.charAt(j)))
                tmp--;
            sb2.append(sb.charAt(j));
            j++;
        }
        fout.write(result[1] * 2 - (1 - (result[0] & 1)) + "\n");
        fout.write(sb2.toString() + "\n");
        fout.flush();
        fout.close();
        br.close();
        long end = System.currentTimeMillis();
        System.out.println(end - start);
        System.exit(0);
    }

    private static int[] bd_greedy(String sF) {
        int[] section = new int[2];
        if (sF == null) {
            return section;
        }
        int counter = 0;
        int i = 0;
        int j = 0;
        for (int k = 0; k < sF.length() * 2 - 1 && (j < sF.length()); k++) {
            i = k / 2;
            j = (k + 1) / 2;
            while (i >= 0 && j < sF.length()) {
                while (i >= 0 && !isAlpha(sF.charAt(i)))
                    i--;
                while (j < sF.length() && !isAlpha(sF.charAt(j)))
                    j++;
                if (i == -1 || j == sF.length() || !isEqual(sF, i, j)) {
                    break;
                }
                counter++;
                i--;
                j++;
            }
            if (section[1] < counter) {
                section[0] = k;
                section[1] = counter;
            }
            counter = 0;
        }
        return section;
    }

    private static boolean isEqual(String sF, int i, int j) {
        int cal_tmp = Math.abs(sF.charAt(i) - sF.charAt(j));
        if (cal_tmp == 0 || cal_tmp == 32)
            return true;
        return false;
    }

    // modified dp longest common substring
    public static int[] dp_lcss(String a, String b) {
        long start = System.currentTimeMillis();
        int[] section = new int[2];
        if (a == null || b == null) {
            return section;
        }
        a = " " + a;
        b = " " + b;
        int[][] c = new int[a.length()][b.length()];
        int interval_a = 1;
        int interval_b = 1;
        int cal_tmp = 0;
        for (int i = 1; i < a.length(); i++) {
            if (!isAlpha(a.charAt(i))) {
                interval_a++;
                continue;
            }
            for (int j = 1; j < b.length(); j++) {
                if (!isAlpha(b.charAt(j))) {
                    interval_b++;
                    continue;
                }
                cal_tmp = Math.abs(a.charAt(i) - b.charAt(j));
                if (cal_tmp == 0 || cal_tmp == 32) {
                    c[i][j] = c[i - interval_a][j - interval_b] + 1;
                    if (c[i][j] > section[0]) {
                        section[0] = c[i][j];
                        section[1] = j;
                    }
                }
                interval_b = 1;
            }
            interval_a = 1;
            interval_b = 1;
        }
        long end = System.currentTimeMillis();
        System.out.println(end - start);
        return section;
    }

    private static boolean isAlpha(char c) {
        if ((c <= 'z' && c >= 'a') || (c <= 'Z' && c >= 'A')) {
            return true;
        }
        return false;
    }
}
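i and j record the left and right starting points of each palindrome check. The method returns the center of the palindrome (as the loop index k, from which i = k/2 and j = (k+1)/2 are derived) and its size (as the number of matched pairs, which main converts into a letter count).
Time complexity: O(N^2). The worst case is when all characters are identical, i.e. the longest palindrome is the string itself. Once the palindrome already found is longer than twice the length of the remaining string, the search can stop, which avoids unnecessary work.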
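The early-stop condition can be made concrete on a letters-only string. The sketch below is an illustrative variant rather than the listing above: it assumes the non-letters have already been removed and the case normalized, and the class `CenterExpansion` and its methods are hypothetical names. A palindrome centered at index c cannot reach past the end of the string, so it has at most 2*(n-c)-1 letters; once the best length found reaches that bound, no later center can beat it.

```java
// Hypothetical sketch of center expansion with an early stop; assumes a
// letters-only, single-case input (not the raw text handled by bd_greedy).
final class CenterExpansion {
    /** Returns {start, length} of the first longest palindrome in s. */
    static int[] longest(String s) {
        int n = s.length();
        int bestStart = 0;
        int bestLen = 0;
        for (int c = 0; c < n; c++) {
            // No palindrome centered at c or later can exceed 2*(n-c)-1 letters.
            if (bestLen >= 2 * (n - c) - 1) {
                break;
            }
            int odd = expand(s, c, c);       // center on the letter at c
            if (odd > bestLen) {
                bestLen = odd;
                bestStart = c - odd / 2;
            }
            int even = expand(s, c, c + 1);  // center between c and c+1
            if (even > bestLen) {
                bestLen = even;
                bestStart = c - even / 2 + 1;
            }
        }
        return new int[] {bestStart, bestLen};
    }

    /** Expands outward while the ends match; returns the palindrome length. */
    private static int expand(String s, int lo, int hi) {
        while (lo >= 0 && hi < s.length() && s.charAt(lo) == s.charAt(hi)) {
            lo--;
            hi++;
        }
        return hi - lo - 1;
    }

    public static void main(String[] args) {
        int[] r = longest("confuciussaymadamimadam");
        System.out.println(r[0] + " " + r[1]);
    }
}
```

The strict `>` comparisons keep the earliest palindrome when several share the maximum length, matching the tie-breaking rule in the output format. The early stop does not change the O(N^2) worst case; it only trims work near the end of the string.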