UVa 123 Searching Quickly

Problem:

For example, binary search provides a good example of an easy-to-understand algorithm with sub-linear complexity. Quicksort is an efficient O(n log n) [average case] comparison based sort.

KWIC-indexing is an indexing method that permits efficient “human search” of, for example, a list of titles. Given a list of titles and a list of “words to ignore”, you are to write a program that generates a KWIC (Key Word In Context) index of the titles. In a KWIC-index, a title is listed once for each keyword that occurs in the title. The KWIC-index is alphabetized by keyword. Any word that is not one of the “words to ignore” is a potential keyword.

For example, if words to ignore are “the, of, and, as, a” and the list of titles is:

Descent of Man
The Ascent of Man
The Old Man and The Sea
A Portrait of The Artist As a Young Man

A KWIC-index of these titles might be given by:

a portrait of the ARTIST as a young man
the ASCENT of man
DESCENT of man
descent of MAN
the ascent of MAN
the old MAN and the sea
a portrait of the artist as a young MAN
the OLD man and the sea
a PORTRAIT of the artist as a young man
the old man and the SEA
a portrait of the artist as a YOUNG man

 

The input is a sequence of lines, the string ‘::’ is used to separate the list of words to ignore from the list of titles. Each of the words to ignore appears in lower-case letters on a line by itself and is no more than 10 characters in length. Each title appears on a line by itself and may consist of mixed-case (upper and lower) letters. Words in a title are separated by whitespace. No title contains more than 15 words.

There will be no more than 50 words to ignore, no more than 200 titles, and no more than 10,000 characters in the titles and words to ignore combined. No characters other than ‘a’–‘z’, ‘A’–‘Z’, and white space will appear in the input.

 

The output should be a KWIC-index of the titles, with each title appearing once for each keyword in the title, and with the KWIC-index alphabetized by keyword. If a word appears more than once in a title, each instance is a potential keyword.

The keyword should appear in all upper-case letters. All other words in a title should be in lower-case letters. Titles in the KWIC-index with the same keyword should appear in the same order (i.e. the sort must be stable) as they appeared in the input file. In the case where multiple instances of a word are keywords in the same title, the keywords should be capitalized in left-to-right order.

Case (upper or lower) is irrelevant when determining if a word is to be ignored.

The titles in the KWIC-index need NOT be justified or aligned by keyword, all titles may be listed left-justified.

 




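Approach:

Build a dictionary struct: split each title into words, look each word up in the ignore list, and add it to the dictionary if it is not an ignored word.

Each dictionary entry is a small keyword record that points back to its original title.

Sort the whole keyword array, then output it.

For each keyword, find the title it belongs to, then print the title up to start, the upper-case word, and the rest of the title from end onward.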
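The last step of the approach can be sketched as a small function. This is only an illustration under assumptions: `format_kwic_line` is a hypothetical helper that does not appear in the submitted code, and it takes the keyword as a half-open character span [start, end) inside the title, which is how wStart and wEnd are used above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the original submission): writes `title`
 * into `out` in lower case, except for the half-open span [start, end),
 * which is written in upper case. `out` must hold strlen(title) + 1 bytes. */
void format_kwic_line(const char *title, size_t start, size_t end, char *out)
{
    size_t len = strlen(title);
    for (size_t i = 0; i < len; i++) {
        /* Cast to unsigned char before calling the <ctype.h> functions. */
        unsigned char c = (unsigned char)title[i];
        out[i] = (char)((i >= start && i < end) ? toupper(c) : tolower(c));
    }
    out[len] = '\0';
}
```

For example, calling `format_kwic_line("The Old Man and The Sea", 8, 11, buf)` leaves `the old MAN and the sea` in `buf`.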

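Problems fixed while debugging:

Forgot dicNum++.

temps and word were declared too small.

strncpy does not automatically append '\0' when the source is at least as long as the count.

Forgot the typedef for the struct.

The compiler does not support strlwr (it is not part of standard C).

Stable bubble sort: restart the inner index j from the beginning on every pass and compare element j with element j+1. (The version that started j at i was not stable.)

The last title in the input has no trailing newline, which broke keyword storage.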
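Two of the issues above, the missing '\0' after strncpy and the missing newline on the last title, can be isolated in small helpers. The sketch below is an assumption about how one might package them; `copy_span` and `ensure_newline` are hypothetical names and are not used in the submitted code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the half-open span [start, end) of `src` into
 * `dst` and always terminates it. strncpy only pads with '\0' when the source
 * is shorter than n, so the terminator is written explicitly.
 * Assumes dst_size > 0 and start <= end <= strlen(src). */
void copy_span(char *dst, size_t dst_size, const char *src,
               size_t start, size_t end)
{
    size_t n = end - start;
    if (n >= dst_size)
        n = dst_size - 1;          /* truncate rather than overflow */
    strncpy(dst, src + start, n);
    dst[n] = '\0';
}

/* Hypothetical helper: makes sure `line` ends with '\n', since the last
 * input line may lack one. `cap` is the buffer capacity. Returns the
 * resulting length of the string. */
size_t ensure_newline(char *line, size_t cap)
{
    size_t len = strlen(line);
    if ((len == 0 || line[len - 1] != '\n') && len + 1 < cap) {
        line[len++] = '\n';
        line[len] = '\0';          /* re-terminate after appending */
    }
    return len;
}
```

With these, a word would be extracted with `copy_span(word, sizeof word, title, wStart, wEnd)` without having to memset the destination first.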

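Possible improvements:

Use qsort (quicksort) instead of bubble sort. Note that qsort is not guaranteed to be stable, so the comparator would need a tie-breaker to keep equal keywords in input order.

Header: <stdlib.h>

qsort is declared as follows:

void qsort(void *base, size_t nmemb, size_t size, int (*compar)(const void *, const void *));

Parameters:

base: the array to sort

nmemb: the number of elements in the array

size: the size in bytes of each array element, which can be obtained with the sizeof operator

compar: a pointer to a function. The function compares two array elements and returns a positive value, zero, or a negative value when the first argument is greater than, equal to, or less than the second.

It is slightly more convenient, but you have to write compar yourself, which is a bit hard to remember.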

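/* Works when each element is a char array, or a struct whose first member is one. */
int cmp( const void *_a , const void *_b )
{
    const char *a = (const char *)_a;
    const char *b = (const char *)_b;
    return strcmp( a , b );
}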
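Because the problem requires titles with the same keyword to keep their input order, a plain strcmp comparator is not enough with qsort, which gives no stability guarantee. A minimal sketch of one way around this, assuming each entry records the index at which it was inserted: `struct entry`, its `order` field, and `cmp_entry` are hypothetical and only modeled on the dictionary struct of the submission.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type, modeled on the post's dictionary struct, with an
 * added `order` field holding the insertion index. */
struct entry {
    char word[15];
    int order;
};

/* Compares by keyword first; equal keywords fall back to insertion order,
 * so the sorted result matches what a stable sort would produce. */
int cmp_entry(const void *pa, const void *pb)
{
    const struct entry *a = (const struct entry *)pa;
    const struct entry *b = (const struct entry *)pb;
    int c = strcmp(a->word, b->word);
    if (c != 0)
        return c;
    /* Branch-free three-way comparison of the two indices. */
    return (a->order > b->order) - (a->order < b->order);
}
```

The call would then look like `qsort(entries, count, sizeof entries[0], cmp_entry);`. In the submission's layout the pair (titleNum, wStart) could serve as the tie-breaker instead of a separate field, since entries are appended in input order.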

Accepted (AC) submission:

#include<stdio.h>
#include<string.h>
#include<ctype.h>
//#define LOCAL
#define MAXN 3001
struct dictionary{
	char word[15];
	int titleNum;
	int wStart;
	int wEnd;
}dic[MAXN];
typedef struct dictionary di;

void swap(int l,int r){
	di tempdic;
	tempdic=dic[l];
	dic[l]=dic[r];
	dic[r]=tempdic;
}
char* strupr(char * str)
{
	char * orign=str;
	//process the string
	for ( ; *str != '\0'; str++ )
	*str = toupper(*str);
	return orign;//return a pointer to the start of str
}
char* strlwr(char * str)
{
	char * orign=str;
	//process the string
	for ( ; *str != '\0'; str++ )
	*str = tolower(*str);
	return orign;//return a pointer to the start of str
}
int main(){
#ifdef LOCAL
	freopen("data.in","r",stdin);
	freopen("data.out","w",stdout);
#endif
    char ignore[50+1][10+2] ;
    char title[200+2][15*10+2];
    char temps[12*15];
    int ignoreNum=0,titleNum=0,dicNum=0,wStart=0,wEnd=0,i,j;//wStart is the index of the current word's first letter and must be reset for each title
    int titleLen;
	while(scanf("%179s",temps)==1&&strcmp(temps,"::")!=0){//stop on EOF too: scanf returns EOF (non-zero) at end of input
		strcpy(ignore[ignoreNum++],temps);
//		memset(temps,0,sizeof(temps));	
	} 
	getchar();
	while(fgets(temps,sizeof(temps),stdin)!=NULL){
		titleLen=strlen(temps);
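		if(temps[titleLen-1]!='\n'&&titleLen+1<(int)sizeof(temps)){
			temps[titleLen++]='\n';//the last title may lack a trailing newline
			temps[titleLen]='\0';//re-terminate after appending it
		}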
		strcpy(title[titleNum],strlwr(temps));
		
		for(i=0;i<titleLen;i++){
			if(isalpha(title[titleNum][i])) {
				wEnd++;//one past the index of the current letter

			}else{
				memset(temps,0,sizeof(temps));	
				strncpy(temps,title[titleNum]+wStart,wEnd-wStart);
				//look the word up in the ignore list
				for(j=0;j<ignoreNum;j++){
			        if(strcmp(temps,ignore[j])==0)break;
				}
				
				if(wEnd>wStart&&j>=ignoreNum){//skip empty words produced by consecutive whitespace
					strcpy(dic[dicNum].word,strupr(temps));
					dic[dicNum].titleNum=titleNum;
					dic[dicNum].wStart=wStart;
					dic[dicNum].wEnd=wEnd;
					dicNum++;
				}
				wStart=i+1;
				wEnd=i+1;	
			}

		}
		titleNum++;
		wStart=0;
		wEnd=0;
		memset(temps,0,sizeof(temps));	
	}
	for(i=0;i<dicNum;i++){
		for(j=0;j<dicNum-1;j++) {
			if(strcmp(dic[j].word,dic[j+1].word)>0){
				swap(j,j+1);
			}
		}
	}

	for(i=0;i<dicNum;i++) {
		strcpy(temps,title[dic[i].titleNum]);
		titleLen=strlen(temps);
		for(j=0;j<dic[i].wStart;j++){
			printf("%c",temps[j]);
		}
		printf("%s",dic[i].word);
		for(j=dic[i].wEnd;j<titleLen;j++){
			printf("%c",temps[j]);
		} //stored titles already carry their own newline
//		if(temps[j-1]!='\n') {
//			printf("\n");
//		}
	}
	return 0;
}



