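Author: zhanhailiang  Date: 2012-12-14
Look up the full phrase that an acronym stands for in a formatted text file. For example, given "Basic", return its expansion "Beginner's All-Purpose Symbolic Instruction Code".
The acronym list below can be treated as a simple database. Each line holds an acronym and its expansion, separated by a tab: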
zhanhailiang@linux-06bq:~> cat acronyms
BASIC	Beginner's All-Purpose Symbolic Instruction Code
CICS	Customer Information Control System
COBOL	Common Business Oriented Language
DBMS	Data Base Management System
GIGO	Garbage In, Garbage Out
GIRL	Generalized Information Retrieval Language
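Because tabs are invisible, a file that looks right can still fail to split into two fields. The following is a minimal sketch, not part of the original scripts, that builds a small tab-separated file and reports any line without exactly two fields. The function names make_acronyms and check_acronyms and the demo file path are hypothetical, chosen only for this example.

```shell
#!/bin/sh
# Sketch: create a tab-separated acronyms file and verify its layout.
# make_acronyms, check_acronyms and the demo path are hypothetical names.

make_acronyms() {
    # printf reuses the format for each pair of arguments; \t emits a real tab.
    printf '%s\t%s\n' \
        BASIC "Beginner's All-Purpose Symbolic Instruction Code" \
        CICS  "Customer Information Control System" \
        GIGO  "Garbage In, Garbage Out" > "$1"
}

check_acronyms() {
    # Print the line number of every record that does not have two fields.
    awk 'BEGIN { FS = "\t" } NF != 2 { print NR }' "$1"
}

demo_file="${TMPDIR:-/tmp}/acronyms.demo.$$"
make_acronyms "$demo_file"
check_acronyms "$demo_file"   # prints nothing when every line is well formed
rm -f "$demo_file"
```

If check_acronyms prints line numbers, those lines most likely use spaces where a tab was intended.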
Write the acro script. It takes the first command-line argument (the acronym to look up) and passes it to the awk program, as follows:
zhanhailiang@linux-06bq:~> cat acro
#!/bin/sh
awk 'BEGIN {FS="\t";}; tolower($1) == tolower(search) {print $2;}' search="$1" acronyms
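The first argument on the shell command line ($1) is assigned to the variable search, which is passed to the awk program as a parameter. The following shows how to use the script to find a specific acronym in the list (the match is case-insensitive).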
zhanhailiang@linux-06bq:~> ./acro Gigo
Garbage In, Garbage Out
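One detail of this parameter passing is easy to miss: an assignment written as an operand (search=value before the file name) takes effect only when awk reaches it in the argument list, which is after the BEGIN block has run. The -v option assigns the variable before BEGIN. The sketch below illustrates the difference; operand_assign and v_assign are hypothetical helper names, and /dev/null stands in for a data file.

```shell
#!/bin/sh
# Sketch: when does awk see a variable set on the command line?
# operand_assign and v_assign are hypothetical names for this example.

operand_assign() {
    # Assignment as an operand: processed after BEGIN, before the file is read.
    awk 'BEGIN { printf "BEGIN=[%s] ", search }
         END   { printf "END=[%s]\n", search }' search="$1" /dev/null
}

v_assign() {
    # Assignment with -v: processed before BEGIN runs.
    awk -v search="$1" 'BEGIN { printf "BEGIN=[%s] ", search }
                        END   { printf "END=[%s]\n", search }' /dev/null
}

operand_assign Gigo   # BEGIN=[] END=[Gigo]
v_assign Gigo         # BEGIN=[Gigo] END=[Gigo]
```

The acro script above is unaffected because it reads search only in the main rule, but any script that uses the variable inside BEGIN needs -v.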
The same lookup implemented with an associative array:
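#!/bin/sh
# Previous version, scanning line by line:
#awk 'BEGIN {FS="\t";}; tolower($1) == tolower(search) {print $2;}' search=$1 acronyms
# -v sets search before BEGIN runs, so tolower(search) in BEGIN sees its value.
awk -v search="$1" 'BEGIN { FS="\t"; search = tolower(search); };
{ array[tolower($1)] = $2; }
END { if(search in array) print array[search]; }' acronyms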
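The in operator matters here: it tests whether a key exists without touching the array, whereas merely referencing array[key] creates an empty element for that key. A small sketch of the difference follows, using one inline record instead of the acronyms file; count_after_lookup is a hypothetical name, and it prints the test result followed by the number of elements in the array afterwards.

```shell
#!/bin/sh
# Sketch: "key in array" versus referencing array[key] directly.
# count_after_lookup is a hypothetical name; $1 selects the test style.

count_after_lookup() {
    printf 'GIGO\tGarbage In, Garbage Out\n' | awk -v mode="$1" '
    BEGIN { FS = "\t" }
    { array[tolower($1)] = $2 }
    END {
        if (mode == "in") {
            found = ("basic" in array)      # membership test only
        } else {
            found = (array["basic"] != "")  # reference creates array["basic"]
        }
        n = 0
        for (key in array) n++
        print found, n
    }'
}

count_after_lookup in    # 0 1
count_after_lookup ref   # 0 2
```

Both tests report that "basic" is missing, but the direct reference leaves a second, empty element behind, which can distort later loops over the array.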
Next, a more elaborate lookup method that is also more interactive:
zhanhailiang@linux-06bq:~> cat glossary
BASIC	Beginner's All-Purpose Symbolic Instruction Code
CICS	Customer Information Control System
COBOL	Common Business Oriented Language
DBMS	Data Base Management System
GIGO	Garbage In, Garbage Out
GIRL	Generalized Information Retrieval Language
zhanhailiang@linux-06bq:~> cat lookup
awk '
BEGIN {
    FS = "\t"; OFS = "\t";
    printf("Enter a glossary term: ");
};
FILENAME == "glossary" {
    entry[tolower($1)] = $2;
    next;
};
tolower($0) ~ /^(quit|q|exit|x)$/ { exit; };
$0 != "" {
    if(tolower($0) in entry) {
        print entry[tolower($0)];
    } else {
        print $0 " not found.";
    }
};
{ printf("Enter another glossary term (q to quit): "); };
' glossary - # read terms from standard input and look each one up in glossary
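Since the script reads its terms from standard input, it can also be driven from a pipe rather than the keyboard. The sketch below is a variant under two assumptions: the prompts are dropped because nobody is typing, and the common NR == FNR idiom replaces the FILENAME test so the glossary path is not hard-coded (the idiom misbehaves if the glossary file is empty). lookup_terms and the demo file path are hypothetical names.

```shell
#!/bin/sh
# Sketch: non-interactive variant of lookup. Terms arrive on stdin.
# lookup_terms is a hypothetical name; $1 is the glossary file.

lookup_terms() {
    awk '
    BEGIN { FS = "\t" }
    NR == FNR { entry[tolower($1)] = $2; next }   # first file: load glossary
    tolower($0) ~ /^(quit|q|exit|x)$/ { exit }
    $0 != "" {
        key = tolower($0)
        if (key in entry) {
            print entry[key]
        } else {
            print $0 " not found."
        }
    }' "$1" -
}

glossary_demo="${TMPDIR:-/tmp}/glossary.demo.$$"
printf '%s\t%s\n' GIGO "Garbage In, Garbage Out" \
                  DBMS "Data Base Management System" > "$glossary_demo"
# "q" stops processing, so the final "dbms" is never looked up.
printf 'gigo\nfoo\nq\ndbms\n' | lookup_terms "$glossary_demo"
rm -f "$glossary_demo"
```

For the sample input this prints "Garbage In, Garbage Out" followed by "foo not found." and then exits at the q line.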