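The shell section of LeetCode problems (4 shell problems in total as of this writing) mainly tests how fluent you are with text-processing commands and how to apply them in practice, and also touches on some regular expression basics.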
1.Tenth Line
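Description: How would you print just the 10th line of a file? For example, assume that file.txt has the following content:
Line 1
Line 2
Line 3
Line 4
Line 5
Line 6
Line 7
Line 8
Line 9
Line 10
Your script should output the tenth line, which is:
Line 10
Analysis: For this problem, first consider the case where the file has fewer than 10 lines, and only then how to print the line when the file has at least 10 lines. It mainly tests how to get the number of lines in a file and how to print the content of a given line.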
Solution:
[ $(wc -l < file.txt) -ge 10 ] && sed -n '10p' file.txt
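As a side note, the explicit line count is not strictly required, because awk and sed simply print nothing when line 10 does not exist. Here is a minimal sketch of that idea; the helper name tenth_line and the generated sample files are my own placeholders, not part of the problem.

```shell
#!/usr/bin/env bash
# tenth_line is a hypothetical helper name used only for this illustration.
# It prints the 10th line of the given file, or nothing if the file is shorter.
tenth_line() {
    awk 'NR == 10 { print; exit }' "$1"
}

# Build a 12-line sample file and a 5-line one to exercise both cases.
tmpdir=$(mktemp -d)
seq 1 12 | sed 's/^/Line /' > "$tmpdir/long.txt"
seq 1 5  | sed 's/^/Line /' > "$tmpdir/short.txt"

tenth_line "$tmpdir/long.txt"    # prints: Line 10
tenth_line "$tmpdir/short.txt"   # prints nothing
```

The exit after the print stops awk from reading the rest of the file, which only matters for large inputs.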
2.Valid Phone Numbers
Description: Given a text file file.txt that contains list of phone numbers (one per line), write a one liner bash script to print all valid phone numbers.
You may assume that a valid phone number must appear in one of the following two formats: (xxx) xxx-xxxx or xxx-xxx-xxxx. (x means a digit)
You may also assume each line in the text file must not contain leading or trailing white spaces.
For example, assume that file.txt has the following content:
987-123-4567
123 456 7890
(123) 456-7890
Your script should output the following valid phone numbers:
987-123-4567
(123) 456-7890
Analysis: This problem tests regular expressions: print only the lines that match the pattern. grep is the natural tool for that, but sed or awk work just as well.
Solution:
sed -nr '/^(\([0-9]{3}\) |[0-9]{3}-)[0-9]{3}-[0-9]{4}$/p' file.txt
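Since the analysis mentions grep, the same extended regular expression can be handed to grep -E directly. This is only a sketch; the function name valid_phones and the temporary sample file are assumptions I made for the demo.

```shell
#!/usr/bin/env bash
# valid_phones is a hypothetical helper name used only for this illustration.
# It keeps lines shaped like "(xxx) xxx-xxxx" or "xxx-xxx-xxxx".
valid_phones() {
    grep -E '^(\([0-9]{3}\) |[0-9]{3}-)[0-9]{3}-[0-9]{4}$' "$1"
}

# Recreate the sample input from the problem statement.
tmpdir=$(mktemp -d)
printf '%s\n' '987-123-4567' '123 456 7890' '(123) 456-7890' > "$tmpdir/file.txt"

valid_phones "$tmpdir/file.txt"
# prints:
# 987-123-4567
# (123) 456-7890
```

In an extended regular expression a single backslash before a parenthesis makes it a literal character, while the unescaped parentheses group the two alternative prefixes.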
3.Transpose File
Description: Given a text file file.txt, transpose its content.
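You may assume that each row has the same number of columns and each field is separated by the ' ' character.
For example, if file.txt has the following content:
name age
alice 21
ryan 30
Output the following:
name alice ryan
age 21 30
Analysis: The problem statement is clear. If we model the input as a matrix, the task is simply to compute its transpose. For this kind of requirement I chose awk. awk is very capable at text processing: it handles fields natively, and it also lets you program the text manipulation yourself with built-in variables, built-in functions, and user-defined functions.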
Solution:
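awk -F' ' '
{
    # Store every field, keyed by (row, column).
    for (i = 1; i <= NF; i++) {
        array[NR, i] = $i
    }
}
END {
    # Each input column becomes one output row.
    for (i = 1; i <= NF; i++) {
        for (j = 1; j <= NR; j++) {
            printf("%s", array[j, i])
            if (j < NR) { printf(" ") }
        }
        printf("\n")
    }
}' file.txt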
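One possible variation, sketched under my own assumptions rather than taken from the solution above, is to build each output row as a string while reading, so no two-dimensional array is needed. The names transpose, row, and ncols are hypothetical.

```shell
#!/usr/bin/env bash
# transpose is a hypothetical helper name used only for this illustration.
# row[i] accumulates the i-th column of the input as one space-joined string.
transpose() {
    awk '
    {
        ncols = NF   # remember the column count for the END block
        for (i = 1; i <= NF; i++) {
            row[i] = (NR == 1) ? $i : (row[i] " " $i)
        }
    }
    END {
        for (i = 1; i <= ncols; i++) {
            print row[i]
        }
    }' "$1"
}

# Recreate the sample input from the problem statement.
tmpdir=$(mktemp -d)
printf '%s\n' 'name age' 'alice 21' 'ryan 30' > "$tmpdir/file.txt"

transpose "$tmpdir/file.txt"
# prints:
# name alice ryan
# age 21 30
```

Both versions hold the whole file in memory; this one just trades the nested loop in END for string concatenation during the read.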
4.Word Frequency
Description: Write a bash script to calculate the frequency of each word in a text file words.txt.
For simplicity sake, you may assume:
words.txt contains only lowercase characters and space ' ' characters.
Each word must consist of lowercase characters only.
Words are separated by one or more whitespace characters.
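For example, assume that words.txt has the following content:
the day is sunny the the
the sunny is is
Your script should output the following, sorted by descending frequency:
the 4
is 3
sunny 2
day 1
Analysis: The natural approach is to split the sentences into individual words first and then count them. So we start with awk to split the text into fields, one word per line. The output is then sorted so that uniq can count each word across all lines (uniq only merges adjacent identical lines, so without the sort a word is counted separately every time it reappears). Next, sort is applied once more to order the result by frequency, descending. Finally, awk reformats the output to match the format the problem requires.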
Solution:
awk -F' ' '{for(i=1;i<=NF;i=i+1){print $i}}' words.txt |sort|uniq -c|sort -nr|awk -F' ' '{printf("%s %s\n",$2,$1)}'
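The role of the first sort is easy to see by running uniq -c with and without it: uniq only collapses adjacent identical lines. The sketch below assumes the sample input from the problem; split_words and word_freq are hypothetical helper names of mine.

```shell
#!/usr/bin/env bash
# Recreate the sample input from the problem statement.
tmpdir=$(mktemp -d)
printf '%s\n' 'the day is sunny the the' 'the sunny is is' > "$tmpdir/words.txt"

# split_words (hypothetical name): emit one word per line.
split_words() {
    awk '{ for (i = 1; i <= NF; i++) print $i }' "$1"
}

# Without sort, uniq -c only groups runs of adjacent equal lines,
# so "the" shows up twice (counts 1 and 3) and the output has 7 groups.
split_words "$tmpdir/words.txt" | uniq -c

# word_freq (hypothetical name): sort first so equal words are adjacent,
# count them, order by count descending, then swap the columns.
word_freq() {
    split_words "$1" | sort | uniq -c | sort -nr | awk '{ print $2, $1 }'
}

word_freq "$tmpdir/words.txt"
# prints:
# the 4
# is 3
# sunny 2
# day 1
```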
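5.Summary
These problems mainly test the use of Linux text-processing commands and the basics of regular expressions. The solutions I have given are not necessarily optimal and are for reference only. If you have a better one, feel free to leave a comment or message me privately. :)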