Find files

As we have wandered around our Linux system, one thing has become abundantly clear: a typical Linux system has a lot of files! This begs the question, "How do we find things?" We already know that the Linux file system is well organized according to conventions that have been passed down from one generation of Unix-like systems to the next, but the sheer number of files can present a daunting problem. In this chapter, we will look at two tools that are used to find files on a system. These tools are:

locate - Find files by name

find - Search for files in a directory hierarchy

We will also look at a command that is often used with file search commands to process the resulting list of files:

xargs - Build and execute command lines from standard input

In addition, we will introduce a couple of commands to assist us in our exploration:

touch - Change file times

stat - Display file or file system status

 

locate - Simple way to find files

The locate program performs a rapid database search of pathnames and outputs every name that matches a given substring. Say, for example, we want to find all the programs with names that begin with "zip". Since we are looking for programs, we can assume that the name of the directory containing the programs would end with "bin/". Therefore, we could try giving locate the string "bin/zip" to find our files.

locate will search its database of pathnames and output any that contain the string "bin/zip".

If the search requirement is not so simple, locate can be combined with other tools such as grep to design more interesting searches:
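The two searches described above might look like the following sketch. It assumes that some locate implementation is installed and that its database has already been built, which is why the commands sit behind a guard. The output depends entirely on the files present on the system.

```shell
# Guarded because locate and its database are assumptions about the host system.
if command -v locate >/dev/null 2>&1; then
    # Every pathname in the database that contains the string "bin/zip".
    locate bin/zip || echo "no match (or the database has not been built yet)"
    # Pathnames containing "zip", narrowed by grep to those that also contain "bin".
    locate zip | grep bin || echo "no pathname contains both zip and bin"
fi
```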

The locate program has been around for a number of years, and there are several different variants in common use. The two most common ones found in modern Linux distributions are slocate and mlocate, though they are usually accessed by a symbolic link named locate. The different versions of locate have overlapping option sets. Some versions include regular expression matching (which we'll cover in an upcoming chapter) and wild card support. Check the man page for locate to determine which version of locate is installed.

Where Does The locate Database Come From?

You may notice that, on some distributions, locate fails to work just after the system is installed, but if you try again the next day, it works fine. What gives? The locate database is created by another program named updatedb. Usually, it is run periodically as a cron job; that is, a task performed at regular intervals by the cron daemon. Most systems equipped with locate run updatedb once a day. Since the database is not updated continuously, you will notice that very recent files do not show up when using locate. To overcome this, it's possible to run the updatedb program manually by becoming the superuser and running updatedb at the prompt.

 

find - Complex ways to find files

While the locate program can find a file based solely on its name, the find program searches a given directory (and its subdirectories) for files based on a variety of attributes. We're going to spend a lot of time with find because it has a lot of interesting features that we will see again and again when we start to cover programming concepts in later chapters.

In its simplest use, find is given one or more names of directories to search. For example, to produce a list of our home directory, we give find the single argument ~.

On most active user accounts, this will produce a large list. Since the list is sent to standard output, we can pipe the list into other programs. Let's use wc to count the number of files:
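A minimal sketch of both commands follows. On a real system the starting point would be ~. Here a small scratch directory (an assumption, created with mktemp so the result is reproducible) stands in for the home directory.

```shell
# Scratch tree standing in for the home directory (hypothetical contents).
top=$(mktemp -d)
mkdir -p "$top/docs" "$top/pics"
touch "$top/docs/notes.txt" "$top/pics/cat.jpg"

# List the starting directory and everything beneath it.
find "$top"

# Pipe the same list into wc to count the entries.
count=$(find "$top" | wc -l)
echo "$count"
```

The count includes the starting directory itself as well as every subdirectory and file below it.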

 

Wow, we've been busy! The beauty of find is that it can be used to identify files that meet specific criteria. It does this through the (slightly strange) application of options, tests, and actions. We'll look at the tests first.

Tests

Let's say that we want a list of directories from our search. To do this, we could add the test -type d to the command.

Adding the test -type d limits the search to directories. Conversely, we could have limited the search to regular files with the test -type f.
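Both tests can be sketched as follows, again against a scratch directory rather than ~ (an assumption made so that the counts are predictable).

```shell
# Scratch tree with two nested directories and three regular files.
top=$(mktemp -d)
mkdir -p "$top/a/b"
touch "$top/one.txt" "$top/a/two.txt" "$top/a/b/three.txt"

# Directories only: the starting directory, a, and a/b.
dirs=$(find "$top" -type d | wc -l)

# Regular files only.
files=$(find "$top" -type f | wc -l)

echo "directories: $dirs, regular files: $files"
```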

Here are the common file type tests supported by find:

Table 18-1: find File Types

File Type    Description
b            Block special device file
c            Character special device file
d            Directory
f            Regular file
l            Symbolic link

We can also search by file size and filename by adding some additional tests. Let's look for all the regular files that match the wild card pattern "*.JPG" and are larger than one megabyte:
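A sketch of such a search, run against a scratch directory with made-up file names instead of ~. It assumes a find that accepts the M size suffix, as GNU find does.

```shell
top=$(mktemp -d)
# One 2 MiB .JPG, one tiny .JPG, and one large file with another extension.
head -c 2097152 /dev/zero > "$top/big.JPG"
head -c 100 /dev/zero > "$top/small.JPG"
head -c 2097152 /dev/zero > "$top/big.txt"

# The pattern is quoted so the shell passes it to find unexpanded.
matches=$(find "$top" -type f -name "*.JPG" -size +1M)
echo "$matches"
```

Only big.JPG satisfies all three tests: it is a regular file, its name matches the pattern, and it is larger than one megabyte.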

 

In this example, we add the -name test followed by the wild card pattern. Notice how we enclose it in quotes to prevent pathname expansion by the shell. Next, we add the -size test followed by the string "+1M". The leading plus sign indicates that we are looking for files larger than the specified number. A leading minus sign would change the meaning of the string to be smaller than the specified number. No sign means "match the value exactly." The trailing letter "M" indicates that the unit of measurement is megabytes. The following characters may be used to specify units:

 

Table 18-2: find Size Units

Character    Unit
b            512-byte blocks. This is the default if no unit is specified.
c            Bytes
w            Two-byte words
k            Kilobytes (units of 1024 bytes)
M            Megabytes (units of 1048576 bytes)
G            Gigabytes (units of 1073741824 bytes)

find supports a large number of different tests. Below is a rundown of the common ones. Note that in cases where a numeric argument is required, the same "+" and "-" notation discussed above can be applied:

Table 18-3: find Tests

Test              Description
-cmin n           Match files or directories whose content or attributes were last modified exactly n minutes ago. To specify less than n minutes ago, use -n, and to specify more than n minutes ago, use +n.
-cnewer file      Match files or directories whose contents or attributes were last modified more recently than those of file.
-ctime n          Match files or directories whose contents or attributes were last modified n*24 hours ago.
-empty            Match empty files and directories.
-group name       Match files or directories belonging to group name. The group may be expressed as either a group name or a numeric group ID.
-iname pattern    Like the -name test but case insensitive.
-inum n           Match files with inode number n. This is helpful for finding all the hard links to a particular inode.
-mmin n           Match files or directories whose contents were modified n minutes ago.
-mtime n          Match files or directories whose contents were modified n*24 hours ago.
-name pattern     Match files and directories with the specified wild card pattern.
-newer file       Match files and directories whose contents were modified more recently than the specified file. This is very useful when writing shell scripts that perform file backups. Each time you make a backup, update a file (such as a log), then use find to determine which files have changed since the last update.
-nouser           Match files and directories that do not belong to a valid user. This can be used to find files belonging to deleted accounts or to detect activity by attackers.
-nogroup          Match files and directories that do not belong to a valid group.
-perm mode        Match files or directories that have permissions set to the specified mode. The mode may be expressed in either octal or symbolic notation.
-samefile name    Similar to the -inum test. Matches files that share the same inode number as file name.
-size n           Match files of size n.
-type c           Match files of type c.
-user name        Match files or directories belonging to user name. The user may be expressed by a user name or by a numeric user ID.

This is not a complete list. The find man page has all the details.

 

Operators

Even with all the tests that find provides, we may still need a better way to describe the logical relationships between the tests. For example, what if we needed to determine if all the files and subdirectories in a directory had secure permissions? We would look for all the files with permissions that are not 0600 and the directories with permissions that are not 0700. Fortunately, find provides a way to combine tests using logical operators to create more complex logical relationships. To express the aforementioned test, we could do this:
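A sketch of that command, applied to a scratch directory rather than ~. The file and directory names are hypothetical, and umask 022 is set only so that newly created items get the usual 0644 and 0755 modes. The individual pieces of the expression are taken apart in the rest of this section.

```shell
umask 022
top=$(mktemp -d)             # mktemp creates the directory with mode 0700
mkdir "$top/open-dir"        # 0755 under this umask
touch "$top/private.txt" "$top/public.txt"
chmod 0600 "$top/private.txt"
chmod 0644 "$top/public.txt"

# Files that are not 0600, or directories that are not 0700.
bad=$(find "$top" \( -type f -not -perm 0600 \) -or \( -type d -not -perm 0700 \) | sort)
echo "$bad"
```

Only open-dir and public.txt are reported; the scratch directory itself and private.txt already have the "good" modes.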

Table 18-4: find Logical Operators

Operator    Description
-and        Match if the tests on both sides of the operator are true. May be shortened to -a. Note that when no operator is present, -and is implied by default.
-or         Match if a test on either side of the operator is true. May be shortened to -o.
-not        Match if the test following the operator is false. May be abbreviated with an exclamation point (!).
( )         Groups tests and operators together to form larger expressions. This is used to control the precedence of the logical evaluations. By default, find evaluates from left to right. It is often necessary to override the default evaluation order to obtain the desired result. Even if not needed, it is sometimes helpful to include the grouping characters to improve the readability of the command. Note that since the parenthesis characters have special meaning to the shell, they must be quoted when using them on the command line to allow them to be passed as arguments to find. Usually the backslash character is used to escape them.

With this list of operators in hand, let's deconstruct our find command. When viewed from the uppermost level, we see that our tests are arranged as two groupings separated by an -or operator: ( expression 1 ) -or ( expression 2 )

This makes sense, since we are searching for files with a certain set of permissions and for directories with a different set. If we are looking for both files and directories, why do we use -or instead of -and? Because as find scans through the files and directories, each one is evaluated to see if it matches the specified tests. We want to know if it is either a file with bad permissions or a directory with bad permissions. It can't be both at the same time. So if we expand the grouped expressions, we can see it this way: ( file with bad permissions ) -or ( directory with bad permissions )

Our next challenge is how to test for "bad permissions." How do we do that? Actually, we don't. What we will test for is "not good permissions," since we know what "good permissions" are. In the case of files, we define good as 0600, and for directories, as 0700. The expression that will test files for "not good" permissions is -type f -and -not -perm 0600

and for directories: -type d -and -not -perm 0700

As noted in the table of operators above, the -and operator can be safely removed, since it is implied by default. So if we put this all back together, we get our final command: find ~ ( -type f -not -perm 0600 ) -or ( -type d -not -perm 0700 )

However, since the parentheses have special meaning to the shell, we must escape them to prevent the shell from trying to interpret them. Preceding each one with a backslash character does the trick.

There is another feature of logical operators that is important to understand. Let's say that we have two expressions separated by a logical operator: expr1 -operator expr2

In all cases, expr1 will always be performed; however, the operator will determine if expr2 is performed. Here's how it works:

Table 18-5: find AND/OR Logic

Result of expr1    Operator    expr2 is...
True               -and        Always performed
False              -and        Never performed
True               -or         Never performed
False              -or         Always performed

Why does this happen? It's done to improve performance. Take -and, for example. We know that the expression expr1 -and expr2 cannot be true if the result of expr1 is false, so there is no point in performing expr2. Likewise, if we have the expression expr1 -or expr2 and the result of expr1 is true, there is no point in performing expr2, as we already know that the expression expr1 -or expr2 is true. OK, so it helps it go faster. Why is this important? It's important because we can rely on this behavior to control how actions are performed, as we shall soon see.
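The -or rows of the table can be observed with a small experiment. The sketch below uses a scratch directory with made-up file names and the -print action, which simply outputs the pathname it is applied to.

```shell
top=$(mktemp -d)
touch "$top/apple" "$top/avocado" "$top/banana"

# For each regular file, -name 'a*' is evaluated first.
# When it is true, -or skips -print; when it is false, -print runs.
printed=$(find "$top" -type f \( -name 'a*' -or -print \))
echo "$printed"
```

Only banana is printed: for apple and avocado the first expression is already true, so find never performs the second one.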

 

Predefined actions

Let's get some work done! Having a list of results from our find command is useful, but what we really want to do is act on the items on the list. Fortunately, find allows actions to be performed based on the search results. There are a set of predefined actions and several ways to apply user-defined actions. First, let's look at a few of the predefined actions:

Table 18-6: Predefined find Actions

Action     Description
-delete    Delete the currently matching file.
-ls        Perform the equivalent of ls -dils on the matching file. Output is sent to standard output.
-print     Output the full pathname of the matching file to standard output. This is the default action if no other action is specified.
-quit      Quit once a match has been made.

As with the tests, there are many more actions. See the find man page for full details. In our very first example, we ran find ~, which produced a list of every file and subdirectory contained within our home directory. It produced a list because the -print action is implied if no other action is specified. Thus, our command could also be expressed as find ~ -print.

We can use find to delete files that meet certain criteria. For example, to delete files that have the file extension ".BAK" (which is often used to designate backup files), we could use this command:
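A sketch of that command follows, pointed at a scratch directory instead of ~ so that nothing of value is at risk. The file names are hypothetical, and the dry run with -print comes first.

```shell
top=$(mktemp -d)
mkdir "$top/sub"
touch "$top/report.txt" "$top/report.BAK" "$top/sub/old.BAK"

# Dry run: print what would be removed.
find "$top" -type f -name '*.BAK' -print

# Then delete the same set of files.
find "$top" -type f -name '*.BAK' -delete

# Only report.txt should be left.
remaining=$(find "$top" -type f)
echo "$remaining"
```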

In this example, every file in the user's home directory (and its subdirectories) is searched for filenames ending in .BAK. When they are found, they are deleted.

Warning: It should go without saying that you should use extreme caution when using the -delete action. Always test the command first by substituting the -print action for -delete to confirm the search results.

Before we go on, let's take another look at how the logical operators affect actions. Consider the following command: find ~ -type f -name '*.BAK' -print

As we have seen, this command will look for every regular file (-type f) whose name ends with .BAK (-name '*.BAK') and will output the relative pathname of each matching file to standard output (-print). However, the reason the command performs the way it does is determined by the logical relationships between each of the tests and actions. Remember, there is, by default, an implied -and relationship between each test and action. We could also express the command this way to make the logical relationships easier to see:
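The fully expressed form can be compared with the implicit one in a short sketch. The scratch directory and its contents are assumptions made for the demonstration.

```shell
top=$(mktemp -d)
mkdir "$top/dir.BAK"              # a directory: fails -type f despite its name
touch "$top/a.BAK" "$top/b.txt"

# Implied -and between the tests and the action.
implicit=$(find "$top" -type f -name '*.BAK' -print)

# The same command with every -and written out.
explicit=$(find "$top" -type f -and -name '*.BAK' -and -print)

echo "$explicit"
```

Both forms print the same single pathname, a.BAK.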


With our command fully expressed, let's look at how the logical operators affect its execution:

Test/Action      Is Performed Only If...
-print           -type f and -name '*.BAK' are true
-name '*.BAK'    -type f is true
-type f          Is always performed, since it is the first test/action in an -and relationship.

Since the logical relationship between the tests and actions determines which of them are performed, we can see that the order of the tests and actions is important. For instance, if we were to reorder the tests and actions so that the -print action was the first one, the command would behave much differently:
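A sketch of the reordered command, using a scratch directory with hypothetical contents in place of ~:

```shell
top=$(mktemp -d)
mkdir "$top/sub"
touch "$top/a.BAK" "$top/b.txt"

# -print comes first and is always true, so every pathname is printed
# before -type and -name are tested.
count=$(find "$top" -print -and -type f -and -name '*.BAK' | wc -l)
echo "$count"
```

All four entries are printed (the scratch directory, sub, a.BAK, and b.txt), not just the one backup file.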

 

This version of the command will print each file (the -print action always evaluates to true) and then test for file type and the specified file extension.

 

User-defined actions

In addition to the predefined actions, we can also invoke arbitrary commands. The traditional way of doing this is with the -exec action. This action works like this: -exec command {} ;

where command is the name of a command, {} is a symbolic representation of the current pathname, and the semicolon is a required delimiter indicating the end of the command. Here's an example of using -exec to act like the -delete action discussed earlier:
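A sketch of such a command, run in a scratch directory with made-up file names:

```shell
top=$(mktemp -d)
touch "$top/keep.txt" "$top/old.BAK"

# rm runs once per match; {} is replaced by the pathname and ';' ends the command.
find "$top" -type f -name '*.BAK' -exec rm '{}' ';'

left=$(find "$top" -type f)
echo "$left"
```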

Again, since the brace and semicolon characters have special meaning to the shell, they must be quoted or escaped.

It's also possible to execute a user-defined action interactively. By using the -ok action in place of -exec, the user is prompted before execution of each specified command:
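A sketch of -ok follows. At a terminal, the reply would be typed by hand. Here the answer "y" is piped in so the example can run unattended. That piping, the scratch directory, and the file names are assumptions of this sketch.

```shell
top=$(mktemp -d)
touch "$top/answers.txt" "$top/notes.txt"

# find prints a prompt on standard error and reads the reply from standard input.
listing=$(printf 'y\n' | find "$top" -type f -name 'an*' -ok ls -l '{}' ';')
printf '%s\n' "$listing"
```

Answering anything other than yes would skip the ls command for that file.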

In this example, we search for files with names starting with the string "an" and execute the command ls -l each time one is found. Using the -ok action prompts the user before the ls command is executed.

 

Improving efficiency

When the -exec action is used, it launches a new instance of the specified command each time a matching file is found. There are times when we might prefer to combine all of the search results and launch a single instance of the command. For example, rather than executing the command once per file (ls -l file1, then ls -l file2), we may prefer to execute it once with all of the files as arguments (ls -l file1 file2), thus causing the command to be executed only one time rather than multiple times. There are two ways we can do this: the traditional way, using the external command xargs, and the alternate way, using a new feature in find itself. We'll talk about the alternate way first.

By changing the trailing semicolon character to a plus sign, we activate the ability of find to combine the results of the search into an argument list for a single execution of the desired command. Going back to our example, an -exec ls -l '{}' ';' action will execute ls each time a matching file is found. By changing the trailing ';' to +, we get the same results, but the system only has to execute the ls command once.
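The difference in the number of launches can be made visible with a small sketch. Instead of ls, it runs a tiny sh -c command that reports how many arguments it received; that stand-in, the scratch directory, and the file names are assumptions chosen for illustration.

```shell
top=$(mktemp -d)
touch "$top/file1" "$top/file2" "$top/file3"

# With ';' the command is launched once per matching file.
per_file=$(find "$top" -type f -exec sh -c 'echo "launched with $# argument(s)"' sh '{}' ';' | wc -l)

# With '+' the pathnames are gathered into a single argument list.
batched=$(find "$top" -type f -exec sh -c 'echo "launched with $# argument(s)"' sh '{}' + | wc -l)

echo "launches with ';': $per_file, launches with '+': $batched"
```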

 xargs

The xargs command performs an interesting function. It accepts input from standard input and converts it into an argument list for a specified command. With our example, we would use it like this:
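A sketch of the pipeline, with a scratch directory and hypothetical file names standing in for the home directory:

```shell
top=$(mktemp -d)
touch "$top/file1" "$top/file2" "$top/notes.txt"

# find prints one pathname per line; xargs turns them into arguments for one ls.
listing=$(find "$top" -type f -name 'file*' -print | xargs ls -l)
echo "$listing"

lines=$(printf '%s\n' "$listing" | wc -l)
```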

Here we see the output of the find command piped into xargs, which, in turn, constructs an argument list for the ls command and then executes it.

Note: While the number of arguments that can be placed into a command line is quite large, it's not unlimited. It is possible to create commands that are too long for the shell to accept. When a command line exceeds the maximum length supported by the system, xargs executes the specified command with the maximum number of arguments possible and then repeats this process until standard input is exhausted. To see the maximum size of the command line, execute xargs with the --show-limits option.

 

Dealing With Funny Filenames

Unix-like systems allow embedded spaces (and even newlines!) in filenames. This causes problems for programs like xargs that construct argument lists for other programs. An embedded space will be treated as a delimiter, and the resulting command will interpret each space-separated word as a separate argument. To overcome this, find and xargs allow the optional use of a null character as argument separator. A null character is defined in ASCII as the character represented by the number zero (as opposed to, for example, the space character, which is defined in ASCII as the character represented by the number 32). The find command provides the action -print0, which produces null-separated output, and the xargs command has the --null option, which accepts null-separated input. Here's an example:
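A sketch of the null-separated pipeline follows. The scratch directory and file names are made up, and the long option --null assumes GNU xargs (other implementations may only offer the short form -0).

```shell
top=$(mktemp -d)
touch "$top/holiday photo.jpg" "$top/plain.JPG"

# Null-separated output and input keep "holiday photo.jpg" as a single argument.
listing=$(find "$top" -iname '*.jpg' -print0 | xargs --null ls -l)
echo "$listing"

lines=$(printf '%s\n' "$listing" | wc -l)
```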

Using this technique, we can ensure that all files, even those containing embedded spaces in their names, are handled correctly.

 

Return to the playground

It's time to put find to some (almost) practical use. We'll create a playground and try out some of what we have learned.

First, let's create a playground with lots of subdirectories and files:
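A sketch of the two commands, mkdir and touch. The cd into a scratch directory and the counting at the end are additions of this sketch, and the brace expansion assumes bash.

```shell
# Work in a scratch directory so nothing existing is disturbed.
cd "$(mktemp -d)"

# The two lines that build the playground (brace expansion requires bash).
mkdir -p playground/dir-{00{1..9},0{10..99},100}
touch playground/dir-{00{1..9},0{10..99},100}/file-{A..Z}

# Count what was created.
dirs=$(ls playground | wc -l)
files=$(find playground -type f | wc -l)
echo "$dirs directories, $files files"
```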

 

Marvel in the power of the command line! With these two lines, we created a playground directory containing one hundred subdirectories, each containing twenty-six empty files. Try that with the GUI!

The method we employed to accomplish this magic involved a familiar command (mkdir), an exotic shell expansion (braces), and a new command, touch. By combining mkdir with the -p option (which causes mkdir to create the parent directories of the specified paths) with brace expansion, we were able to create one hundred directories.

The touch command is usually used to set or update the access, change, and modify times of files. However, if a filename argument is that of a nonexistent file, an empty file is created.

In our playground, we created one hundred instances of a file named file-A. Let's find them:
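A sketch of the search. The first lines recreate the playground in a scratch directory only so that the block can run on its own (brace expansion assumes bash).

```shell
cd "$(mktemp -d)"
# Recreate the playground so this block is self-contained.
mkdir -p playground/dir-{00{1..9},0{10..99},100}
touch playground/dir-{00{1..9},0{10..99},100}/file-{A..Z}

# One file-A per subdirectory; wc confirms how many were found.
count=$(find playground -type f -name 'file-A' | wc -l)
echo "$count"
```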

Next, let's look at finding files based on their modification times. This will be helpful when creating backups or organizing files in chronological order. To do this, we will first use touch to create a reference file, playground/timestamp, against which we will compare modification times.

This creates an empty file named timestamp and sets its modification time to the current time. We can verify this by using another handy command, stat, which is a kind of souped-up version of ls. The stat command reveals all that the system understands about a file and its attributes:
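A sketch of both steps, using a fresh scratch directory so that it can run anywhere. The exact layout of the stat output varies between platforms.

```shell
cd "$(mktemp -d)"
mkdir playground

# Create the empty reference file; its times are set to the current time.
touch playground/timestamp

# Show what the system records about the file.
stat playground/timestamp
```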

If we touch the file again and then examine it with stat, we will see that the file's times have been updated.

Next, let's use find to update some of our playground files. Applying touch through the -exec action to every regular file named file-B updates all the files in the playground with that name. Next, we'll use find to identify the updated files by comparing all the files to the reference file timestamp:
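Both steps can be sketched together. The setup lines recreate the playground and the reference file in a scratch directory so the block runs on its own, and the one-second pause is an assumption added to make the time difference unambiguous.

```shell
cd "$(mktemp -d)"
# Recreate the playground and the reference file (brace expansion requires bash).
mkdir -p playground/dir-{00{1..9},0{10..99},100}
touch playground/dir-{00{1..9},0{10..99},100}/file-{A..Z}
touch playground/timestamp

# Pause so the next modification times are clearly later than timestamp,
# even on file systems with coarse time resolution.
sleep 1

# Update the modification time of every file named file-B.
find playground -type f -name 'file-B' -exec touch '{}' ';'

# Count the files modified more recently than the reference file.
newer=$(find playground -type f -newer playground/timestamp | wc -l)
echo "$newer"
```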

The results contain all one hundred instances of file-B. Since we performed a touch on all the files in the playground named file-B after we updated timestamp, they are now "newer" than timestamp and thus can be identified with the -newer test.

Finally, let's go back to the bad permissions test we performed earlier and apply it to playground by giving playground, rather than ~, as the directory to search.

This command lists all one hundred directories and twenty-six hundred files in playground (as well as timestamp and playground itself, for a total of 2702) because none of them meets our definition of "good permissions." With our knowledge of operators and actions, we can add actions to this command to apply new permissions to the files and directories in our playground:
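A sketch of the command with actions added. To keep the run short, it uses a reduced playground (three directories of two files each, an assumption of this sketch), and umask 022 is set so that the starting modes are the usual 0644 and 0755.

```shell
umask 022
cd "$(mktemp -d)"
# Reduced playground: 3 directories x 2 files, plus the timestamp file.
mkdir -p playground/dir-{1..3}
touch playground/dir-{1..3}/file-{A,B} playground/timestamp

# How many items fail the "good permissions" definition before the change?
bad_before=$(find playground \( -type f -not -perm 0600 \) -or \( -type d -not -perm 0700 \) | wc -l)

# The same tests, each group now followed by a chmod action.
find playground \( -type f -not -perm 0600 -exec chmod 0600 '{}' ';' \) \
    -or \( -type d -not -perm 0700 -exec chmod 0700 '{}' ';' \)

# Run the test again to confirm nothing is left to fix.
bad_after=$(find playground \( -type f -not -perm 0600 \) -or \( -type d -not -perm 0700 \) | wc -l)
echo "before: $bad_before, after: $bad_after"
```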

On a day-to-day basis, we might find it easier to issue two commands, one for the directories and one for the files, rather than this one large compound command, but it's nice to know that we can do it this way. The important point here is to understand how the operators and actions can be used together to perform useful tasks.

Finally, we have the options. The options are used to control the scope of a find search. They may be included with other tests and actions when constructing find expressions. Here is a list of the most commonly used ones:

Table 18-7: find Options

Option              Description
-depth              Direct find to process a directory's files before the directory itself. This option is automatically applied when the -delete action is specified.
-maxdepth levels    Set the maximum number of levels that find will descend into a directory tree when performing tests and actions.
-mindepth levels    Set the minimum number of levels that find will descend into a directory tree before applying tests and actions.
-mount              Direct find not to traverse directories that are mounted on other file systems.
-noleaf             Direct find not to optimize its search based on the assumption that it is searching a Unix-like file system.
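The two depth options can be tried out with a small sketch. The directory tree and file names below are hypothetical and created in a scratch directory.

```shell
cd "$(mktemp -d)"
mkdir -p tree/a/b
touch tree/top.txt tree/a/mid.txt tree/a/b/deep.txt

# Descend at most one level below the starting directory.
shallow=$(find tree -maxdepth 1 -type f)

# Apply the tests only from the third level down.
deep=$(find tree -mindepth 3 -type f)

echo "$shallow"
echo "$deep"
```

The first search reports only tree/top.txt, and the second only tree/a/b/deep.txt.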

Reposted from: https://www.cnblogs.com/itmeatball/p/7629038.html
