I used Ubuntu for about a semester and tried building a website with it. Whenever I ran into a problem I relied on search engines, so I felt I needed to study Unix/Linux systematically. I tried two chapters of Advanced Programming in the Unix Environment and found it a bit of a struggle. Before that I had read 《鸟叔的Linux私房菜》 and felt it was not my cup of tea. Then I discovered The Linux Command Line. Yes, TLCL is my cup of tea.
Some important quotes
The ~/bin directory is a good place to put scripts intended for personal use. If we write a script that everyone on a system is allowed to use, the traditional location is /usr/local/bin. Scripts intended for use by the system administrator are often located in /usr/local/sbin. In most cases, locally supplied software, whether scripts or compiled programs, should be placed in the /usr/local hierarchy and not in /bin or /usr/bin. These directories are specified by the Linux Filesystem Hierarchy Standard to contain only files supplied and maintained by the Linux distributor.
One of the key goals of serious script writing is ease of maintenance; that is, the ease with which a script may be modified by its author or others to adapt it to changing needs. Making a script easy to read and understand is one way to facilitate easy maintenance.
Programs are usually built up in a series of stages, with each stage adding features and capabilities.
When writing programs, it's always a good idea to strive for simplicity and clarity. Maintenance is easier when a program is easy to read and understand, not to mention that it can make the program easier to write by reducing the amount of typing.
This process of identifying the top-level steps and developing increasingly detailed views of those steps is called top-down design. This technique allows us to break complex tasks into many small, simple tasks.
While developing our program, it is useful to keep the program in a runnable state. By doing this, and testing frequently, we can detect errors early in the development process.
Very often the difference between a well-written program and a poorly written one is in the program's ability to deal with the unexpected.
The presence of multiple ‘exit’ points in a program is generally a bad idea.
Some important facts
E-mail is an intrinsically text-based medium. Even non-text attachments are converted into a text representation for transmission.
Unix ends a line with a linefeed character (ASCII 10), while MS-DOS and its derivatives use the sequence carriage return (ASCII 13) and linefeed to terminate each line of text.
If we apply database terminology to the table above, we would say that each row is a record consisting of multiple fields.
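A minimal sketch of the line-ending difference, assuming only printf, tr, and wc are available. The sample text is made up for illustration; deleting every carriage return with tr is one simple way to turn DOS endings into Unix endings.

```bash
# Two lines of sample text with MS-DOS endings (carriage return + linefeed).
dos_bytes=$(( $(printf 'one\r\ntwo\r\n' | wc -c) ))

# Deleting every carriage return (ASCII 13) leaves Unix-style endings.
unix_bytes=$(( $(printf 'one\r\ntwo\r\n' | tr -d '\r' | wc -c) ))

echo "DOS: $dos_bytes bytes, Unix: $unix_bytes bytes"
```

The two byte counts differ by exactly one per line, which is the extra carriage return.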
sort allows multiple instances of the -k option so that multiple sort keys can be specified.
The option letters are the same as the global options for the sort program: b (ignore leading blanks), n (numeric sort), r (reverse sort), and so on.
The key option allows specification of offsets within fields, so we can define keys within fields.
Compared to sort, the uniq program is a lightweight. uniq performs a seemingly trivial task.
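As a rough illustration of multiple sort keys, the sketch below sorts some made-up two-column data by name and then by number in descending order. The data and variable names are assumptions for the example only.

```bash
# Hypothetical sample data: a name followed by a number.
data='Fedora 10
Ubuntu 8
Fedora 9
Ubuntu 10'

# Key 1: field 1 only, alphabetical.
# Key 2: field 2 only, numeric (n) and reversed (r).
sorted=$(printf '%s\n' "$data" | sort -k 1,1 -k 2,2nr)
printf '%s\n' "$sorted"
```

Writing 1,1 rather than just 1 matters: without the end position, the first key would run to the end of the line and the second key would rarely be consulted.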
When given a sorted file, it removes any duplicate lines and sends the results to standard output. uniq only removes duplicate lines which are adjacent to each other.
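The adjacency requirement can be seen in a small sketch like the following, which counts the lines uniq lets through with and without sorting first. The three-line input is invented for the example.

```bash
# Without sorting, the two "a" lines are not neighbours, so uniq keeps both.
unsorted_count=$(( $(printf 'a\nb\na\n' | uniq | wc -l) ))

# Sorting first makes the duplicates adjacent, so uniq can drop one.
sorted_count=$(( $(printf 'a\nb\na\n' | sort | uniq | wc -l) ))

echo "without sort: $unsorted_count lines, with sort: $sorted_count lines"
```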
The cut program is used to extract a section of text from a line and output the extracted section to standard output. cut is best used to extract text from files that are produced by other programs.
The paste command does the opposite of cut. Rather than extracting a column of text from a file, it adds one or more columns of text to a file.
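A short sketch of paste and cut working as opposites, assuming two small made-up files in a temporary directory. The file names and contents are hypothetical.

```bash
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$tmpdir/names.txt"
printf '2006\n2007\n'  > "$tmpdir/years.txt"

# paste glues corresponding lines together, separated by a tab.
pasted=$(paste "$tmpdir/names.txt" "$tmpdir/years.txt")

# cut -f 2 pulls the second tab-delimited field back out.
second_column=$(printf '%s\n' "$pasted" | cut -f 2)

printf '%s\n' "$pasted"
rm -r "$tmpdir"
```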
A join is an operation usually associated with relational databases where data from multiple tables with a shared key field is combined to form a desired result.
It is important to point out that the files must be sorted on the key field for join to work properly.
comm produces three columns of output. The first column contains lines unique to the first file argument; the second column, the lines unique to the second file argument; the third column contains the lines shared by both files.
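A minimal sketch of join on a shared first field, using two invented files that are already sorted on that field. Only keys present in both files appear in the result.

```bash
tmpdir=$(mktemp -d)
# Both hypothetical files are sorted on the first (key) field.
printf '1 apple\n2 banana\n3 cherry\n' > "$tmpdir/names.txt"
printf '1 red\n3 dark-red\n'           > "$tmpdir/colors.txt"

# join matches lines whose first fields are equal and merges them.
joined=$(join "$tmpdir/names.txt" "$tmpdir/colors.txt")
printf '%s\n' "$joined"
rm -r "$tmpdir"
```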
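The three columns can be selected individually by suppressing the others. The sketch below assumes two small sorted files with made-up contents.

```bash
tmpdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$tmpdir/file1.txt"
printf 'b\nc\nd\n' > "$tmpdir/file2.txt"

# -12 suppresses columns 1 and 2, leaving only lines shared by both files.
shared=$(comm -12 "$tmpdir/file1.txt" "$tmpdir/file2.txt")

# -23 suppresses columns 2 and 3, leaving only lines unique to the first file.
only_first=$(comm -23 "$tmpdir/file1.txt" "$tmpdir/file2.txt")

rm -r "$tmpdir"
echo "shared: $shared"
echo "only in first: $only_first"
```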
diff is used to detect the difference between files. diff is often used by software developers to examine changes between different versions of program source code, and hence has the ability to recursively examine directories of source code often referred to as source trees.
The patch program is used with diff.
diff -Naur old_file new_file > diff_file
patch < diff_file
ROT13: not so secret decoder ring
sed is a stream editor. It performs text editing on a stream of text, either a set of specified files or standard input. It is most often used for simple one-line tasks rather than long scripts.
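ROT13 is a classic use of tr: each letter is shifted 13 places, so applying the same translation twice restores the original. A minimal sketch with a sample string:

```bash
# Shift every letter 13 places through the alphabet.
encoded=$(echo "secret text" | tr 'a-zA-Z' 'n-za-mN-ZA-M')

# Running the same translation again undoes it.
decoded=$(echo "$encoded" | tr 'a-zA-Z' 'n-za-mN-ZA-M')

echo "$encoded"
echo "$decoded"
```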
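Two typical one-liners, shown as a sketch on invented input: a substitution and printing a single line by number.

```bash
# s/regexp/replacement/ replaces the first match on each line.
replaced=$(echo "front" | sed 's/front/back/')

# -n suppresses automatic printing; 2p prints only line 2.
second_line=$(printf 'one\ntwo\nthree\n' | sed -n '2p')

echo "$replaced"
echo "$second_line"
```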
There are a few more interesting text manipulation commands worth investigating. Among them are: split (split files into pieces), csplit (split files into pieces based on context), and sdiff (side-by-side merge of file differences).
Printing on Unix-like systems goes way back to the beginning of the operating system itself.
A 300-dot-per-inch (DPI) laser printer (assuming an 8-by-10-inch print area per page) requires (8*300)*(10*300)/8 = 900,000 bytes.
A clever invention was needed. That invention turned out to be the page description language (PDL).
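The same arithmetic can be checked with shell arithmetic expansion. The variable names are chosen for this sketch; one bit per dot is assumed, as in the formula.

```bash
dpi=300
width_in=8
height_in=10

# One bit per dot: dots across times dots down.
bits=$(( (width_in * dpi) * (height_in * dpi) ))

# Eight bits per byte.
bytes=$(( bits / 8 ))

echo "$bits bits = $bytes bytes per page"
```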
As the years went by, both computers and networks became much faster. This allowed the RIP to move from the printer to the host computer, which, in turn, permitted high-quality printers to be much less expensive.
Modern Linux systems employ two software suites to perform and manage printing. The first, CUPS (Common Unix Printing System), provides print drivers and print-job management; the second, Ghostscript, a PostScript interpreter, acts as a RIP.
To see the status of a printer queue, the lpq program is used. This allows us to view the status of the queue and the print jobs it contains.
Programs written in assembly language are processed into machine language by a program called an assembler.
Providing support for common tasks is accomplished by what are called libraries. If we look in the /lib and /usr/lib directories, we can see where many of them live.
A program called a linker is used to form the connections between the output of the compiler and the libraries that the compiled program requires. The final result of this process is the executable program file, ready for use.
Scripted languages are executed by a special program called an interpreter. An interpreter inputs the program file and reads and executes each instruction contained within it.
So why are interpreted languages so popular? For many programming chores, the results are “fast enough,” but the real advantage is that it is generally faster and easier to develop interpreted programs than compiled programs.
Since we are the “maintainer” of this source code while we compile it, we will keep it in ~/src. Source code installed by your distribution will be installed in /usr/src, while source code intended for use by [...] INSTALL files before attempting to build the program.
/usr/include
They are supplied by the system to support the compilation of every program.
The header files in this directory were installed when we installed the compiler.
Most programs build with a simple, two-command sequence:
./configure
make
The configure program is a shell script which is supplied with the source tree. Its job is to analyze the build environment.
We see configure created several new files in our source directory. The most important one is Makefile. Makefile is a configuration file that instructs the make program exactly how to build the program. Without it, make will refuse to run.
The ability of make to intelligently build only what needs building is a great benefit to programmers.
Well-packaged source code will often include a special make target called install. This target will install the final product in a system directory for use.
Usually, this directory is /usr/local/bin, the traditional location for locally built software.
./configure
make
make install
The make program can be used for any task that needs to maintain a target/dependency relationship, not just for compiling source code.
To successfully create and run a shell script, we need to do three things:
1. write a script
2. make the script executable
3. put the script somewhere the shell can find it
The shebang (#!) is used to tell the system the name of the interpreter that should be used to execute the script that follows.
The list of directories is held within an environment variable named PATH. The PATH variable contains a colon-separated list of directories to be searched.
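The three steps can be sketched end to end as follows. A temporary directory stands in for ~/bin here so the example does not touch the real home directory; the script name hello_world is just an illustration.

```bash
bindir=$(mktemp -d)   # stand-in for ~/bin in this sketch

# 1. Write the script. Its first line is the shebang.
printf '#!/bin/sh\necho "Hello World!"\n' > "$bindir/hello_world"

# 2. Make the script executable.
chmod 755 "$bindir/hello_world"

# 3. Put its directory on PATH so the shell can find it by name.
PATH="$bindir:$PATH"

output=$(hello_world)
echo "$output"
rm -r "$bindir"
```

Until step 3, the script can only be run by giving an explicit path such as ./hello_world.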
The dot (.) command is a synonym for the source command, a shell builtin which reads a specified file of shell commands and treats it like input from the keyboard.
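A small sketch of the effect: a variable defined in a separate file becomes visible in the current shell once the file is read with the dot command. The temporary file and the variable name greeting are made up for the example.

```bash
settings=$(mktemp)
echo 'greeting="hello from the sourced file"' > "$settings"

# Read the file's commands into the current shell, as if typed here.
. "$settings"

echo "$greeting"
rm "$settings"
```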
In the interest of reducing typing, short options are preferred when entering options on the command line, but when writing scripts, long options can provide improved readability.
By using line continuations and indentation, the logic of this complex command is more clearly described to the reader.
The shell makes no distinction between variables and constants. A common convention is to use uppercase letters to designate constants and lowercase letters for true variables.
Unlike some other programming languages, the shell does not care about the type of data assigned to a variable; it treats them all as strings.
A here document is an additional form of I/O redirection in which we embed a body of text into our script and feed it into the standard input of a command.
command << token
text
token
If we change the redirection operator from “<<” to “<<-”, the shell will ignore leading tab characters in the here document. This allows a here document to be indented, which can improve readability.
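A minimal sketch of the plain “<<” form, feeding two lines of text to cat and saving them in a temporary file. The token _EOF_ and the variable title are arbitrary choices for this example; note that variables inside the here document are expanded.

```bash
title="System Report"   # hypothetical value
out=$(mktemp)

# Everything up to the closing token becomes cat's standard input.
cat << _EOF_ > "$out"
Title: $title
Generated by a here document.
_EOF_

first_line=$(head -n 1 "$out")
line_count=$(( $(wc -l < "$out") ))
rm "$out"
echo "$first_line ($line_count lines)"
```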
The shell provides a parameter $? that we can use to examine the exit status.
What the if statement really does is evaluate the success or failure of commands.
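The following sketch shows $? after a command that succeeds, and how if branches on the exit status of the command it runs. true and false are used because their exit statuses are fixed.

```bash
true
status_true=$?        # 0 means success

# if runs the command and takes the else branch when it fails.
if false; then
    branch="then"
else
    status_false=$?   # exit status of the failed command
    branch="else"
fi

echo "true -> $status_true, false -> $status_false, branch taken: $branch"
```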
[[ expression ]] acts as an enhanced replacement for test. Among other things, it adds the =~ operator for matching a string against a regular expression.
In addition to the [[]] compound command, bash also provides the (()) compound command, which is useful for operating on integers. It supports a full set of arithmetic evaluations.
Because the compound command (()) is part of the shell syntax rather than an ordinary command, and it deals only with integers, it is able to recognize variables by name and does not require expansion to be performed.
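A brief sketch of both compound commands; it needs bash, since neither is part of POSIX sh. The sample values are invented.

```bash
input="-42"   # hypothetical user input

# [[ ]] with =~ tests the string against a regular expression.
if [[ "$input" =~ ^-?[0-9]+$ ]]; then
    kind="integer"
else
    kind="not an integer"
fi

count=7
# Inside (( )) the variable is named directly, without a leading $.
if (( count % 2 == 0 )); then
    parity="even"
else
    parity="odd"
fi

echo "$input is an $kind; $count is $parity"
```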
The test [ ] and [[ ]] do roughly the same thing. Which is preferable? test is traditional (and part of POSIX), whereas [[ ]] is specific to bash.
[ -d temp ] || mkdir temp
The read builtin command is used to read a single line of standard input. If no variables are listed after the read command, a shell variable, REPLY, will be assigned all the input.
You can't pipe read.
In bash, pipelines create subshells. These are copies of the shell and its environment which are used to execute the commands in the pipeline.
A subshell can never alter the environment of its parent process.
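The effect is easy to reproduce. In this sketch, assuming bash with its default settings, the value read through a pipe is lost when the subshell exits, while reading through a redirection happens in the current shell and sticks.

```bash
line="unset"

# read runs in a subshell here, so the assignment never reaches the parent.
echo "piped" | read -r line
after_pipe=$line

# With a redirection there is no pipeline, so read runs in the current shell.
read -r line << _EOF_
redirected
_EOF_
after_redirect=$line

echo "after pipe: $after_pipe, after redirect: $after_redirect"
```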
It would be better if we could somehow construct the program so that it could repeat the menu display and selection over and over, until the user chooses to exit the program.
Bash provides two builtin commands that can be used to control program flow inside loops: the break command and the continue command.
Since true will always exit with an exit status of zero, the loop will never end. This is a surprisingly common scripting technique.
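A small sketch of an endless while true loop controlled with break and continue. It sums the odd numbers from 1 to 5; the limit and variable names are arbitrary.

```bash
count=0
odd_sum=0

while true; do
    count=$((count + 1))

    # break leaves the loop entirely.
    if [ "$count" -gt 5 ]; then
        break
    fi

    # continue skips the rest of this pass and starts the next one.
    if [ $((count % 2)) -eq 0 ]; then
        continue
    fi

    odd_sum=$((odd_sum + count))
done

echo "sum of odd numbers up to 5: $odd_sum"
```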
While and until can process standard input. This allows files to be processed with while and until loops.
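One common shape for this is a while read loop with the file redirected into the loop as a whole. The sketch below uses a temporary file with made-up name and value pairs.

```bash
tmpfile=$(mktemp)
printf 'alpha 1\nbeta 2\n' > "$tmpfile"

total=0
names=""

# read splits each line into fields; the loop ends at end of file.
while read -r name value; do
    names="$names$name "
    total=$((total + value))
done < "$tmpfile"

rm "$tmpfile"
echo "names: $names total: $total"
```

Because the redirection is attached to done, the loop runs in the current shell and the variables keep their values afterwards.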
Some commands
cat: concatenate files and print on the standard output
sort: sort lines of text files
uniq: report or omit repeated lines
cut: remove sections from each line of files
paste: merge lines of files
join: join lines of two files on a common field
comm: compare two sorted files line by line
diff: compare files line by line
patch: apply a diff file to an original
tr: translate or delete characters
sed: stream editor for filtering and transforming text
aspell: interactive spell checker
nl: number lines
fold: wrap each line to a specified length
fmt: a simple text formatter
pr: prepare text for printing
printf: format and print data
groff: a document formatting system
pr: convert text files for printing
lpr: print files
lp: print files (System V)
a2ps: format files for printing on a PostScript printer
lpstat: show printer status information
lpq: show printer queue status
lprm: cancel print job
cancel: cancel print job (System V)
make: utility to maintain programs