1. Extract the base address of a URL:
$ url="http://www.flickr.com/search/?q=linux"
$ echo $url | egrep -o "https?://[a-z.]+"
http://www.flickr.com
$ echo $url | egrep -o "https?://[a-z\.]+"
http://www.flickr.com
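Inside [], a dot no longer means "any single character"; it stands for a literal dot, so the backslash in the second command is unnecessary. In a POSIX bracket expression the backslash is itself literal, so [a-z\.] would also match a backslash.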
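The difference between a dot outside and inside brackets can be checked with a small sketch. The two input lines (a.c and abc) are made-up test data, not from the example above.

```shell
#!/bin/sh
# Outside brackets, an unescaped dot matches any single character,
# so both "a.c" and "abc" are counted.
any=$(printf 'a.c\nabc\n' | grep -c 'a.c')

# Inside brackets, the dot is literal, so only "a.c" is counted.
literal=$(printf 'a.c\nabc\n' | grep -c 'a[.]c')

echo "unescaped dot: $any line(s), bracketed dot: $literal line(s)"
```

This prints "unescaped dot: 2 line(s), bracketed dot: 1 line(s)".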
2. Sort the list of actresses on a web page:
$ lynx -dump http://www.johntorres.net/BoxOfficefemaleList.html | grep -o "Rank-.*" | sed 's/Rank-//; s/\[[0-9]\+\]//' | sort -nk 1
1 Keira Knightley
2 Natalie Portman
3 Monica Bellucci
4 Bonnie Hunt
5 Cameron Diaz
6 Annie Potts
7 Liv Tyler
8 Julie Andrews
9 Lindsay Lohan
10 Catherine Zeta-Jones
11 Cate Blanchett
12 Sarah Michelle Gellar
13 Carrie Fisher
14 Shannon Elizabeth
15 Julia Roberts
16 Sally Field
17 Téa Leoni
18 Kirsten Dunst
19 Rene Russo
20 Jada Pinkett
21 Helen Hunt
22 Halle Berry
23 Kate Winslet
24 Margot Kidder
25 Elizabeth Perkins
26 Lucy Liu
27 Geena Davis
28 Rosie O'Donnell
29 Drew Barrymore
30 Sandra Bullock
31 Tia Carrere
32 Julia Stiles
33 Jane Fonda
34 Renée Zellweger
35 Demi Moore
36 Kathy Bates
37 Kate Beckinsale
38 Lea Thompson
39 Talia Shire
40 Queen Latifah
41 Denise Richards
42 Glenn Close
43 Meg Ryan
44 Whoopi Goldberg
45 Nicole Kidman
46 Jennifer Lopez
47 Jennifer Love Hewitt
48 Laura Dern
49 Mary Elizabeth Mastrantonio
50 Jennifer Aniston
51 Alicia Silverstone
52 Laura Linney
53 Elizabeth Hurley
54 Ashley Judd
55 Michelle Pfeiffer
56 Bette Midler
57 Diane Keaton
58 Sigourney Weaver
59 Jennifer Tilly
60 Jodie Foster
61 Courteney Cox
62 Angelina Jolie
63 Neve Campbell
64 Meryl Streep
65 Julianne Moore
66 Goldie Hawn
67 Linda Hamilton
68 Elisabeth Shue
69 Tara Reid
70 Kim Basinger
71 Annette Bening
72 Kristin Scott Thomas
73 Jeanne Tripplehorn
74 Rachel Weisz
75 Gwyneth Paltrow
76 Teri Garr
77 Jamie Lee Curtis
78 Nia Long
79 Madonna
80 Madeleine Stowe
81 Angela Bassett
82 Reese Witherspoon
83 Selma Blair
84 Kirstie Alley
85 Kathleen Quinlan
86 Susan Sarandon
87 Salma Hayek
88 Debra Winger
89 Winona Ryder
90 Charlize Theron
91 Valeria Golino
92 Sharon Stone
93 Jami Gertz
94 Christina Ricci
95 Marisa Tomei
96 Uma Thurman
97 Diane Lane
98 Jennifer Connelly
99 Nancy Travis
100 Heather Graham
101 Sophie Marceau
102 Jessica Lange
103 Kate Hudson
104 Andie MacDowell
105 Naomi Watts
106 Jennifer Jason Leigh
The command above can also be written as:
$ lynx -dump http://www.johntorres.net/BoxOfficefemaleList.html | grep -o "Rank-.*" | sed 's/Rank-//; s/[[0-9]\+]//' | sort -nk 1
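The unescaped form works for a slightly different reason: [[0-9] is a bracket expression matching either "[" or a digit, and the trailing ] is then taken literally. A minimal sketch comparing the two sed scripts follows. The sample line is a hypothetical guess at the shape of the lynx output, and \+ assumes GNU sed.

```shell
#!/bin/sh
# Hypothetical input line; the real lynx -dump output may differ.
sample='Rank-1 [23]Keira Knightley'

# Escaped brackets: match a literal "[", one or more digits, a literal "]".
escaped=$(printf '%s\n' "$sample" | sed 's/Rank-//; s/\[[0-9]\+\]//')

# Unescaped: one or more of "[" or a digit, followed by a literal "]".
unescaped=$(printf '%s\n' "$sample" | sed 's/Rank-//; s/[[0-9]\+]//')

echo "$escaped"
echo "$unescaped"
```

Both lines print "1 Keira Knightley" for this input. The unescaped pattern is looser, though: it would also match text such as "12]" that has no opening bracket.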
3. Extract the title information from a web page:
$ echo "<title>Taiwan becomes the third Chinese special administration area following Hongkong and Macau</title><title>Chinese forces destroy Japan's most powerful fleet in South Pacific</title>" |
> sed 's:</title>:&\n:' | sed 's:.*<title>\([^<]*\).*:\1:'
Taiwan becomes the third Chinese special administration area following Hongkong and Macau
Chinese forces destroy Japan's most powerful fleet in South Pacific
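In the second sed, \([^<]*\) cannot be changed to \([^<].*\): the greedy .* would run to the end of the line, so the captured text would include the closing </title> tag.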
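Why the negated class matters can be seen by running both variants on one line. This is only an illustrative sketch; the sample title string is made up.

```shell
#!/bin/sh
# Made-up sample line with a single title element.
line='<title>First headline</title>'

# [^<]* stops at the first "<", so only the title text is captured.
good=$(printf '%s\n' "$line" | sed 's:.*<title>\([^<]*\).*:\1:')

# [^<].* is greedy: after one non-"<" character it consumes the rest
# of the line, closing tag included.
bad=$(printf '%s\n' "$line" | sed 's:.*<title>\([^<].*\).*:\1:')

echo "$good"
echo "$bad"
```

The first echo prints "First headline"; the second prints "First headline</title>".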
4. Extract the IP address:
$ ifconfig wlan0 | egrep -o "inet addr:[^ ]*" | grep -o "[0-9.]*"
192.168.1.3