# Perl Frequently Asked Questions--FAQ(4)--Data: Strings

### How do I validate input?

The answer to this question is usually a regular expression, perhaps with auxiliary logic. See the more specific questions (numbers, email addresses, etc.) for details.
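As a minimal sketch of the regular-expression approach, the following checks whether a value is an optionally signed decimal integer. The helper name `is_integer` and the sample inputs are assumptions made for this illustration, not part of the FAQ.

```perl
use strict;
use warnings;

# Hypothetical helper: accept only an optionally signed decimal integer.
# \A and \z anchor the pattern to the whole string, so "42abc" is rejected.
sub is_integer {
    my ($input) = @_;
    return (defined $input && $input =~ /\A[+-]?[0-9]+\z/) ? 1 : 0;
}

for my $candidate ('42', '-7', '4.2', 'abc', '') {
    my $verdict = is_integer($candidate) ? 'ok' : 'rejected';
    print "'$candidate': $verdict\n";
}
```

The anchors matter here: without them the pattern would accept any string that merely contains digits somewhere.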

### How do I unescape a string?

It depends just what you mean by "escape". URL escapes are dealt with in the perlfaq9 manpage. Shell escapes with the backslash (\) character are removed with:

    s/\\(.)/$1/g;

This won't expand "\n" or "\t" or any other special escapes.


### How do I expand function calls in a string?

This is documented in the perlref manpage. In general, this is fraught with quoting and readability problems, but it is possible. To interpolate a subroutine call (in a list context) into a string:

    print "My sub returned @{[mysub(1,2,3)]} that time.\n";


If you prefer scalar context, similar chicanery is also useful for arbitrary expressions:

    print "That yields ${\($n + 5)} widgets\n";


See also "How can I expand variables in text strings?" in this section of the FAQ.
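To see both interpolation tricks in a runnable form, here is a small sketch. The body of `mysub` (doubling its arguments) and the value of `$n` are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical subroutine used only for this illustration.
sub mysub { return map { $_ * 2 } @_ }

my $n = 3;

# @{[ ... ]} builds an anonymous array from the call and dereferences it;
# the elements are joined with a space when interpolated.
my $list_form = "My sub returned @{[mysub(1,2,3)]} that time.";

# ${\( ... )} takes a reference to the expression's value and dereferences it.
my $scalar_form = "That yields ${\($n + 5)} widgets";

print "$list_form\n$scalar_form\n";
```

When readability matters more than brevity, assigning the result to a variable first and interpolating that variable is usually the clearer choice.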

### How do I find matching/nesting anything?

This isn't something that can be tackled in one regular expression, no matter how complicated. To find something between two single characters, a pattern like /x([^x]*)x/ will get the intervening bits in $1. For multi-character delimiters, something more like /alpha(.*?)omega/ would be needed. But none of these deals with nested patterns, nor can they. For that you'll have to write a parser.

### How do I reverse a string?

Use reverse() in a scalar context, as documented under reverse in the perlfunc manpage.

    $reversed = reverse $string;

### How do I expand tabs in a string?

You can do it the old-fashioned way:

    1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
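Tab expansion can also be delegated to the Text::Tabs module, which ships with perl. The sketch below assumes tab stops every 8 columns (the module's default) and uses a made-up sample string.

```perl
use strict;
use warnings;
use Text::Tabs;   # exports expand(), unexpand() and $tabstop

$tabstop = 8;     # width of a tab stop; 8 is already the module default

# Sample input chosen for this example: tabs after "a" and after "bc".
my $expanded = expand("a\tbc\td");

print "$expanded\n";
```

Each tab is replaced by enough spaces to reach the next multiple of `$tabstop`, so "bc" starts at column 8 and "d" at column 16 (counting from 0).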
### How can I access/change the first N letters of a string?

You can use substr() for this, although those with a regexp kind of thought process will likely prefer:

    $a =~ s/^.../Tom/;

### How do I change the Nth occurrence of something?

You have to keep track. For example, let's say you want to change the fifth occurrence of "whoever" or "whomever" into "whosoever" or "whomsoever", case insensitively.

    $count = 0;
    s{((whom?)ever)}{
        ++$count == 5       # is it the 5th?
            ? "${2}soever"  # yes, swap
            : $1            # renege and leave it there
    }igex;

### How can I count the number of occurrences of a substring within a string?

There are a number of ways, with varying efficiency. If you want a count of a certain single character (X) within a string, you can use the tr/// function like so:

    $string = "ThisXlineXhasXsomeXx'sXinXit";
    $count = ($string =~ tr/X//);
    print "There are $count X characters in the string";

This is fine if you are just looking for a single character. However, if you are trying to count multiple-character substrings within a larger string, tr/// won't work. What you can do is wrap a while() loop around a global pattern match. For example, let's count negative integers:

    $string = "-9 55 48 -2 23 -76 4 14 -44";
    $count = 0;
    while ($string =~ /-\d+/g) { $count++ }
    print "There are $count negative numbers in the string";

### How do I capitalize all the words on one line?

To make the first letter of each word upper case:

    $line =~ s/\b(\w)/\U$1/g;

This has the strange effect of turning "don't do it" into "Don'T Do It". Sometimes you might want this, instead (suggested by Brian Foy <comdog@computerdog.com>):

    $string =~ s/ (
                   (^\w)    # at the beginning of the line
                     |      # or
                   (\s\w)   # preceded by whitespace
                 )
                /\U$1/xg;
    $string =~ s/([\w']+)/\u\L$1/g;

To make the whole line upper case:

    $line = uc($line);

To force each word to be lower case, with the first letter upper case:

    $line =~ s/(\w+)/\u\L$1/g;

### How can I split a [character] delimited string except when inside [character]? (Comma-separated files)

Take the example case of trying to split a string that is comma-separated into its different fields. (We'll pretend you said comma-separated, not comma-delimited, which is different and almost never what you mean.) You can't use split(/,/) because you shouldn't split if the comma is inside quotes. For example, take a data line like this:

    SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"

Due to the restriction of the quotes, this is a fairly complex problem. Thankfully, we have Jeffrey Friedl, author of a highly recommended book on regular expressions, to handle these for us. He suggests (assuming your string is contained in $text):

    @new = ();
    push(@new, $+) while $text =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?  # groups the phrase inside the quotes
      | ([^,]+),?
      | ,
    }gx;
    push(@new, undef) if substr($text, -1, 1) eq ',';

If you want to represent quotation marks inside a quotation-mark-delimited field, escape them with backslashes (e.g., "like \"this\""). Unescaping them is a task addressed earlier in this section.
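As a rough illustration, the pattern can be wrapped in a small subroutine and run against the sample data line from this answer. The name `split_csv_line` is hypothetical and chosen only for this sketch; the pattern itself is the one quoted in this answer.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the pattern quoted in this answer.
sub split_csv_line {
    my ($text) = @_;
    my @fields;
    # $+ holds whichever capture group took part in the latest match.
    push(@fields, $+) while $text =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?   # quoted field
      | ([^,]+),?                        # unquoted field
      | ,                                # empty field
    }gx;
    # A trailing comma means one more (empty) field at the end.
    push(@fields, undef) if substr($text, -1, 1) eq ',';
    return @fields;
}

my @fields = split_csv_line(
    'SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"'
);

print scalar(@fields), " fields; third is '$fields[2]'\n";
```

The comma inside "Cimetrix, Inc" stays within its field because the quoted alternative is tried before the unquoted one.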
Alternatively, the Text::ParseWords module (part of the standard perl distribution) lets you say:

    use Text::ParseWords;
    @new = quotewords(",", 0, $text);

### How do I strip blank space from the beginning/end of a string?

The simplest approach, albeit not the fastest, is probably like this:

    $string =~ s/^\s*(.*?)\s*$/$1/;

It would be faster to do this in two steps:

    $string =~ s/^\s+//;
    $string =~ s/\s+$//;

Or more nicely written as:

    for ($string) {
        s/^\s+//;
        s/\s+$//;
    }

### How do I extract selected columns from a string?

Use substr() or unpack(), both documented in the perlfunc manpage.

### How do I find the soundex value of a string?

Use the standard Text::Soundex module distributed with perl.

### How can I expand variables in text strings?

Let's assume that you have a string like:

    $text = 'this has a $foo in it and a $bar';

If $foo and $bar are global (package) variables, this substitution expands them:

    $text =~ s/\$(\w+)/${$1}/g;

Before version 5 of perl, this had to be done with a double-eval substitution:

    $text =~ s/(\$\w+)/$1/eeg;

Which is bizarre enough that you'll probably actually need an EEG afterwards. :-)

See also "How do I expand function calls in a string?" in this section of the FAQ.

### What's wrong with always quoting "$vars"?

The problem is that those double-quotes force stringification, coercing numbers and references into strings, even when you don't want them to be.
If you get used to writing odd things like these:

    print "$var";       # BAD
    $new = "$old";      # BAD
    somefunc("$var");   # BAD

You'll be in trouble. Those should (in 99.8% of the cases) be the simpler and more direct:

    print $var;
    $new = $old;
    somefunc($var);

Otherwise, besides slowing you down, you're going to break code when the thing in the scalar is actually neither a string nor a number, but a reference:

    func(\@array);
    sub func {
        my $aref = shift;
        my $oref = "$aref";  # WRONG
    }

You can also get into subtle problems on those few operations in Perl that actually do care about the difference between a string and a number, such as the magical ++ autoincrement operator or the syscall() function.
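The effect of quoting a reference can be checked directly. This is a small sketch with made-up variable contents; it only shows that the quoted copy is no longer a reference.

```perl
use strict;
use warnings;

my @array = (1, 2, 3);

my $aref = \@array;    # a real array reference
my $oref = "$aref";    # quoting turns it into a plain string like "ARRAY(0x...)"

print "ref(\$aref) is '", ref($aref), "'\n";   # 'ARRAY'
print "ref(\$oref) is '", ref($oref), "'\n";   # '' (not a reference any more)

# Dereferencing the real reference still works.
my $count = scalar @{$aref};
print "elements reachable through \$aref: $count\n";
```

Under `use strict`, trying `@{$oref}` would die with a "Can't use string ... as an ARRAY ref" error, which is the breakage described above.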

### Why don't my <<HERE documents work?

Check for these three things:

1. There must be no space after the << part.
2. There (probably) should be a semicolon at the end.
3. You can't (easily) have any space in front of the tag.
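A here-document that satisfies all three checks looks like the sketch below. The tag name `END_TEXT` and the two lines of content are assumptions chosen for this example.

```perl
use strict;
use warnings;

# The tag follows << with no space, the statement ends with a semicolon,
# and the closing tag sits at the very start of its own line.
my $text = <<END_TEXT;
First line
Second line
END_TEXT

print $text;
```

If the closing tag were indented or followed by trailing characters, perl would keep reading past it while looking for the terminator.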


