How do I validate input?
The answer to this question is usually a regular expression, perhaps with auxiliary logic. See the more specific questions (numbers, email addresses, etc.) for details.
How do I unescape a string?
It depends just what you mean by ``escape''. URL escapes are dealt with in the perlfaq9 manpage. Shell escapes with the backslash (\) character are removed with:
s/\\(.)/$1/g;
Note that this won't expand \n or \t or any other special escapes.
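If the common letter escapes should be expanded as well, one possible approach is a small lookup table. The following is only a sketch: the %escape table and the unescape() name are made up for illustration and do not come from any standard module.

```perl
use strict;
use warnings;

# Hypothetical table of the escapes we choose to understand.
my %escape = (n => "\n", t => "\t", r => "\r", 0 => "\0");

sub unescape {
    my ($str) = @_;
    # Known letters become their control character; anything else
    # simply loses its backslash.
    $str =~ s/\\(.)/exists $escape{$1} ? $escape{$1} : $1/ges;
    return $str;
}

# Single quotes keep the backslashes literal in the input.
print unescape('col1\tcol2\\\\done\n');
```

Extending the table is a design choice: the more escapes you add, the closer you get to reimplementing Perl's own double-quote rules.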
How do I remove consecutive pairs of characters?
To turn ``abbcccd'' into ``abccd'':
s/(.)\1/$1/g;
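Note that this collapses pairs, not whole runs. A short sketch of the difference, using variable names chosen only for illustration:

```perl
use strict;
use warnings;

my $s = "abbcccd";

# Each doubled character becomes one: "ccc" is a pair plus a leftover "c".
(my $pairs = $s) =~ s/(.)\1/$1/g;

# Adding + squeezes an entire run down to a single character.
(my $runs = $s) =~ s/(.)\1+/$1/g;

print "$pairs\n";   # abccd
print "$runs\n";    # abcd
```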
How do I expand function calls in a string?
This is documented in the perlref manpage. In general, this is fraught with quoting and readability problems, but it is possible. To interpolate a subroutine call (in a list context) into a string:
print "My sub returned @{[mysub(1,2,3)]} that time.\n";
If you prefer scalar context, similar chicanery is also useful for arbitrary expressions:
print "That yields ${\($n + 5)} widgets\n";
See also ``How can I expand variables in text strings?'' in this section of the FAQ.
How do I find matching/nesting anything?
This isn't something that can be tackled in one regular expression, no matter how complicated. To find something between two single characters, a pattern like /x([^x]*)x/ will get the intervening bits in $1. For multiple ones, then something more like /alpha(.*?)omega/ would be needed. But none of these deals with nested patterns, nor can they. For that you'll have to write a parser.
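For the simplest nested case, balanced parentheses, such a parser can be little more than a depth counter. This is a minimal sketch under that assumption; the top_level_groups() name is hypothetical, and it ignores quoting and escaping entirely.

```perl
use strict;
use warnings;

# Return the contents of each top-level (...) group in $text,
# or an empty list if the parentheses do not balance.
sub top_level_groups {
    my ($text) = @_;
    my ($depth, $start, @groups) = (0);
    for my $pos (0 .. length($text) - 1) {
        my $ch = substr($text, $pos, 1);
        if ($ch eq '(') {
            # Remember where the outermost group begins.
            $start = $pos + 1 if $depth++ == 0;
        }
        elsif ($ch eq ')') {
            return if --$depth < 0;          # stray closing paren
            push @groups, substr($text, $start, $pos - $start)
                if $depth == 0;
        }
    }
    return $depth == 0 ? @groups : ();
}

print join("|", top_level_groups("f(a(b)c) + g(d)")), "\n";   # a(b)c|d
```

For anything richer than this (several bracket types, quoted strings), the Text::Balanced module is worth a look before writing your own.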
How do I reverse a string?
Use reverse() in a scalar context, as documented under reverse in the perlfunc manpage.
$reversed = reverse $string;
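The context matters because in list context reverse() reverses the order of its arguments rather than the characters. A small illustration, with variable names chosen for the example:

```perl
use strict;
use warnings;

my $string = "hello";

my $reversed = reverse $string;          # scalar assignment: characters reversed
my ($unchanged) = reverse($string);      # list context: a one-element list, reversed

print "$reversed\n";                     # olleh
print "$unchanged\n";                    # hello
print scalar(reverse $string), "\n";     # olleh; scalar() forces the context
```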
How do I expand tabs in a string?
You can do it the old-fashioned way:
1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
Or you can just use the Text::Tabs module (part of the standard perl distribution).
use Text::Tabs;
@expanded_lines = expand(@lines_with_tabs);
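Text::Tabs assumes tab stops every 8 columns unless told otherwise. A brief sketch of changing that through the $tabstop variable the module exports; the sample strings are invented:

```perl
use strict;
use warnings;
use Text::Tabs;

$tabstop = 4;    # exported by Text::Tabs; the default is 8

my @expanded = expand("a\tb", "abc\td");
print "[$_]\n" for @expanded;
```

Each tab is replaced by just enough spaces to reach the next multiple of $tabstop, so the two lines above end up with three spaces and one space respectively.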
How do I reformat a paragraph?
Use Text::Wrap (part of the standard perl distribution):
use Text::Wrap;
print wrap("\t", ' ', @paragraphs);
The paragraphs you give to Text::Wrap may not contain embedded newlines. Text::Wrap doesn't justify the lines (flush-right).
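The line width is controlled by the $columns variable, which Text::Wrap exports on request. A small sketch with made-up text and an arbitrary width of 40:

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap $columns);

$columns = 40;   # wrap() keeps every line shorter than this

my $paragraph = join " ", ("The quick brown fox jumps over the lazy dog.") x 3;

# First argument indents the first line, second indents the rest.
my $wrapped = wrap("", "  ", $paragraph);
print $wrapped, "\n";
```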
How can I access/change the first N letters of a string?
There are many ways. If you just want to grab a copy, use substr:
$first_byte = substr($a, 0, 1);
If you want to modify part of a string, the simplest way is often to use substr() as an lvalue:
substr($a, 0, 3) = "Tom";
Although those with a regexp kind of thought process will likely prefer
$a =~ s/^.../Tom/;
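The three approaches side by side, plus the four-argument form of substr(), which replaces in place. The sample name is invented for the illustration:

```perl
use strict;
use warnings;

my $name = "Bob Smith";

my $first = substr($name, 0, 3);     # copy only: "Bob"

substr($name, 0, 3) = "Tom";         # lvalue form: "Tom Smith"

# Four-argument form: the replacement may differ in length.
substr($name, 0, 3, "Timothy");      # "Timothy Smith"

print "$first\n$name\n";
```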
How do I change the Nth occurrence of something?
You have to keep track. For example, let's say you want to change the fifth occurrence of ``whoever'' or ``whomever'' into ``whosoever'' or ``whomsoever'', case insensitively.
$count = 0;
s{((whom?)ever)}{
    ++$count == 5           # is it the 5th?
        ? "${2}soever"      # yes, swap
        : $1                # renege and leave it there
}igex;
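The same idea can be wrapped in a subroutine so that N becomes a parameter. This is a sketch only; replace_nth() is a hypothetical name.

```perl
use strict;
use warnings;

# Replace the $n-th "whoever"/"whomever" in $text, leaving the others alone.
sub replace_nth {
    my ($text, $n) = @_;
    my $count = 0;
    $text =~ s{((whom?)ever)}{ ++$count == $n ? "${2}soever" : $1 }ige;
    return $text;
}

print replace_nth("whoever, Whomever, whoever", 2), "\n";
# whoever, Whomsoever, whoever
```

Because $2 holds the text as it was matched, the capitalization of the original word is preserved.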
How can I count the number of occurrences of a substring within a string?
There are a number of ways, with varying efficiency. If you want a count of a certain single character (X) within a string, you can use the tr/// function like so:
$string = "ThisXlineXhasXsomeXx'sXinXit";
$count = ($string =~ tr/X//);
print "There are $count X characters in the string";
This is fine if you are just looking for a single character. However, if you are trying to count multiple-character substrings within a larger string, tr/// won't work. What you can do is wrap a while() loop around a global pattern match. For example, let's count negative integers:
$string = "-9 55 48 -2 23 -76 4 14 -44";
$count = 0;
while ($string =~ /-\d+/g) { $count++ }
print "There are $count negative numbers in the string";
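A more compact alternative to the loop is to assign the list of matches to an empty list and read that assignment in scalar context, which yields the number of matches. A sketch using the same sample data:

```perl
use strict;
use warnings;

my $string = "-9 55 48 -2 23 -76 4 14 -44";

# The match runs in list context; the outer scalar assignment
# then counts how many elements it returned.
my $negatives = () = $string =~ /-\d+/g;

my $letters = "ThisXlineXhasXsomeXx'sXinXit";
my $x_count = ($letters =~ tr/X//);

print "There are $negatives negative numbers\n";   # 4
print "There are $x_count X characters\n";         # 6
```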
How do I capitalize all the words on one line?
To make the first letter of each word upper case:
$line =~ s/\b(\w)/\U$1/g;
This has the strange effect of turning ``don't do it'' into ``Don'T Do It''. Sometimes you might want this, instead (suggested by Brian Foy <comdog@computerdog.com>):
$string =~ s/ (
    (^\w)     # at the beginning of the line
    |         # or
    (\s\w)    # preceded by whitespace
    )
    /\U$1/xg;
$string =~ s/([\w']+)/\u\L$1/g;
To make the whole line upper case:
$line = uc($line);
To force each word to be lower case, with the first letter upper case:
$line =~ s/(\w+)/\u\L$1/g;
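To see the apostrophe problem concretely, here is a short comparison of the word-boundary version with one that treats the apostrophe as part of the word. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $line = "don't do it";

# \b also matches between the apostrophe and the "t".
(my $naive = $line) =~ s/\b(\w)/\U$1/g;

# Treating ' as a word character keeps "don't" in one piece.
(my $better = $line) =~ s/([\w']+)/\u\L$1/g;

print "$naive\n";    # Don'T Do It
print "$better\n";   # Don't Do It
```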
How can I split a [character] delimited string except when inside [character]? (Comma-separated files)
Take the example case of trying to split a string that is comma-separated into its different fields. (We'll pretend you said comma-separated, not comma-delimited, which is different and almost never what you mean.) You can't use split(/,/) because you shouldn't split if the comma is inside quotes. For example, take a data line like this:
SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"
Due to the restriction of the quotes, this is a fairly complex problem. Thankfully, we have Jeffrey Friedl, author of a highly recommended book on regular expressions, to handle these for us. He suggests (assuming your string is contained in $text):
@new = ();
push(@new, $+) while $text =~ m{
    "([^\"\\]*(?:\\.[^\"\\]*)*)",?  # groups the phrase inside the quotes
    | ([^,]+),?
    | ,
}gx;
push(@new, undef) if substr($text,-1,1) eq ',';
If you want to represent quotation marks inside a quotation-mark-delimited field, escape them with backslashes (eg, "like \"this\""). Unescaping them is a task addressed earlier in this section.
Alternatively, the Text::ParseWords module (part of the standard perl distribution) lets you say:
use Text::ParseWords; @new = quotewords(",", 0, $text);
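As an illustration, here is quotewords() applied to the sample data line from above. The second argument (0) asks the module to drop the quotes from the returned fields.

```perl
use strict;
use warnings;
use Text::ParseWords;

my $text = 'SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"';

my @new = quotewords(",", 0, $text);

print "third field: $new[2]\n";    # Cimetrix, Inc
print "last field:  $new[-1]\n";   # Error, Core Dumped
```

Be aware that Text::ParseWords also treats single quotes as quoting characters, so data containing apostrophes may need different handling.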
How do I strip blank space from the beginning/end of a string?
The simplest approach, albeit not the fastest, is probably like this:
$string =~ s/^\s*(.*?)\s*$/$1/;
It would be faster to do this in two steps:
$string =~ s/^\s+//;
$string =~ s/\s+$//;
Or more nicely written as:
for ($string) {
    s/^\s+//;
    s/\s+$//;
}
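If you need this in more than one place, the two-step version fits naturally in a small subroutine. A sketch, with trim() as a name chosen for the example (Perl itself did not provide one at the time this was written):

```perl
use strict;
use warnings;

# Return a copy of the argument without leading or trailing whitespace.
sub trim {
    my ($s) = @_;
    for ($s) {
        s/^\s+//;
        s/\s+$//;
    }
    return $s;
}

print "[", trim("  \t padded text \n"), "]\n";   # [padded text]
```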
How do I extract selected columns from a string?
Use substr() or unpack(), both documented in the perlfunc manpage.
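A brief sketch of both, assuming a made-up fixed-width record with an 8-column user field, a 6-column pid field, and a command in the remainder:

```perl
use strict;
use warnings;

# Build a sample record; the layout is an assumption for this example.
my $line = sprintf("%-8s%-6s%s", "root", "1234", "init");

# The A format extracts a field and strips its trailing spaces.
my ($user, $pid, $cmd) = unpack("A8 A6 A*", $line);

# substr() takes an offset and a length for a single column.
my $pid_again = substr($line, 8, 4);

print "$user $pid $cmd\n";   # root 1234 init
```

unpack() is the more convenient of the two when several columns are wanted at once.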
How do I find the soundex value of a string?
Use the standard Text::Soundex module distributed with perl.
How can I expand variables in text strings?
Let's assume that you have a string like:
$text = 'this has a $foo in it and a $bar';
If $foo and $bar are package (global) variables, a symbolic reference will expand them:
$text =~ s/\$(\w+)/${$1}/g;
Before version 5 of perl, this had to be done with a double-eval substitution:
$text =~ s/(\$\w+)/$1/eeg;
Which is bizarre enough that you'll probably actually need an EEG afterwards. :-)
See also ``How do I expand function calls in a string?'' in this section of the FAQ.
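A different approach, which works under use strict and does not expose every variable in your program to the template, is to look the names up in a hash. This is a sketch; the %vars hash and its contents are assumptions made for the example.

```perl
use strict;
use warnings;

# Only names listed here can be expanded.
my %vars = (foo => "apple", bar => "banana");

my $text = 'this has a $foo in it and a $bar, but not $baz';

# Unknown names are put back unchanged, dollar sign and all.
(my $expanded = $text) =~ s/\$(\w+)/exists $vars{$1} ? $vars{$1} : "\$$1"/ge;

print "$expanded\n";
# this has a apple in it and a banana, but not $baz
```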
What's wrong with always quoting "$vars"?
The problem is that those double-quotes force stringification, coercing numbers and references into strings, even when you don't want them to be.
If you get used to writing odd things like these:
print "$var"; # BAD
$new = "$old"; # BAD
somefunc("$var"); # BAD
You'll be in trouble. Those should (in 99.8% of the cases) be the simpler and more direct:
print $var;
$new = $old;
somefunc($var);
Otherwise, besides slowing you down, you're going to break code when the thing in the scalar is actually neither a string nor a number, but a reference:
func(\@array);
sub func {
    my $aref = shift;
    my $oref = "$aref"; # WRONG
}
You can also get into subtle problems on those few operations in Perl that actually do care about the difference between a string and a number, such as the magical ++ autoincrement operator or the syscall() function.
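Both problems are easy to demonstrate. In this sketch (variable names invented), quoting turns a reference into a plain string that can no longer be dereferenced, and quoting a number changes what ++ does to it:

```perl
use strict;
use warnings;

my @array = (1, 2, 3);
my $aref  = \@array;
my $copy  = "$aref";     # now only text that looks like ARRAY(0x...)

my $ref_kind  = ref($aref);    # "ARRAY"
my $copy_kind = ref($copy);    # "" because $copy is an ordinary string

# The magical string increment applies to strings like "a9"...
my $label = "a9";
$label++;                      # becomes "b0"

# ...while a real number is incremented arithmetically.
my $number = 9;
$number++;                     # becomes 10

print "ref: [$ref_kind] copy: [$copy_kind] label: $label number: $number\n";
```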
Why don't my <<HERE documents work?
Check for these three things:
- There must be no space after the << part.
- There (probably) should be a semicolon at the end.
- You can't (easily) have any space in front of the tag.
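A here document that satisfies these points might look like the following sketch. The END_TEXT tag and the variable names are arbitrary choices for the example.

```perl
use strict;
use warnings;

my $name = "world";

# No space between << and the tag, and a semicolon ends the statement.
my $message = <<"END_TEXT";
Hello, $name.
    Indentation inside the body is kept as-is.
END_TEXT

print $message;
```

The closing END_TEXT sits at the very start of its line, with nothing before or after it.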