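Other people have pointed out that you want the /s option so that . matches newlines, which lets .* cross logical line boundaries. You probably also want the non-greedy .*?: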
use v5.10;
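# Sample row; the <td> markup is assumed
my $html = <<'HTML';
<tr>
<td>Activation Date:</td>
<td>10/27/2011</td>
</tr>
HTML
# /x allows whitespace in the pattern; /s lets . match newlines
my $regex = qr|
    <td>Activation \s+ Date:\s*</td>
    \s*<td>(\S+)</td>
    \s*</tr>
    |xs;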
if ($html =~ $regex) {
    say "matched: $1";
}
else {
    say "mismatched!";
}
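To see what /s and the non-greedy quantifier each contribute, here is a minimal sketch. The two-row sample string and the variable names are made up for illustration and are not taken from the question.

```perl
use v5.10;
use strict;
use warnings;

# Hypothetical sample: two rows, with each cell on its own line.
my $html = <<'HTML';
<tr>
<td>Activation Date:</td>
<td>10/27/2011</td>
</tr>
<tr>
<td>Expiry Date:</td>
<td>10/27/2012</td>
</tr>
HTML

# Without /s, . stops at a newline, so .* cannot reach the next line.
my $no_s = $html =~ m|Activation Date:.*<td>(\S+?)</td>| ? $1 : undef;

# With /s, a greedy .* runs to the last possible <td>, skipping a row.
my ($greedy) = $html =~ m|Activation Date:.*<td>(\S+?)</td>|s;

# With /s and a non-greedy .*?, the match stops at the nearest <td>.
my ($lazy) = $html =~ m|Activation Date:.*?<td>(\S+?)</td>|s;

say 'no /s:  ', $no_s // 'no match';
say 'greedy: ', $greedy;
say 'lazy:   ', $lazy;
```

With this sample, the first pattern fails to match, the greedy one captures 10/27/2012 from the second row, and the non-greedy one captures 10/27/2011.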
If you have a complete table, it's easier to use something that knows how to parse tables. Let a module such as HTML::TableParser handle all of the details:
use v5.10;
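# Sample table; the exact markup is assumed
my $html = <<'HTML';
<table>
<tr><td>Activation Date:</td><td>10/27/2011</td></tr>
</table>
HTML
use HTML::TableParser;
# Called once per table row; $data is an array ref of that row's cells
sub row {
    my($tbl_id, $line_no, $data, $udata) = @_;
    return unless $data->[0] eq 'Activation Date:';
    say "Date is $data->[1]";
}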
# create parser object
my $p = HTML::TableParser->new(
    [ { id => 1, row => \&row } ],
    { Decode => 1, Trim => 1, Chomp => 1 },
);
$p->parse($html);
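There's also HTML::TableExtract:
use v5.10;
# Sample table; the exact markup is assumed
my $html = <<'HTML';
<table>
<tr><td>Activation Date:</td><td>10/27/2011</td></tr>
</table>
HTML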
use HTML::TableExtract;
my $p = HTML::TableExtract->new;
$p->parse($html);
my $table_tree = $p->first_table_found;
my $date = $table_tree->cell(0, 1);
$date =~ s/\A\s+|\s+\z//g;
say "Date is $date";