I seem to be having problems either with HTML::HTML5::Microdata::Parser or RDF::Query, or with SPARQL syntax and semantics. I'm interested in this for a news site page.
Here is my test code:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
use HTML::HTML5::Microdata::Parser;
use RDF::Query;
use IO::Handle;
use LWP::Simple;
STDOUT->binmode(":utf8");
STDERR->binmode(":utf8");
my $htmldoc = LWP::Simple::get(
    "http://zpravy.idnes.cz/zacinaji-zapisy-do-prvnich-trid-dn3-/domaci.aspx?c=A160114_171615_domaci_zt");
# LWP::Simple::get returns undef on failure; it does not set $@.
die "Could not fetch URL.\n" unless defined $htmldoc;
my $microdata = HTML::HTML5::Microdata::Parser->new(
    $htmldoc, $ARGV[0],
    {auto_config => 1, tdb_service => 1, xhtml_meta => 1, xhtml_rel => 1});
print STDERR "microdata->graph:\n", Dumper($microdata->graph), "\n";
my $query = RDF::Query->new(<<'SPARQL');
PREFIX schema: <http://schema.org/>
SELECT *
WHERE {
    ?author a schema:Person .
}
SPARQL
my $people = $query->execute($microdata->graph);
print STDERR "authors from RDF:\n", Dumper($people), "\n";
while (my $person = $people->next) {
    print STDERR "people: ", $person, "\n";
}
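The options passed to HTML::HTML5::Microdata::Parser are just my last-ditch attempt to get this working. (I basically don't know what I'm doing.)
Any ideas on how to make this work and get the author's name?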
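To separate the SPARQL side of the problem from the Microdata parsing side, here is a minimal sketch that runs a name lookup against a small hand-built model instead of the fetched page. The Turtle data, the `example.org` IRIs, the name "Jan Novak", and the helpers `build_model` and `person_names` are all hypothetical and exist only for illustration. The sketch also assumes that authors are typed as `schema:Person` and carry a `schema:name` literal, which the real page may or may not do.

```perl
use strict;
use warnings;
use RDF::Trine;
use RDF::Query;

# Hypothetical stand-in for the parsed page: one schema:Person with a name.
my $turtle = <<'TURTLE';
@prefix schema: <http://schema.org/> .
<http://example.org/article#author> a schema:Person ;
    schema:name "Jan Novak" .
TURTLE

# Build an in-memory RDF::Trine model from the Turtle above.
sub build_model {
    my $model = RDF::Trine::Model->temporary_model;
    RDF::Trine::Parser->new('turtle')
        ->parse_into_model('http://example.org/', $turtle, $model);
    return $model;
}

# Return the schema:name literal of every schema:Person in the model.
sub person_names {
    my ($model) = @_;
    my $query = RDF::Query->new(<<'SPARQL');
PREFIX schema: <http://schema.org/>
SELECT ?author ?name
WHERE {
    ?author a schema:Person ;
            schema:name ?name .
}
SPARQL
    die "Could not parse query: " . RDF::Query->error . "\n" unless $query;

    my $rows = $query->execute($model);
    my @names;
    while (my $row = $rows->next) {
        # Each row maps variable names to RDF::Trine nodes.
        push @names, $row->{name}->literal_value;
    }
    return @names;
}

print "author: $_\n" for person_names(build_model());
```

If this works on the hand-built model but the same query returns nothing for `$microdata->graph`, the likely place to look is the triples the parser actually produced: the type and property IRIs in the dumped graph may differ from `http://schema.org/Person` and `http://schema.org/name`.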