Hopefully this is a simple question for someone who has done it before!
I have a set of old web documents in table format with lots of contact details in them. What I have managed so far is to create a PHP script that parses an XHTML doc and pulls out the old client contact details.
An example of the document format:
Indigo Blue 123 123 Blue House Hanley ST13 4SN Stoke on Trent 01875 322511 www.indigoblue123.org.uk
What I need to do is parse all of these contact details into an array. The two things I'm not sure how to do are capturing the empty blocks as empty array entries (i.e. Address 2 and Address 3 will be blank, but I need to know this) and grabbing the web address from the .. block.
So far I have figured out that all populated data has class=details in some form. However, as I mentioned before, I'm not sure what the best way to achieve the overall result is. There are around 20-40 entries in the different files I have.
I have managed the basics with this so far:
<?php
print '<pre>';
$html = file_get_contents('old-contacts.xhtml');
// Create a new DOM object:
$dom = new DOMDocument();
// Load the HTML code:
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
// Select every row that has a cell containing a font element with class "details":
$details = $xpath->query("//table/tbody/tr[td/font/@class = 'details']");
// Collect the text of each matching row:
$data = array();
for ($i = 0; $i < $details->length; $i++) {
    $data[$i]['data'] = $details->item($i)->nodeValue;
    echo $data[$i]['data'];
}
print '</pre>';
?>
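To make the goal concrete, here is a rough sketch of the kind of result I'm after. It assumes each contact is one table row and each field is one td cell, so an empty cell becomes an empty string, and that the web address is the href of a link inside its cell. The sample markup, the cell order and the `parseContacts` name are made up for illustration; they are not taken from the real files.

```php
<?php
// Hypothetical helper: returns one array per contact row, one entry per cell.
function parseContacts(string $html): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($dom);

    $contacts = array();
    // Rows that contain at least one element whose class mentions "details".
    foreach ($xpath->query("//tr[td//*[contains(@class, 'details')]]") as $row) {
        $fields = array();
        // Visit every cell, populated or not, so blanks keep their position.
        foreach ($xpath->query('td', $row) as $cell) {
            $link = $xpath->query('.//a[@href]', $cell)->item(0);
            $fields[] = $link
                ? $link->getAttribute('href')   // web address from the link
                : trim($cell->textContent);     // '' for an empty cell
        }
        $contacts[] = $fields;
    }
    return $contacts;
}

// Made-up sample row (assumption: not the real document structure).
$sample = <<<HTML
<table>
<tr>
  <td><font class="details">Indigo Blue 123</font></td>
  <td><font class="details">123 Blue House</font></td>
  <td></td>
  <td></td>
  <td><font class="details">Hanley</font></td>
  <td><a href="http://www.indigoblue123.org.uk">www.indigoblue123.org.uk</a></td>
</tr>
</table>
HTML;

print_r(parseContacts($sample));
```

The point is that looping over every td in a row, rather than only the cells matching class=details, keeps the blank Address 2 and Address 3 slots in place. One caveat I can see: trim() will not remove a non-breaking space, so cells that are "empty" via &nbsp; would need extra handling.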
Any help would be great!
Thanks