How to turn HTML4 into HTML5
Imagine that you have a web page that has a stock price buried inside a table cell, like this:
<table id="stockPrices">
<tr>
<td class='name' width='250'><b>Stock:</b> ABC</td>
<td class='price' width='200'><b>Price:</b> $123.00</td>
</tr>
<tr>
<td class='name' width='250'><b>Stock:</b> XYZ</td>
<td class='price' width='200'><b>Price:</b> $100.00</td>
</tr>
</table>
A good way to pull those values out is to use a free PHP library called Simple HTML DOM, found here:
http://simplehtmldom.sourceforge.net/
It makes it extremely easy to process an HTML page and extract data from it. The library parses an HTML page into an object and gives you advanced search commands, so you can look for HTML tags that match certain criteria and then extract their contents in a variety of ways. For the example above, I might write something like this:
<?php
// Include the library
require("simple_html_dom.php");

// Parse the page with the Simple HTML DOM shortcut file_get_html()
$dom = file_get_html("http://www.fakestockprices.com/some_page.html");

// NOTE: There is also one for str_get_html if you already have
// the HTML in a string variable: $dom = str_get_html("<html>...</html>");

// Find all <TR> tags inside any table with the ID of "stockPrices":
$TRs = $dom->find("table[id=stockPrices] tr");

// Now loop through and grab our values:
foreach($TRs as $TR)
{
    // children(0) is the first TD inside the TR, while children(1) would be the second TD and so on...
    $stockNameTD = $TR->children(0);
    $stockPriceTD = $TR->children(1);

    // plaintext gives us the content without any HTML formatting
    $stockName = $stockNameTD->plaintext;
    $stockPrice = $stockPriceTD->plaintext;

    // You could also chain the commands together like this:
    $stockName = $TR->children(0)->plaintext;
    $stockPrice = $TR->children(1)->plaintext;
}
?>
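The same extraction can be tried without fetching a remote page. The sketch below is a minimal example, assuming simple_html_dom.php sits next to the script. It feeds the sample table to str_get_html(), checks that parsing succeeded, and collects the values into an array. The helper name extractStockPrices() and the label-stripping step are illustrative assumptions, not part of the library.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once "simple_html_dom.php";

// Hypothetical helper: returns a list of ["name" => ..., "price" => ...] rows.
function extractStockPrices($html)
{
    $rows = array();

    $dom = str_get_html($html);
    if (!$dom) {
        return $rows; // parsing failed, e.g. because the input was empty
    }

    foreach ($dom->find("table[id=stockPrices] tr") as $TR) {
        $nameTD  = $TR->children(0);
        $priceTD = $TR->children(1);
        if (!$nameTD || !$priceTD) {
            continue; // skip rows that do not have two cells
        }

        // plaintext still includes the "Stock:" / "Price:" labels, so strip them
        $rows[] = array(
            "name"  => trim(str_replace("Stock:", "", $nameTD->plaintext)),
            "price" => trim(str_replace("Price:", "", $priceTD->plaintext)),
        );
    }

    $dom->clear(); // release the memory held by the parsed document
    return $rows;
}

// The sample table from above, in a nowdoc so that "$" is taken literally
$html = <<<'HTML'
<table id="stockPrices">
<tr>
<td class='name' width='250'><b>Stock:</b> ABC</td>
<td class='price' width='200'><b>Price:</b> $123.00</td>
</tr>
<tr>
<td class='name' width='250'><b>Stock:</b> XYZ</td>
<td class='price' width='200'><b>Price:</b> $100.00</td>
</tr>
</table>
HTML;

$rows = extractStockPrices($html);
print_r($rows);
```

Returning an empty array on a failed parse keeps the calling code simple, since it can loop over the result without a separate error branch.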
// Find every <TD> in the page
$TDs = $dom->find('td');
// Find every <DIV> that also has class='someClass' as an attribute
$ClassyDIVs = $dom->find('div.someClass');
// Same thing as above, but the longer, more generic way
$ClassyDIVs = $dom->find('div[class=someClass]');
// Find every <IMG> with a width of 200
$AttrIMGs = $dom->find('img[width=200]');
// Find every <A> inside of a <SPAN>
$SpanLinks = $dom->find("span a");
// Find every <DIV> that also has class='someClass' as an attribute
$ClassyDIVs = $dom->find('div.someClass');
// Get the second matching element
$SecondClassyDIV = $ClassyDIVs[1];
// Find the second <DIV> that also has class='someClass' as an attribute
$SecondClassyDIV = $dom->find('div.someClass',1);
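When an index is passed, find() returns a single element instead of an array. As far as I can tell from the library source, it returns null when nothing matches at that position, so it is worth checking the result before using it. A minimal sketch, assuming simple_html_dom.php is available locally and using made-up markup:

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once "simple_html_dom.php";

// Made-up markup with exactly two matching DIVs
$dom = str_get_html(
    '<div class="someClass">one</div><div class="someClass">two</div>'
);

// Index 1 exists (second match); index 2 does not (there is no third match)
$second = $dom->find('div.someClass', 1);
$third  = $dom->find('div.someClass', 2);

// Guard before dereferencing, since a missing match is not an element object
$secondText = $second ? $second->plaintext : '(no match)';
$thirdText  = $third ? $third->plaintext : '(no match)';

echo $secondText . "\n";
echo $thirdText . "\n";
```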
If you want to access a specific attribute of an element, it is available as a property of the element object:
// Find every <A> inside of a <SPAN>
$SpanLinks = $dom->find("span a");
// Show all the HREFs
foreach($SpanLinks as $SpanLink)
{
    echo $SpanLink->href . "\n";
}
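Not every matched element necessarily carries the attribute you are after. For instance, an <A> tag used as a named anchor has no href. The sketch below assumes that isset() on an element property reports whether the attribute is present, which matches my reading of the library source but is worth verifying against your version. The markup is made up for illustration.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once "simple_html_dom.php";

// Made-up markup: one real link and one named anchor without an href
$dom = str_get_html(
    '<span><a href="/first.html">First</a> <a name="anchor">Anchor only</a></span>'
);

$hrefs = array();
foreach ($dom->find("span a") as $SpanLink) {
    // Only collect links that actually have an href attribute
    if (isset($SpanLink->href)) {
        $hrefs[] = $SpanLink->href;
    }
}

print_r($hrefs);
```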
// Find all images and update their src
$IMGs = $dom->find('img');
foreach($IMGs as $IMG)
{
    $IMG->src = "/some_image.jpg";
}
// Generate and display the modified HTML document
echo $dom;
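If the modified document is needed as a string rather than echoed directly, the library's save() method can be used. The sketch below is a minimal example under the same assumption that simple_html_dom.php is available locally. The markup and the image paths are made up.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once "simple_html_dom.php";

// Made-up markup with two images
$dom = str_get_html('<p><img src="/old_a.png"><img src="/old_b.png"></p>');

// Point every image at the same replacement file
foreach ($dom->find('img') as $IMG) {
    $IMG->src = "/some_image.jpg";
}

// With no argument, save() returns the modified HTML as a string
$modified = $dom->save();

echo $modified . "\n";
```

Passing a file path to save() should also write the result to disk, which is convenient when the rewritten page is meant to be stored rather than served immediately.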
Translated from: https://www.experts-exchange.com/articles/10277/HTML-Manipulation-Made-Easy.html