simple html dom attributes: PHP Simple HTML DOM Parser

I am very happy to announce the second release candidate for the next major version of simplehtmldom. It brings very important bug fixes, performance improvements and a few new features.

Important: This is a release candidate, which means some features might not yet be stable or might behave unexpectedly. Please don't hesitate to report broken or unstable features.

Here are the most notable changes: ...

Posted on 2019-11-09

This has been requested many times and now it's here. The new composer package is available for current master:

composer require simplehtmldom/simplehtmldom dev-master

require_once 'vendor/autoload.php';

use simplehtmldom\HtmlWeb;

echo (new HtmlWeb())->load('https://google.com/')->find('title', 0)->plaintext;

Unfortunately, it doesn't seem possible to automate the package with SourceForge, so I connected it to the GitHub fork instead.

Posted on 2019-10-20

I am happy to announce the first release candidate for the next major version of the parser. It brings exciting new features and performance improvements.

Important: This is a release candidate, which means some features might not yet be stable or might behave unexpectedly. Please don't hesitate to report broken or unstable features.

Here are the most notable changes:

Missing optional end tags are now handled more efficiently. This results in much faster seek operations, especially on large documents. A performance boost of 10x or higher compared to version 1.9 is possible (when working with many missing end tags). ...

Posted on 2019-10-20

I'm happy to announce the release of PHP Simple HTML DOM Parser 1.9!

This release is focused on bug fixes and updates to the manuals but also brings a few new functions.

Please note that this will be the last 1.x release (except for bug fixes maybe). More details will be made available in the future.

Most notable changes in this version: ...

Posted on 2019-05-30

Great news to anyone who aims for secure data transmission!

The project page at http://simplehtmldom.sourceforge.net now redirects to https://simplehtmldom.sourceforge.io, which is much more secure (using HTTPS) and reliable (PHP 7.x) than the "old" server (HTTP + PHP 5.4)!

But there is more!

For the past weeks I've been working on updating the existing documentation.

It is not yet available on the main page, but you can take a look at https://simplehtmldom.sourceforge.io/docs...

Posted on 2019-04-16

Important: Version 1.8 was replaced by 1.8.1 to fix critical bugs.

This release introduces lots of bug fixes and adds support for many exciting CSS features we have been longing for!

Most notable changes:

Universal selectors (*) now work as expected.

All primary CSS features are now supported with the addition of these:

CSS combinators (>, +, ~)

Attribute selectors (|=, ~=)

Multiclass selectors (.class.class.class)

Multiattribute selectors ([attr1][attr2][attribute3])

Case sensitivity selectors in attributes (i and s) ...
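The selector additions listed above can be tried out with a short script. This is a minimal sketch, not taken from the release notes: it assumes that simple_html_dom.php from release 1.8.1 or later sits in the same directory, and the markup and variable names are made up for illustration.

```php
<?php
// Assumption: simple_html_dom.php (1.8.1 or later) is in the same directory.
require_once 'simple_html_dom.php';

// Hypothetical markup, used only for this illustration.
$html = str_get_html(
    '<ul id="menu">'
    . '<li class="item first" lang="en-US"><a href="/a" title="A">A</a></li>'
    . '<li class="item" lang="de"><a href="/b">B</a></li>'
    . '</ul>'
);

// Child combinator (>): <li> elements that are direct children of #menu.
$items = $html->find('#menu > li');

// Adjacent sibling combinator (+): the <li> directly after li.first.
$second = $html->find('li.first + li', 0);

// Multiclass selector: elements that carry both classes.
$first = $html->find('li.item.first', 0);

// |= attribute selector: lang is exactly "en" or starts with "en-".
$english = $html->find('li[lang|=en]');

// Multiattribute selector: <a> elements that have both href and title.
$titled = $html->find('a[href][title]');

echo count($items), ' items, ', count($english), ' English, ',
    count($titled), ' with title', PHP_EOL;
```

Passing an index as the second argument to find() returns a single element (or null when nothing matches) instead of an array, so it is worth checking for null before reading properties from the result.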

Posted on 2019-01-13

This release introduces bug fixes to the DOM parser and, most importantly, makes the project compatible with PHP 7.3, the most recent PHP release, for which compatibility issues had been reported!

Most notable changes:

Compatible with PHP 7.3+

Major performance improvement of about 30% for the parser alone!

Improved handling of void tags and optional closing tags

Lots of bug fixes for the parser (selectors will be targeted in the next release)

Unit tests for reported bugs, using PHPUnit (so you can perform tests on your own fork now!) ...

Posted on 2018-12-10

This step is necessary for getting back on track with new releases. The upcoming release 1.7 will be made available shortly!

Posted on 2018-12-10

Find the Git repository on the Repository tab.

With the Git repository you can fork the project, browse the commit history and open merge requests! ...

Posted on 2018-11-26

I'd love to find someone to revamp the simplehtmldom.sourceforge.net "help"/"manual" HTML pages. I hate the current look, and would love to see a far more readable, easy-to-follow set of pages. Once the look and structure are overhauled, I have a number of currently undocumented features that I can add to the documentation.

I'm not looking for an ongoing commitment to this project, merely an overhaul of the set of HTML pages that are the "manual" for the project. ...

Posted on 2012-10-16

I've migrated to the new SourceForge project format. It doesn't appear that anything was lost, and the documentation homepage has stayed the same. I'm planning on making a pass through the documentation sometime soon to bring it up to date. If anyone wants to help make the formatting of those pages nicer, I'd be happy to take some help. Email me at John_Schlick@hotmail.com

I've also changed the debugging code inside of simple_html_dom to support the SourceForge debugobject project (download it at https://sourceforge.net/projects/debugobject/, it's cool!). ...

Posted on 2012-10-10

I have added a lot of little features and enhancements over the last year.

A number of internal issues have been ironed out, and a few new features have been added (the ability to search for specific text inside of a tag, the ability to discover the original display size of an IMG tag, and a few other little things).

Please download the code from the repository, as that's ALWAYS the most current.

Many thanks to the person who emailed me the very comprehensive list of changes to support alternate character sets. The ->plaintext output is MUCH better now.

John.

Posted on 2012-04-20

Sourceforge just allowed me to take over the project. As such, I have updated the source that I have spent the last year working on.

Memory leak is fixed.

simple_html_dom now detects the character set.

plaintext looks better since it understands more about newlines in html and what things ought to look like.

All changes are fully configurable.

Many more little changes. Docs to come over the next week or two.

Posted on 2011-07-14

Supports xpath generated from Firebug.

New method "dump" of "simple_html_dom_node".

New attribute "xmltext" of "simple_html_dom_node".

Removed preg_quote from the selector match function: [attribute*=value].

"Comment" elements are now treated as children.

Fixed the problem with

.

Fixed bug #2207477 (does not load some pages properly).

Fixed bug #2315853 (Error with character after < sign).

Posted on 2008-12-18

The "find" method now supports negative indexes, thanks to Vadim Voituk.

The constructor can now automatically load contents from either text or a file/URL, thanks to Antcs.

Fully supports wildcards in selectors.

Fixed the bug where the parser was confused by a < symbol inside text.

Fixed the bug with dashes in selectors.

Fixed bug of .

Fixed bug #2155883 (Nested List Parses Incorrectly).

Fixed bug #2155113 (error with unclosed html tags).
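The negative index support mentioned above counts from the end of the match list. Here is a minimal sketch, assuming simple_html_dom.php from a 1.x release is in the same directory; the list markup is made up for illustration.

```php
<?php
// Assumption: simple_html_dom.php (a 1.x release) is in the same directory.
require_once 'simple_html_dom.php';

// Hypothetical markup, used only for this illustration.
$html = str_get_html('<ul><li>one</li><li>two</li><li>three</li></ul>');

// Index -1 selects the last match, -2 the one before it.
$last = $html->find('li', -1);
$secondToLast = $html->find('li', -2);

// find() with an index returns null when there is no such match.
if ($last !== null && $secondToLast !== null) {
    echo $secondToLast->plaintext, ', ', $last->plaintext, PHP_EOL;
}
```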

Posted on 2008-10-25

New method "getAllAttributes" of "simple_html_dom_node".

Fixed a selector bug that occurred under some critical conditions.

Fixed the bug with stripping PHP tags.

Fixed the bug in remove_noise().

Fixed the bug with noise in attributes.

Supports full JavaScript strings in selectors: $e->find("a[onclick=alert('hello')]").

Changed the selector filter "*=" to be case-insensitive.

Posted on 2008-09-05

Performance tuning (10% boost).

Memory requirements reduced by 25%.

Changed the function name from "file_get_dom()" to "file_get_html()".

Changed the function name from "str_get_dom()" to "str_get_html()".

Fixed bug #2011286 (Error with unclosed html tags).

Fixed bug #2012551 (Error parsing divs).

Fixed bug #2020924 (Error for missed tag.).

Fixed bug (problem with

tag's innertext).

Posted on 2008-08-03

Performance tuning (20% boost).

Supports the "multiple class" selector feature:

New "callback function" feature.

New "multiple selectors" feature: $dom->find('p,a,b');

New examples.

Supports extracting text content from HTML: $dom->plaintext;

Fixed the bug in $dom->clear().

Fixed the bug with text nodes' innertext.

Fixed the bug with comment nodes' innertext.

Fixed the bug with descendant selectors and optional tags.

Changed the simple_html_dom_node method name from "text()" to "makeup()".
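The "multiple selectors" and plaintext features from this list can be combined in a few lines. This is a minimal sketch, assuming simple_html_dom.php from a 1.x release is in the same directory; the markup is made up, and str_get_html() is the function name used by later releases.

```php
<?php
// Assumption: simple_html_dom.php (a 1.x release) is in the same directory.
require_once 'simple_html_dom.php';

// Hypothetical markup, used only for this illustration.
$dom = str_get_html(
    '<div><p>para</p><a href="#">link</a><b>bold</b><i>italic</i></div>'
);

// A comma-separated selector list matches any of the listed tags.
$matches = $dom->find('p,a,b');

foreach ($matches as $node) {
    // tag holds the element name, plaintext its text without markup.
    echo $node->tag, ': ', $node->plaintext, PHP_EOL;
}

// plaintext on the document itself extracts the text of the whole page.
echo $dom->plaintext, PHP_EOL;
```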

Posted on 2008-06-24

Important: the file and class names changed (html_dom_parser -> simple_html_dom)!

Important: $dom->save_file is no longer supported.

New node type "comment" (e.g. $dom->find('comment')).

Added self-closing tags: 'base', 'spacer'.

Fixed the outertext bug (th).

Fixed the bug with escaping regular expression characters ($dom->find).

Fixed the bug with line breaks and "\t" in tags.

Removed the example "example_customize_parser.php".

New example "simple_html_dom_utility.php".

Posted on 2008-05-09

(Request #1936000) New DOM operations (first_child, last_child, next_sibling, previous_sibling).

New method to remove an attribute.

Added a solution for servers behind a proxy to the FAQ (thanks to Yousuke Shaggy).

Added a traverse section to the manual.

file_get_dom now supports the full set of file_get_contents parameters.

Fixed the bug with self-closing tags at the end of the file.

Fixed the bug with blanks at the end of a tag.

Added a Reference section to the manual.
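The traversal operations named in this entry can be sketched as follows. This is an illustration under assumptions, not the original release code: it assumes simple_html_dom.php from a later 1.x release is in the same directory, and it uses the method name prev_sibling(), which is how later releases expose what the entry calls previous_sibling. The markup is made up.

```php
<?php
// Assumption: simple_html_dom.php (a later 1.x release) is in the same directory.
require_once 'simple_html_dom.php';

// Hypothetical markup, used only for this illustration.
$html = str_get_html('<ul id="list"><li>a</li><li>b</li><li>c</li></ul>');

$ul = $html->find('#list', 0);

// Walk down to the first and last child elements of the list.
$first = $ul->first_child();
$last = $ul->last_child();

// Step sideways between siblings.
$middle = $first->next_sibling();
$sameMiddle = $last->prev_sibling(); // assumed name, see the note above

echo $first->plaintext, ' ', $middle->plaintext, ' ', $last->plaintext, PHP_EOL;
```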

Posted on 2008-04-27

New attribute filters (Thanks to Yousuke Kumakura).

Fixed the bug with optional closing tags.

Fixed the bug when parsing a line break next to the tag's name.

Supports tag names with namespaces.

Posted on 2008-04-13

Stopped the infinite loop that occurred when the source content is bad HTML.

Fixed the bug when adding new attributes to self-closing tags.

Fixed the bug when customizing the parser without $dom->remove_noise();

Added a FAQ section to the manual.

Posted on 2008-04-06

Fix the bug of parsing end-tag.

Fix the bug of endless "

Fixed the bug in the "remove_noise" method when stripping out tags.

Modified "example_customize_parser.php" to use better regular expressions.

Added some guidelines for parser customization.

Posted on 2008-03-31

Fix the bug of

New feature: "plaintext" attribute to scrape pure text.

Three scraping examples.

Posted on 2008-03-25
