PHPHtmlParser is a simple, flexible HTML parser that lets you select tags using any CSS selector, like jQuery. The goal is to assist in the development of tools that need a quick, easy way to scrape HTML, whether it's valid or not!
This package can be found on [Packagist](https://packagist.org/packages/paquettg/php-html-parser) and is best loaded using [Composer](http://getcomposer.org/). We support PHP 7.2, 7.3, and 7.4.

-## Basic Usage
+Basic Usage
+-----

You can find many examples of how to use the DOM parser and any of its parts (which you will most likely never touch) in the tests directory. The tests are done using PHPUnit and are very small, a few lines each, and are a great place to start. Given that, I'll still be showing a few examples of how the package should be used. The following example is a very simplistic usage of the package.
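
A minimal sketch of such a basic usage, assuming a Composer install and the `loadStr` method on the `Dom` object; the HTML string is only illustrative:

```php
// Assuming you installed from Composer:
require "vendor/autoload.php";

use PHPHtmlParser\Dom;

$dom = new Dom;
// Illustrative HTML string (an assumption; any markup works here).
$dom->loadStr('<div class="all"><p>Hey bro, <a href="google.com">click here</a><br /> :)</p></div>');

// find() accepts a CSS selector; take the first matching anchor.
$a = $dom->find('a')[0];
echo $a->text;
```
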
The above will output "click here". Simple, no? There are many ways to get the same result from the DOM, such as `$dom->getElementsByTag('a')[0]` or `$dom->find('a', 0)`, all of which can be found in the tests or in the code itself.

-## Support PHP Html Parser Financially
+Support PHP Html Parser Financially
+--------------

Get supported PHP Html Parser and help fund the project with the [Tidelift Subscription](https://tidelift.com/subscription/pkg/packagist-paquettg-php-html-parser?utm_source=packagist-paquettg-php-html-parser&utm_medium=referral&utm_campaign=enterprise).

Tidelift delivers commercial support and maintenance for the open source dependencies you use to build your applications. Save time, reduce risk, and improve code health, while paying the maintainers of the exact dependencies you use.

-## Loading Files
+Loading Files
+------------------

You may also seamlessly load a file into the DOM instead of a string, which is much more convenient and is how I expect most developers will be loading the HTML. The following example is taken from our test and uses the "big.html" file found there.

@@ -57,7 +62,7 @@ foreach ($contents as $content)
{
    // get the class attr
    $class = $content->getAttribute('class');
-
+
    // do something with the html
    $html = $content->innerHtml;

@@ -69,9 +74,10 @@ foreach ($contents as $content)

This example loads the html from big.html, a real page found online, and gets all the content-border classes to process. It also shows a few things you can do with a node but it is not an exhaustive list of the methods that a node has available.

-## Loading URLs
+Loading URLs
+----------------

-Loading a URL is very similar to the way you would load the HTML from a file.
+Loading a URL is very similar to the way you would load the HTML from a file.

```php
// Assuming you installed from Composer:
@@ -102,7 +108,8 @@ $html = $dom->outerHtml;

As long as the client object implements the interface properly, it will use that object to get the content of the url.
You can also set parsing options that affect the behavior of the parsing engine. You can set a global options array using the `setOptions` method on the `Dom` object, or an instance-specific option by passing it to the `load` method as an extra (optional) parameter.
@@ -133,7 +141,7 @@ $dom->setOptions(
    ->setStrict(true)
);

-$dom->loadFromUrl('http://google.com',
+$dom->loadFromUrl('http://google.com',
    (new Options())->setWhitespaceTextNode(false) // only applies to this load.
);

@@ -190,7 +198,8 @@ This option contains an array of all self closing tags. These tags must be self
This option contains an array of all tags that cannot be self-closing. The list starts off empty, but you can add elements as you wish.

-## Static Facade
+Static Facade
+-------------

You can also mount a static facade for the Dom object.
The above PHP block performs the same load and find as the first example, but it does so through the static facade, which supports all public methods found in the Dom object.

-## Modifying The Dom
+Modifying The Dom
+-----------------

You can always modify the dom that was created from any loading method. To change the attribute of any node you can just call the `setAttribute` method.
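
As a minimal sketch, assuming a Composer install and an illustrative HTML string loaded with `loadStr`, changing an attribute could look like this:

```php
// Assuming you installed from Composer:
require "vendor/autoload.php";

use PHPHtmlParser\Dom;

$dom = new Dom;
// Illustrative HTML string (an assumption; any markup works here).
$dom->loadStr('<div class="all"><p>Hey bro, <a href="google.com">click here</a></p></div>');

$a = $dom->find('a')[0];
// Add (or overwrite) the class attribute on the anchor node.
$a->setAttribute('class', 'foo');
echo $a->getAttribute('class');
```
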