Commit 27a0457

init
0 parents  commit 27a0457

41 files changed

+1011
-0
lines changed

.github/workflows/test.yml

+16
@@ -0,0 +1,16 @@
1+
name: Test
2+
on:
3+
  push:
4+
    branches:
5+
      - '**'
6+
7+
  pull_request:
8+
9+
jobs:
10+
  linux:
11+
    runs-on: ubuntu-latest
12+
13+
    steps:
14+
      - uses: actions/setup-node@v1
15+
16+
      - run: npx markdownlint-cli '**/*.md'

.gitignore

+1
@@ -0,0 +1 @@
1+
tmp/

.markdownlint.json

+6
@@ -0,0 +1,6 @@
1+
{
2+
"default": true,
3+
"no-hard-tabs": false,
4+
"line-length": false,
5+
"first-line-heading": false
6+
}

.nojekyll

Whitespace-only changes.

LICENSE

+9
@@ -0,0 +1,9 @@
1+
The MIT License
2+
3+
Copyright 2020 Yad Smood
4+
5+
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
6+
7+
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
8+
9+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

README.md

+7
@@ -0,0 +1,7 @@
1+
# Rod
2+
3+
Rod is a high-level driver directly based on [DevTools Protocol](https://chromedevtools.github.io/devtools-protocol).
4+
It's widely used for web automation and scraping, such as clicking buttons on the page programmatically,
5+
autofilling inputs, getting images and text, etc.
6+
7+
[Next Chapter](get-started/README.md)

_navbar.md

+9
@@ -0,0 +1,9 @@
1+
2+
- [API](https://pkg.go.dev/github.com/go-rod/rod)
3+
- [Examples](https://github.com/go-rod/rod/#examples)
4+
- [FAQ](https://github.com/go-rod/rod#faq)
5+
- [Chat](https://discord.gg/CpevuvY)
6+
7+
- :earth_americas: Translations
8+
- [:uk: English](/)
9+
- [:cn: 中文](/zh-cn/)

_sidebar.md

+29
@@ -0,0 +1,29 @@
1+
2+
- Introduction
3+
4+
- [Get Started](get-started/README.md)
5+
- [Context & Timeout](context-and-timeout.md)
6+
- [Error Handling](error-handling.md)
7+
8+
- Guides
9+
10+
- [Selectors](selectors/README.md)
11+
- [Events](events.md)
12+
- [Input](input.md)
13+
- [Network](network.md)
14+
- [Emulation](emulation.md)
15+
- [Page Resources](page-resources.md)
16+
- [JavaScript Runtime](javascript-runtime.md)
17+
- [Page Pool](page-pool.md)
18+
- [Customize Launch](custom-launch.md)
19+
- [Customize WebSocket](customize-websocket.md)
20+
- [CSS Selector](css-selector.md)
21+
22+
- Resources
23+
24+
- [API Reference](https://pkg.go.dev/github.com/go-rod/rod)
25+
- [Code Examples](https://github.com/go-rod/rod/#examples)
26+
- [FAQ](https://github.com/go-rod/rod#faq)
27+
- [Chat Room](https://discord.gg/CpevuvY)
28+
- [Join Development](https://github.com/go-rod/rod/blob/master/.github/CONTRIBUTING.md)
29+
- [Source Code](https://github.com/go-rod/rod)

context-and-timeout.md

+82
@@ -0,0 +1,82 @@
1+
# Context and Timeout
2+
3+
In Golang, we usually use [context](https://golang.org/pkg/context/) to cancel blocking IO tasks.
4+
Because Rod uses WebSocket to talk to the browser, every control signal Rod sends to the browser
5+
is a blocking IO operation. Rod's use of context is not special; it follows the standard way.
6+
7+
## Cancellation
8+
9+
For example, the sample code below simply creates a blank page and navigates it to "github.com":
10+
11+
```go
12+
page := rod.New().MustConnect().MustPage("")
13+
page.MustNavigate("http://github.com")
14+
```
15+
16+
Now, suppose we want to cancel the `MustNavigate` if it takes more than 2 seconds.
17+
In Rod we can do something like this to achieve it:
18+
19+
```go
20+
page := rod.New().MustConnect().MustPage("")
21+
22+
ctx, cancel := context.WithCancel(context.Background())
23+
pageWithCancel := page.Context(ctx)
24+
25+
go func() {
26+
time.Sleep(2 * time.Second)
27+
cancel()
28+
}()
29+
30+
pageWithCancel.MustNavigate("http://github.com")
31+
```
32+
33+
We use `page.Context` to create a shallow copy of `page`. Whenever we call `cancel`, the operations
34+
on `pageWithCancel` will be canceled. This applies to any operation, not just `MustNavigate`.
35+
36+
This is not specific to Rod; you can find similar APIs like [this one](https://golang.org/pkg/net/http/#Request.WithContext) in the standard library.
37+
38+
Because `pageWithCancel` is not `page`, operations on `page` will not be affected by the cancellation:
39+
40+
```go
41+
...
42+
43+
pageWithCancel.MustNavigate("http://github.com") // will be canceled after 2 seconds
44+
page.MustNavigate("http://github.com") // won't be canceled after 2 seconds
45+
```
46+
47+
## Timeout
48+
49+
The code above is just one way to time out an operation. In Golang, a timeout is usually just a special case of cancellation.
50+
Because it's so useful, we created a helper called `Timeout` that does the same thing, so the code above can be reduced to:
51+
52+
```go
53+
page := rod.New().MustConnect().MustPage("")
54+
page.Timeout(2 * time.Second).MustNavigate("http://github.com")
55+
```
56+
57+
Here `page.Timeout(2 * time.Second)` plays the role of the previous `pageWithCancel`.
58+
Besides `Page`, `Browser` and `Element` also support the same context APIs.
59+
60+
## Detect timeout
61+
62+
How do we know whether an operation has timed out? In Golang, a timeout is usually reported as an error; this is not specific to Rod.
63+
For the code above we can do this to detect timeout:
64+
65+
```go
66+
page := rod.New().MustConnect().MustPage("")
67+
68+
err := rod.Try(func() {
69+
page.Timeout(2 * time.Second).MustNavigate("http://github.com")
70+
})
71+
if errors.Is(err, context.DeadlineExceeded) {
72+
// code for timeout error
73+
} else if err != nil {
74+
// code for other types of error
75+
}
76+
```
77+
78+
Here we use `rod.Try` to wrap the function that may throw a timeout error.
79+
80+
We will talk more about error handling at [Error Handling](error-handling.md).
81+
82+
[Next Chapter](error-handling.md)

css-selector.md

+9
@@ -0,0 +1,9 @@
1+
# CSS Selector
2+
3+
This guide is designed specifically for web automation; we don't need to learn all the concepts of CSS selectors.
4+
It covers the minimum knowledge needed to get the job done.
5+
6+
## Requirements
7+
8+
Make sure you have basic knowledge of HTML. If not, please take some time to learn it at:
9+
[Getting started with HTML](https://developer.mozilla.org/en-US/docs/Learn/HTML/Introduction_to_HTML/Getting_started)

custom-launch.md

+11
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,11 @@
1+
# Custom Launch
2+
3+
## Docker
4+
5+
Check [here](https://github.com/go-rod/rod#q-how-to-use-rod-with-docker-so-that-i-dont-have-to-install-a-browser)
6+
7+
## Custom executable runner
8+
9+
Check this [example](https://github.com/go-rod/rod/blob/5e2a019449e9703c2b5227ef9821811c8e88cb33/lib/launcher/example_test.go#L11)
10+
11+
[Next Chapter](/customize-websocket.md)

customize-websocket.md

+58
@@ -0,0 +1,58 @@
1+
# Customize the WebSocket
2+
3+
Customizing the WebSocket is useful when you want to proxy the transport layer or tune performance.
4+
Here we use `github.com/gorilla/websocket` as an example, but you can wrap any library you like.
5+
6+
```go
7+
package main
8+
9+
import (
10+
"context"
11+
"fmt"
12+
"net/http"
13+
14+
"github.com/go-rod/rod"
15+
"github.com/go-rod/rod/lib/cdp"
16+
"github.com/go-rod/rod/lib/launcher"
17+
"github.com/gorilla/websocket"
18+
)
19+
20+
func main() {
21+
u := launcher.New().MustLaunch()
22+
23+
// Use a custom websocket lib as the transport layer for JSON-RPC
24+
client := cdp.New(u).Websocket(&MyWebSocket{})
25+
26+
p := rod.New().Client(client).MustConnect().MustPage("http://example.com")
27+
28+
fmt.Println(p.MustInfo().Title)
29+
}
30+
31+
// MyWebSocket implements the cdp.WebSocketable interface
32+
var _ cdp.WebSocketable = &MyWebSocket{}
33+
34+
type MyWebSocket struct {
35+
conn *websocket.Conn
36+
}
37+
38+
func (ws *MyWebSocket) Connect(ctx context.Context, url string, header http.Header) error {
39+
dialer := *websocket.DefaultDialer
40+
dialer.WriteBufferSize = 2 * 1024 * 1024 // 2MB
41+
42+
conn, _, err := dialer.DialContext(ctx, url, header)
43+
ws.conn = conn
44+
45+
return err
46+
}
47+
48+
func (ws *MyWebSocket) Send(b []byte) error {
49+
return ws.conn.WriteMessage(websocket.TextMessage, b)
50+
}
51+
52+
func (ws *MyWebSocket) Read() ([]byte, error) {
53+
_, data, err := ws.conn.ReadMessage()
54+
return data, err
55+
}
56+
```
57+
58+
[Next Chapter](/css-selector.md)

emulation.md

+5
@@ -0,0 +1,5 @@
1+
# Emulation
2+
3+
Check the [test](https://github.com/go-rod/rod/blob/5e2a019449e9703c2b5227ef9821811c8e88cb33/page_test.go#L191)
4+
5+
[Next Chapter](/page-resources.md)

error-handling.md

+79
@@ -0,0 +1,79 @@
1+
# Error Handling
2+
3+
In the previous chapters, we have seen a lot of `Must`-prefixed methods like `MustNavigate`, `MustElement`, etc.
4+
They all have non-prefixed versions like `Navigate`, `Element`, etc. The main difference between them is how
5+
they handle errors. This is not specific to Rod; the standard library uses the same convention, for example [regexp.MustCompile](https://golang.org/pkg/regexp/#MustCompile).
6+
7+
The methods like `MustNavigate` and `MustElement` are commonly used in example code or quick scripting.
8+
They are useful for jobs like smoke testing, site monitoring, end-to-end testing, etc.
9+
If you want to code for jobs with lots of uncertainty, such as web scraping,
10+
the non-prefixed version will be a better choice.
11+
12+
The prefixed version is just the non-prefixed version wrapped with an error checker.
13+
Here's the source code of `MustElement`; as you can see, it just calls `Element` and panics on error:
14+
15+
```go
16+
func (p *Page) MustElement(selectors ...string) *Element {
17+
el, err := p.Element(selectors...)
18+
if err != nil {
19+
panic(err)
20+
}
21+
return el
22+
}
23+
```
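
The standard library's `regexp` package follows the same convention, which makes for a small browser-free illustration. In this sketch, `matchDigits` is a hypothetical helper written for the example; `regexp.Compile` and `regexp.MustCompile` are the real standard-library pair.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchDigits is a hypothetical helper that uses the non-prefixed Compile,
// so an invalid pattern is returned as an error instead of causing a panic.
func matchDigits(pattern, s string) (bool, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return false, err
	}
	return re.MatchString(s), nil
}

func main() {
	// MustCompile suits patterns known to be valid; it panics otherwise.
	re := regexp.MustCompile(`^\d+$`)
	fmt.Println(re.MatchString("123")) // true

	// Compile suits uncertain input: the caller decides how to react.
	_, err := matchDigits(`(`, "123")
	fmt.Println(err != nil) // true
}
```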
24+
25+
## Get the error value
26+
27+
For example, the two code blocks below are basically doing the same thing in two styles.
28+
29+
The style below usually results in less code, but it may also catch extra errors:
30+
31+
```go
32+
page := rod.New().MustConnect().MustPage("https://example.com")
33+
34+
err := rod.Try(func() {
35+
fmt.Println(page.MustElement("a").MustHTML())
36+
})
37+
handleError(err)
38+
```
39+
40+
We use the `rod.Try` function to catch the error from the prefixed methods.
41+
42+
The style below is the standard way to handle errors. Usually, it's more consistent and precise:
43+
44+
```go
45+
page := rod.New().MustConnect().MustPage("https://example.com")
46+
47+
el, err := page.Element("a")
48+
if err != nil {
49+
handleError(err)
50+
return
51+
}
52+
html, err := el.HTML()
53+
if err != nil {
54+
handleError(err)
55+
return
56+
}
57+
fmt.Println(html)
58+
```
59+
60+
## Check the error type
61+
62+
We use Go's standard way to check error types, no magic.
63+
64+
The `handleError` in the above code may look like:
65+
66+
```go
67+
func handleError(err error) {
68+
var evalErr *rod.ErrEval
69+
if errors.Is(err, context.DeadlineExceeded) { // timeout error
70+
fmt.Println("timeout err")
71+
} else if errors.As(err, &evalErr) { // eval error
72+
fmt.Println(evalErr.LineNumber)
73+
} else if err != nil {
74+
fmt.Println("can't handle", err)
75+
}
76+
}
77+
```
78+
79+
[Next Chapter](selectors/README.md)
