NavigationOptions for recaptcha solving
MatthewZMSU committed Jan 30, 2025
1 parent f41d5fa commit 237d32d
Showing 4 changed files with 23 additions and 12 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -80,7 +80,7 @@ Here is the list of available actions:
- `Screenshot(options)` - take screenshot
- `Har()` - to get the HAR file, pass the `har_recording=True` argument to `PuppeteerRequest` at the start of execution.
- `FillForm(input_mapping, submit_button)` - to fill out and submit forms on page.
- `RecaptchaSolver(solve_recaptcha)` - find or solve recaptcha on page
- `RecaptchaSolver(solve_recaptcha, close_on_empty, options)` - find or solve recaptcha on page
- `CustomJsAction(js_function)` - evaluate JS function on page

Available options essentially mirror [service](https://github.com/ispras/scrapy-puppeteer-service) method parameters, which in turn mirror puppeteer API functions to some extent.
27 changes: 17 additions & 10 deletions scrapypuppeteer/actions.py
@@ -282,33 +282,40 @@ class RecaptchaSolver(PuppeteerServiceAction):
will happen to your 2captcha balance.
Then it solves recaptcha with 2captcha service and inserts the special code
into the page automatically.
Note that it does not click buttons like "submit buttons".
Note that it does not click buttons like "submit" buttons.
Params:
solve_recaptcha - bool = True: enables automatic solving of recaptcha on the page if found.
:param bool solve_recaptcha: (default = True) enables automatic solving of recaptcha on the page if found.
If false is provided recaptcha will still be detected on the page but not solved.
You can get info about found recaptchas via return value.
close_on_empty - bool = False: whether to close page or not if there was no captcha on the page.
:param bool close_on_empty: (default = False) whether to close the page if there was no captcha on the page.
Response for this action is PuppeteerJsonResponse. You can get the return values
via self.data['recaptcha_data'].
You can visit
https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra-plugin-recaptcha#result-object
to get information about return value.
:param dict navigation_options: Navigation options, same as GoTo action.
:param dict wait_options: Options specifying wait after navigation, same as GoTo action.
Response for this action is PuppeteerRecaptchaSolverResponse.
"""

endpoint = "recaptcha_solver"

def __init__(
self, solve_recaptcha: bool = True, close_on_empty: bool = False, **kwargs
self,
solve_recaptcha: bool = True,
close_on_empty: bool = False,
navigation_options: dict = None,
wait_options: dict = None,
**kwargs
):
self.solve_recaptcha = solve_recaptcha
self.close_on_empty = close_on_empty
self.navigation_options = navigation_options
self.wait_options = wait_options

def payload(self):
return {
"solve_recaptcha": self.solve_recaptcha,
"close_on_empty": self.close_on_empty,
"navigationOptions": self.navigation_options,
"waitOptions": self.wait_options,
}
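
The new constructor arguments are passed through to the request payload under camelCase keys. A minimal sketch of that mapping, using a stand-in class (`RecaptchaSolverSketch` is hypothetical and only mirrors the fields shown in this diff; it is not the library class and does not talk to the service):

```python
from typing import Any, Dict, Optional


class RecaptchaSolverSketch:
    """Stand-in that mirrors the fields added in this commit (assumption)."""

    endpoint = "recaptcha_solver"

    def __init__(
        self,
        solve_recaptcha: bool = True,
        close_on_empty: bool = False,
        navigation_options: Optional[Dict[str, Any]] = None,
        wait_options: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.solve_recaptcha = solve_recaptcha
        self.close_on_empty = close_on_empty
        self.navigation_options = navigation_options
        self.wait_options = wait_options

    def payload(self) -> Dict[str, Any]:
        # snake_case attributes are sent as camelCase keys, as in the diff above
        return {
            "solve_recaptcha": self.solve_recaptcha,
            "close_on_empty": self.close_on_empty,
            "navigationOptions": self.navigation_options,
            "waitOptions": self.wait_options,
        }


# Same navigation option the middleware passes in this commit
action = RecaptchaSolverSketch(navigation_options={"waitUntil": "domcontentloaded"})
body = action.payload()
print(body)
```

Options that are left unset are sent as `None` in this sketch; how the service treats a missing value is not shown in the diff.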


3 changes: 2 additions & 1 deletion scrapypuppeteer/middleware.py
@@ -194,7 +194,7 @@ def from_crawler(cls, crawler: Crawler):
if isinstance(submit_selector, str):
submit_selectors[key] = Click(selector=submit_selector)
elif not isinstance(submit_selector, Click):
raise ValueError(
raise TypeError(
"Submit selector must be str or Click, "
f"but {type(submit_selector)} provided"
)
)
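
The switch from `ValueError` to `TypeError` fits the check being made: the value has the wrong type, not a bad value of the right type. A rough sketch of the normalization step, assuming a stand-in `ClickSketch` dataclass and a hypothetical helper `normalize_submit_selectors` (neither name comes from the library):

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class ClickSketch:
    """Stand-in for the Click action; only the selector field is modeled."""

    selector: str


def normalize_submit_selectors(raw: Dict[str, Any]) -> Dict[str, ClickSketch]:
    """Turn plain string selectors into click actions and reject other types."""
    normalized: Dict[str, ClickSketch] = {}
    for key, value in raw.items():
        if isinstance(value, str):
            normalized[key] = ClickSketch(selector=value)
        elif isinstance(value, ClickSketch):
            normalized[key] = value
        else:
            # Wrong type rather than wrong value, hence TypeError
            raise TypeError(
                f"Submit selector must be str or Click, but {type(value)} provided"
            )
    return normalized


selectors = normalize_submit_selectors({"login": "#submit"})
print(selectors["login"])
```

Callers that previously caught `ValueError` around this configuration step would need to catch `TypeError` instead.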
@@ -256,6 +256,7 @@ def _solve_recaptcha(self, request, response):
recaptcha_solver = RecaptchaSolver(
solve_recaptcha=self.recaptcha_solving,
close_on_empty=self.__is_closing(response, remove_request=False),
navigation_options={"waitUntil": "domcontentloaded"},
)
return response.follow(
recaptcha_solver,
3 changes: 3 additions & 0 deletions scrapypuppeteer/response.py
@@ -230,6 +230,9 @@ class PuppeteerRecaptchaSolverResponse(PuppeteerJsonResponse, PuppeteerHtmlRespo
Response for RecaptchaSolver.
Result is available via self.recaptcha_data and self.data["recaptcha_data"]
(deprecated, to be deleted in next versions) object.
You can visit
https://github.com/berstend/puppeteer-extra/tree/master/packages/puppeteer-extra-plugin-recaptcha#result-object
to get information about return value.
"""

attributes: Tuple[str, ...] = tuple(
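
The docstring above names two access paths for the solver result: the `recaptcha_data` attribute and the deprecated `data["recaptcha_data"]` key. A small sketch of reading the result through the attribute, using a stand-in response class (`RecaptchaResponseSketch` and `count_solved` are hypothetical, and the `captchas` and `solved` keys are assumed from the linked result-object description, not confirmed by this diff):

```python
from typing import Any, Dict


class RecaptchaResponseSketch:
    """Stand-in response exposing both access paths named in the docstring."""

    def __init__(self, recaptcha_data: Dict[str, Any]) -> None:
        self.recaptcha_data = recaptcha_data
        # Deprecated path, kept only for backward compatibility
        self.data = {"recaptcha_data": recaptcha_data}


def count_solved(response: RecaptchaResponseSketch) -> int:
    """Count solved captchas, preferring the attribute over the deprecated key."""
    result = response.recaptcha_data
    return len(result.get("solved", []))


# Assumed result shape, for illustration only
response = RecaptchaResponseSketch(
    {"captchas": [{"id": "abc"}], "solved": [{"id": "abc"}]}
)
print(count_solved(response))
```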