-
Crawlee optimizes requests by suppressing redundant URLs and by giving up on a URL after reaching a configurable retry limit. This is great, but I encountered an edge case.

Situation: You're extracting data from a page and realize that it wasn't downloaded properly.
Ideal goal: Re-queue the URL while decrementing its retry counter.
Alternative: Add the URL back to the queue with a fresh retry counter.

How can I achieve either of these?
-
You can enqueue it again with a different `unique_key`.
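For example, a minimal sketch of overriding the key (assuming `Request.from_url` accepts a `unique_key` argument; the uuid-based suffix below is just one way to make the key differ, not an API requirement):

```python
import uuid

from crawlee import Request

# Give the retried request a unique_key that differs from the default one
# (which is derived from the URL), so deduplication won't drop it.
retry_request = Request.from_url(
    'https://crawlee.dev/',
    unique_key=f'https://crawlee.dev/#retry-{uuid.uuid4()}',
)
```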
-
I am afraid this is not possible.
You can use `request = Request.from_url('https://crawlee.dev/', always_enqueue=True)`, which will generate a "unique" `unique_key` under the hood. Then add the request to the request queue.
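A sketch of how this could look inside a request handler, assuming a recent crawlee for Python release (`crawlee.crawlers` import path) and that `context.add_requests` accepts `Request` objects; the `looks_incomplete` check is a placeholder for your own validation:

```python
import asyncio

from crawlee import Request
from crawlee.crawlers import BeautifulSoupCrawler, BeautifulSoupCrawlingContext

crawler = BeautifulSoupCrawler()


@crawler.router.default_handler
async def handler(context: BeautifulSoupCrawlingContext) -> None:
    # Placeholder for "the page wasn't downloaded properly".
    looks_incomplete = context.soup.find('main') is None

    if looks_incomplete:
        # always_enqueue=True generates a fresh "unique" unique_key under the
        # hood, so the repeated URL is not suppressed as a duplicate.
        # Note: this bypasses the normal retry counter, so consider tracking
        # your own attempt count (e.g. in request.user_data) to avoid loops.
        retry = Request.from_url(context.request.url, always_enqueue=True)
        await context.add_requests([retry])
        return

    # ... normal extraction ...
    context.log.info(f'Extracted {context.request.url}')


async def main() -> None:
    await crawler.run(['https://crawlee.dev/'])


if __name__ == '__main__':
    asyncio.run(main())
```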
-
Thank you for the helpful answer.
-
I searched the documentation but couldn't find an example of `await queue.add_request(request)`, so I'm unsure how to import or instantiate `queue`. Nothing breaks when I try the following; would this be considered acceptable?
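For reference, a minimal sketch of one way `queue` can be obtained, assuming crawlee's `RequestQueue.open()` helper; the standalone `requeue` coroutine is an illustration, not taken from the original post:

```python
from crawlee import Request
from crawlee.storages import RequestQueue


async def requeue(url: str) -> None:
    # Open the (default) request queue; a crawler using the same queue
    # will pick up the re-added request.
    queue = await RequestQueue.open()

    # always_enqueue=True gives the request a fresh unique_key, so the
    # deduplication check won't suppress the repeated URL.
    request = Request.from_url(url, always_enqueue=True)
    await queue.add_request(request)
```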