Requesting a page, then visiting another causes issues #53

danrossi · 2016-06-24T07:13:48Z

Sorry, this is a question rather than a bug report. There seems to be a problem when I request a page to scrape a particular link and then visit that link: the second page does not render or parse correctly. It seems I have to create a second session, but even then xpath does not parse the page correctly.

For example:

import time

import dryscrape

sess = dryscrape.Session(base_url='host')

# we don't need images
sess.set_attribute('auto_load_images', False)

# visit homepage and search for a term
sess.visit('/path')

# extract the link to follow from the first page
links = sess.xpath('//a[contains .. ]')
link = links[0]["href"]

time.sleep(10)

# create a second session and visit the extracted link
sess = dryscrape.Session(base_url='host')

sess.visit(link)

sess.xpath("//div[@class='searchitem']")

This is a problem: I have to parse the whole body first, like this:

tree = fromstring(sess.body())
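
The fallback described above, parsing the raw page body instead of calling sess.xpath, can be tried without a browser session. The following is only a minimal sketch: it uses the standard library's html.parser rather than lxml's fromstring, the SearchItemCollector class and the sample markup are hypothetical, and in real use the string would come from sess.body(). Note that it matches "searchitem" as one of the class tokens, which is slightly looser than the exact match in @class='searchitem'.

```python
from html.parser import HTMLParser


class SearchItemCollector(HTMLParser):
    """Collect the text of every <div class="searchitem"> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.items = []     # text of each completed searchitem div
        self._depth = 0     # div nesting depth inside the current searchitem
        self._chunks = []   # text fragments of the current searchitem

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1        # nested div inside a searchitem
        elif "searchitem" in classes:
            self._depth = 1         # entering a searchitem div
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:    # the searchitem div itself just closed
                self.items.append("".join(self._chunks).strip())

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_search_items(html):
    """Return the text content of each searchitem div in an HTML string."""
    parser = SearchItemCollector()
    parser.feed(html)
    parser.close()
    return parser.items


# Sample markup standing in for sess.body(); not taken from the real site.
sample = (
    '<html><body>'
    '<div class="searchitem">First</div>'
    '<div class="other">x</div>'
    '<div class="searchitem"><div>Second</div></div>'
    '</body></html>'
)
print(extract_search_items(sample))
```

If the items show up when the body is parsed this way but not through sess.xpath, that would suggest the content is present in the page and the difference lies in how the session queries it.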

Unfortunately, clicking on the link does not work; I have to visit it with the visit method.

Is there a special way to reuse the session so that xpath works?
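
One thing that may be worth checking before reusing a session is what absolute URL the extracted href resolves to against the base URL. The sketch below uses the standard library's urljoin to show the usual resolution rules. The resolve_link helper and the example.com URLs are placeholders, and whether dryscrape applies exactly this resolution to base_url is an assumption, not something confirmed here.

```python
from urllib.parse import urljoin


def resolve_link(base_url, href):
    """Resolve a scraped href against a base URL (hypothetical helper)."""
    return urljoin(base_url, href)


base = "http://example.com/path"  # placeholder for the real host

# A root-relative href replaces the whole path of the base URL.
print(resolve_link(base, "/search?q=term"))

# A relative href replaces only the last path segment of the base URL.
print(resolve_link(base, "results/1"))

# An absolute href ignores the base URL entirely.
print(resolve_link(base, "http://other.example.com/x"))
```

Printing the resolved URL on both OS X and Ubuntu would at least show whether the two machines end up requesting the same address.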


danrossi · 2016-06-24T10:18:10Z

I can't explain it, but for some reason the same code that works on OS X doesn't work on Ubuntu. The newly visited link is not registered properly on the site, so the request fails and the HTML parsing breaks.

It can extract the link from the first page but the second page has issues.

Any ideas?
