Revise common web user agents, add command-line and web crawlers #11

Open
csarven wants to merge 4 commits into main from common-web-user-agents

Member

@csarven csarven commented Feb 24, 2025

Contributor

@jyasskin jyasskin left a comment

Thanks!

index.bs Outdated
Comment on lines 44 to 45
Web crawlers and automated bots that collect or
analyze data from websites function as user agents too.
Contributor

I think these are only user agents when they're helping end-users interact with those sites. So the system they're embedded in could be a user agent, but the crawlers and bots themselves aren't.

@martinthomson martinthomson Feb 28, 2025

Yeah, I don't agree that web crawlers are user agents, except in the narrow conditions Jeffrey points to. I would not support the notion that they are assumed to be user agents for that reason. Instead, I would say that many tools can sometimes act in the role of user agent, when directed toward achieving goals set by or for a person. Then use crawlers as one of a set of examples, maybe. The example of curl or wget also fits that description, though not when used as part of a scripted CI system, for example.

In all these cases, being a user agent is not the natural state of the thing.

Contributor

This is also related to #12, where we've been asked to clarify how libraries that help implement UAs should interact with this document. I suggest we drop these lines from this change, and address the idea more thoroughly while fixing #12.

Member Author

@csarven csarven Mar 5, 2025

We need more discussion/clarification on this, so I've created #14 and removed the text.

index.bs Outdated
Comment on lines 42 to 43
Command-line tools like curl also qualify as user agents,
enabling users to retrieve web content without rendering it.
Contributor

I think this is true, at least when an end user is driving curl, as opposed to it being part of a larger system, but I'm not sure that it's worth the reader's time to call it out. Are there any requirements set by the rest of the document that you think command-line tools are at risk of violating? Or other concrete implications of making the reader's mental model include these tools?

Member Author

@csarven csarven Mar 5, 2025

We need more discussion/clarification on this, so I've created #14 and removed the text.

@csarven csarven force-pushed the common-web-user-agents branch from dbaa96c to 1f46ae6 Compare March 5, 2025 09:23
3 participants