Add automated way to fetch a local snapshot of domains #47
Comments
@mrchrisadams was this ever implemented, or should I put it on the backlog in our roadmap?
Hi @fershad, I think sitespeed use this. They have a daily script that uses something like wget to fetch the latest file from the list below, then converts it to a JSON file, which they load and keep in memory for fast lookups:
https://api.thegreenwebfoundation.org/admin/green-urls
You can see some of the logic here:
https://github.com/thegreenwebfoundation/co2.js/blob/main/src/hosting-json.node.js#L22
TBH, now that
Also, our snapshots have a "last modified" column, and because the most popular URLs tend to be the most recently updated ones, we could use that column to make snapshots of different sizes based on the required use case. This would be ideal for the browser extension.
https://developer.mozilla.org/en-US/docs/Web/API/Web_Storage_API
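A minimal sketch of the "last modified" idea above: sort the rows by that column and keep only the most recently updated domains. The row shape (`url` and `modified` fields) and the function name `buildSnapshot` are assumptions for illustration, not the actual column names in the dataset.

```javascript
// Hypothetical row shape: { url: "example.com", modified: "2023-06-01T00:00:00Z" }.
// The real snapshot columns may be named differently.

/**
 * Return the `maxDomains` most recently modified domains.
 * The input array is copied before sorting, so it is left untouched.
 */
function buildSnapshot(rows, maxDomains) {
  return [...rows]
    .sort((a, b) => Date.parse(b.modified) - Date.parse(a.modified)) // newest first
    .slice(0, maxDomains)
    .map((row) => row.url);
}
```

A small `maxDomains` could suit the browser extension, while a Node consumer could keep the full list.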
Hi! No, actually we do not update it automatically. I update it manually when I remember, so if we could have a built-in way to do that, it would be great :)
Hi @mrchrisadams and @fershad. I can pick this one up. Would it be accurate to say that both a Node approach (e.g. run as an npm script, save to disk) and a browser approach (e.g. download to localhost) are needed?
I have part of this done already: the Node side of things. I'm not 100% clear on the use case, but we could work through it together. https://github.com/tackaberry/co2.js/blob/bt/fetch_greenurls_db/data/functions/fetch_greenurls_db.js Running this as It creates files like:
This wouldn't work in a browser, but we could adapt.
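For discussion, here is a rough sketch of what a Node-side fetch could look like. It is not the code in the branch linked above. It assumes the snapshot is a gzip-compressed JSON array of domain strings, and it takes the download URL as a parameter rather than guessing at a file name. `gunzipToDomains` and `fetchSnapshot` are hypothetical names.

```javascript
const fs = require("node:fs");
const zlib = require("node:zlib");

// Assumption: the snapshot is a gzip-compressed JSON array of domain strings.
function gunzipToDomains(gzippedBuffer) {
  const json = zlib.gunzipSync(gzippedBuffer).toString("utf8");
  const domains = JSON.parse(json);
  if (!Array.isArray(domains)) {
    throw new TypeError("Expected a JSON array of domains");
  }
  return domains;
}

// Download a snapshot, decompress it, and save plain JSON to disk.
// `snapshotUrl` is supplied by the caller; uses the global fetch in Node 18+.
async function fetchSnapshot(snapshotUrl, outPath) {
  const response = await fetch(snapshotUrl);
  if (!response.ok) {
    throw new Error(`Download failed: HTTP ${response.status}`);
  }
  const gzipped = Buffer.from(await response.arrayBuffer());
  const domains = gunzipToDomains(gzipped);
  fs.writeFileSync(outPath, JSON.stringify(domains));
  return domains.length; // number of domains written
}
```

Wired up as an npm script, this could run on a schedule. The decompression step is kept separate from the download so it can be swapped out if the browser needs a different approach.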
We publish daily snapshots of the green domains dataset here, using Datasette:
https://datasets.thegreenwebfoundation.org/
We also have an endpoint for fetching the compressed dumps:
https://api.thegreenwebfoundation.org/admin/green-urls
Having an npm script to fetch these would help to make it easy to perform fast local lookups instead of hammering the API all the time.
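To illustrate what a fast local lookup could look like once a snapshot is on disk, here is a minimal sketch. It assumes the snapshot has already been loaded as an array of domain strings. `createLookup` is a hypothetical helper, not an existing co2.js function.

```javascript
// Build an in-memory lookup from an array of green domains.
// A Set gives constant-time membership checks, so no API call is needed per domain.
function createLookup(domains) {
  const green = new Set(domains.map((d) => d.toLowerCase()));
  return {
    // true if the single domain is in the snapshot
    check(domain) {
      return green.has(domain.toLowerCase());
    },
    // returns only the domains from `list` that are in the snapshot
    checkMany(list) {
      return list.filter((d) => green.has(d.toLowerCase()));
    },
  };
}
```

Domains are lower-cased on both sides so that `Example.com` and `example.com` match. Whether the real dataset needs further normalisation (e.g. stripping `www.`) is an open question.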