Bluesy1 (Collaborator) commented Aug 29, 2024

This should let us limit changes to the public/instructor variants of every question each time a new question is added to a problem bank ...

Also, this makes it so tests don't need to set seeds in the import section of files, making the templates closer to how they should actually be used.
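To illustrate the idea of keeping each question's variant stable when other questions are added, here is a minimal sketch. It is not the code in this PR: `stable_seed` and `pick_variant` are hypothetical helpers, and deriving the seed from the question name is only an assumption about one way to do it.

```python
import hashlib
import random


def stable_seed(question_name: str) -> int:
    """Derive a seed from the question name that is identical on every run.

    hashlib is used instead of the built-in hash(), which is salted per
    process for strings and so would change between builds.
    """
    digest = hashlib.sha256(question_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_variant(question_name: str, low: int = 1, high: int = 100) -> int:
    """Draw one illustrative parameter from a generator private to the question.

    Using random.Random(seed) rather than the module-level random.seed()
    means one question's draws cannot shift another question's variant.
    """
    rng = random.Random(stable_seed(question_name))
    return rng.randint(low, high)
```

With something like this, adding a new question to a bank leaves the variants of existing questions unchanged, and test files would not need a seed call next to their imports.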

Bluesy1 (Collaborator, Author) commented Aug 29, 2024

@firasm I don't know if this is something you want or not ... I'll leave this for your consideration since it's a somewhat substantial change to how the public bank sites get built.

firasm (Contributor) commented Aug 30, 2024

Well, the original intent was to constantly change it (on every push) so that they're less googlable and scrapeable by Chegg.

The world has now changed and now we have bigger problems (ChatGPT), where it's irrelevant to change features on every push.


PS. I noticed that the random seeds were removed from the source test files when you solved the problem with the duplicated imports.

This is addressed by this right ?

Bluesy1 (Collaborator, Author) commented Aug 30, 2024

> Well, the original intent was to constantly change it (on every push) so that they're less googlable and scrapeable by Chegg.
>
> The world has now changed and now we have bigger problems (ChatGPT), where it's irrelevant to change features on every push.

I see. I had seen this from the opposite angle: by changing the variant on every commit, after enough time we've effectively published every variant of a question to anyone who looks through the public git history of one of the public problem banks.

If you really want to stop scraping, you should probably disallow crawling of the problem banks on your website by updating your robots.txt.

I would imagine something like this would do:

Simpler

User-agent: *
Disallow: /oer/ # block all of the oer resources

Sitemap: https://firas.moosvi.com/sitemap.xml

More Precise

User-agent: *
Disallow: /oer/datascience_bank/
Disallow: /oer/physics_bank/
Disallow: /oer/stats_bank/ 

Sitemap: https://firas.moosvi.com/sitemap.xml
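If it helps, the more precise rules can be sanity-checked locally before deploying them. This is a rough sketch using Python's standard `urllib.robotparser`; the `is_allowed` helper and the example page paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The "More Precise" rules from above, inlined so nothing is fetched.
ROBOTS_TXT = """\
User-agent: *
Disallow: /oer/datascience_bank/
Disallow: /oer/physics_bank/
Disallow: /oer/stats_bank/

Sitemap: https://firas.moosvi.com/sitemap.xml
"""

SITE = "https://firas.moosvi.com"


def is_allowed(path: str, user_agent: str = "*") -> bool:
    """Return True if the rules above permit user_agent to crawl path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, SITE + path)


if __name__ == "__main__":
    # Hypothetical paths, only to exercise the rules.
    for path in ("/oer/physics_bank/example.html", "/oer/", "/blog/"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

Keep in mind that robots.txt is advisory: well-behaved crawlers honour it, but it does not technically prevent anyone from scraping the pages.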

> PS. I noticed that the random seeds were removed from the source test files when you solved the problem with the duplicated imports.
>
> This is addressed by this right ?

They were never fully removed, just no longer duplicated (current main):

import random as rd; rd.seed(111)
import pandas as pd
import problem_bank_helpers as pbh
