-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feat/more fields for bulk add by party form #2496
Conversation
b7e79e0
to
26be521
Compare
058d08e
to
c8fc8ea
Compare
6ba7224
to
7ca84df
Compare
@@ -115,20 +115,154 @@ def test_submit_name_for_area(self): | |||
|
|||
form = response.forms[1] | |||
# Now submit the valid form | |||
with self.assertNumQueries(FuzzyInt(49, 54)): | |||
with self.assertNumQueries(FuzzyInt(49, 72)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, I don't know what this assertion is for, I just made the numbers go up until the tests wouldn't fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because in Django it's quite easy to write an N+1 problem. The assertion is making sure we're not calling the database loads of times when we show a page.
I've not checked fully here, but I suspect we're going to see more DB activity when writing more data to the database.
It would be good to raise the other side of the fuzzy int though. That class is a range of min/max to deal with e.g tests running in a different order. I'd expect them to be within about 5 queries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a bit of tweaking in 61fb1aa
def save_person_identifiers(self, pids_dict, person): | ||
# TODO: Saving PIDs like this doesn't record a person version | ||
for pid_type, pid in pids_dict.items(): | ||
PersonIdentifier.objects.update_or_create( | ||
person=person, | ||
value_type=pid_type, | ||
defaults={"value": pid}, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried:
def update_person_identifiers(self, pids_dict, person, request, source):
for pid_type, pid in pids_dict.items():
PersonIdentifier.objects.update_or_create(
person=person,
value_type=pid_type,
defaults={"value": pid},
)
change_metadata = get_change_metadata(request, source)
person.record_version(change_metadata)
person.save()
but 'record_version' wasn't finding the newly created/updated person_identifiers. I couldn't quite figure out why; I think it has something to do with the fact that get_person_as_version_data
prefetches the person_identifiers? When I commented out the following line:
prefetch_related_objects([person], "tmp_person_identifiers") |
Then it successfully created a record. However, it meant if you made a new person. there would be 3 initial version records (one from add_person
, one from save_person_identifiers
, and one from update_person
) so I'm not convinced that this is the best way to do it anyway.
I thought about just passing the pids_dict
to update person
instead of adding the save_person_identifiers
method but the update person
is already doing a bunch of stuff that I'm wary of adding more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll have a read more, but I think this needs to be done right at the end of the process. We'll need to save a new person first, add the PIs then get the version data and save again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have implemented what you suggested in Slack in 9ef60a1 with one tweak. calling refresh_from_db()
after set_person_fields()
was undoing the latter because those those fields aren't saved until the person object is saved in update_person()
. So I just moved set_person_fields()
to be after the refresh_from_db()
.
self.fields["previous_party_affiliations"] = ( | ||
PreviousPartyAffiliationsField(membership=self.instance) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This snuck in from a ruff format that I ran.
URLValidator()(pid) | ||
|
||
# Validate pid: | ||
try: | ||
if pid_type == "instagram_url": | ||
clean_instagram_url(pid) | ||
elif pid_type == "linkedin_url": | ||
clean_linkedin_url(pid) | ||
elif pid_type == "mastodon_url": | ||
clean_mastodon_username(pid) | ||
elif pid_type == "twitter_url": | ||
clean_twitter_username(pid) | ||
elif pid_type == "wikidata_id": | ||
clean_wikidata_id(pid) | ||
elif pid_type == "email": | ||
clean_email(pid) | ||
except ValueError as e: | ||
raise ValidationError(e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to port all the existing validation to this field, but I may have missed a few so it would be good to know if you see any obvious ones. Facebook, for example?
Update to the I tried setting DATA_UPLOAD_MAX_NUMBER_FIELDS to I checked the length of the request object and it was 1168. From looking at the POST request in the debug toolbar:
I can't think of a clever way to cut this down, short of removing some fields. I could only have two person_identifier fields instead of 3? Or I could remove the some/all of the demographic fields. Or we could set
However, neither of these choices are futureproof since their may be future elections with an amount of ballots/seats greater than whatever 'DATA_UPLOAD_MAX_NUMBER_FIELDS' is set to (unless it's set to |
Drive-by comment.. Just another quick thought on request size: If we're making forms that allow you to fill in 50 candidates worth of data or whatever, we should check submitting those with long data in every single field. I am pretty sure that as well as the number of form fields in a single post request, there will also be some arbitrary limits on things like the size (in bytes) of a single POST request and the execution time we can run for creating all those DB objects (and related objects). Being able to create in bulk is a big time-saver if it works, but imagine how frustrating it would be to fill in data for ~50 ballots and then get a 500 at the end of it. |
3c355fd
to
90d3d86
Compare
Yeah that's a fear of mine ha. I've tried to create a stress test in 90d3d86. It creates an election with 100 ballots each with a post then it fills in the all the fields in the form and submits it. Currently it's failing, but I did get it to pass when I added I'm also unsure if using WebTest like I am is the best way to do this type of test? |
7f189c0
to
1d290e0
Compare
because we now iterate over all the form fields wehn we render them, the party list field was showing up twice
This is so we don't error on bulk add by party forms with lots of ballots
1d290e0
to
7eac450
Compare
This PR adds more fields the bulk add by party form. For person identifiers, I ended up created a some custom
MultiValueField
s and correspondingMultiWidget
s because it made more sense to me than a huge and confusing copy-paste. They still duplicate a lot of validation methods in a way that could be much cleaner, but as we've discussed, it's too much work to do before SOPN dayThere are a few outstanding issues that I'm not sure how to address yet:
Elections with lots of ballots raise an error. This is a big one as I don't think the form will be that useful otherwise:
The number of GET/POST parameters exceeded settings.DATA_UPLOAD_MAX_NUMBER_FIELDS.
localhost8080:/bulk_adding/party/local.buckinghamshire.2025-05-01/PP52/
The way I've saving person identifiers is not captured by the person versioning system. I tried to adapt the method to record the person version but haven't figured out a way to make it work.
There are also a few things I haven't implemented yet. However, I think there's already enough here that I could use a review even before I add these:
Some things that I'd like to improve in the future:
Useful elections for local testing:
local.swansea.2025-03-27
- one ballot, one seatlocal.glasgow-city.2025-03-20
- two ballots, one seat each (needs to be unlocked)local.city-of-london.2025-03-20
- many ballots with more than one seat