update from Gluejar#94
Open
eshellman wants to merge 1122 commits into
2023 final
update handling of DOAB covers
fix all the null doab covers!
and don't call distinct unless needed
update tests, fix slow OPDS, optimize queryset access
muse, ubiquity hosts
tecnum, update de gruyter
springer, scielo and cmp
Maintenance 2024
Switches the pyoai pin from infrae/pyoai (last release March 2022, explicitly unmaintained per its own README) to our fork carrying the RateLimitedError patch:

EbookFoundation/pyoai @ bf709d26ae6e4b34b9b0ca726e0f032d97f0bd38
Branch: fix/expose-429-as-rate-limited-error

The fork raises a structured RateLimitedError on HTTP 429 with Retry-After parsed per RFC 9110, instead of leaving callers to parse HTTPError.headers themselves. An upstream PR to eth-library/oaipmh (the maintained fork; infrae/pyoai is no longer accepting changes) is pending; we'll un-pin to that fork when it lands.

load_doab_oai now catches RateLimitedError first; the existing HTTPError-with-code-429 path is kept as a runtime fallback for deployments that revert the pin or run against stock pyoai. A never-raised placeholder class handles the ImportError case so the except clause is always syntactically valid.

Step C of today's DOAB rate-limit plan (after #1143/#1144 OAI sentinel and #1147 bitstream breaker).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
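The catch order described in this commit message can be sketched roughly as follows. This is an illustration only, not the code in load_doab_oai: the import path `oaipmh.error`, the `retry_after` attribute, and the helper names `parse_retry_after` and `rate_limit_delay` are all assumptions made for the example.

```python
import time
from datetime import timezone
from email.utils import parsedate_to_datetime
from urllib.error import HTTPError

try:
    # Assumed import path; the fork may expose the class elsewhere.
    from oaipmh.error import RateLimitedError
except ImportError:
    class RateLimitedError(Exception):
        """Placeholder that nothing raises; keeps the except clause valid."""
        retry_after = None


def parse_retry_after(value, now=None):
    """Return the Retry-After delay in seconds, or None if it cannot be parsed."""
    if value is None:
        return None
    value = value.strip()
    if value.isdigit():  # delay-seconds form, e.g. "120"
        return float(value)
    try:  # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = time.time() if now is None else now
    return max(0.0, when.timestamp() - now)


def rate_limit_delay(fetch):
    """Call fetch(); return (was_rate_limited, delay_in_seconds_or_None)."""
    try:
        fetch()
    except RateLimitedError as err:
        # Structured path: the (assumed) attribute already holds the parsed delay.
        return True, getattr(err, "retry_after", None)
    except HTTPError as err:
        # Fallback path for stock pyoai: parse the header ourselves.
        if err.code != 429:
            raise
        return True, parse_retry_after(err.headers.get("Retry-After"))
    return False, None
```

Catching the structured exception first means the header parsing only runs when the pin has been reverted, which matches the "runtime fallback" wording above.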
Pin pyoai to EbookFoundation fork; consume RateLimitedError
Resolve settings/dev.py conflict: keep Celery 5.x setting renames from this branch (CELERY_BEAT_SCHEDULE, CELERY_WORKER_HIJACK_ROOT_LOGGER) plus master's ADMINS email update; both sides drop the send_test_email schedule line. Brings in 16 master commits since fork (DOAB 429 handling, pyoai EbookFoundation fork pin, notarobot int-guard, SEND_TEST_EMAIL_JOB removal, admin-email rate limit). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Compute the publication year range from a single evaluated edition queryset instead of two separate (asc + desc) queries. The previous code initialized `latest_publication` to None and set it only from the second query; when the dated-edition set changed between the two reads (e.g. concurrent edition loading) the second query could find no truthy dates, leaving `latest_publication` None and crashing on `earliest_publication + "-" + latest_publication`. Also switch to save(update_fields=['publication_range']) so this nominally-read property doesn't persist other (possibly stale) in-memory Work fields. Adds WorkTests regression coverage: mixed null/blank/valid dates, single-year vs range, all-blank, and a query-count guard proving the single-query shape. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
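A minimal sketch of the single-pass shape, assuming publication dates are strings that begin with a four-digit year and that an all-blank set yields an empty string (both are assumptions; the function name `publication_range` is hypothetical and takes a plain list where the real code evaluates one queryset):

```python
def publication_range(dates):
    """Build a 'YYYY' or 'YYYY-YYYY' string from one pass over the dates.

    `dates` stands in for the values of a single evaluated queryset and may
    contain None or '' entries.
    """
    # The 'if date' guard keeps only truthy strings, so `years` never holds None.
    years = sorted(date[:4] for date in dates if date)
    if not years:
        return ""  # assumed result when no edition has a usable date
    earliest, latest = years[0], years[-1]
    # Both ends come from the same list, so neither can be None here.
    return earliest if earliest == latest else earliest + "-" + latest
```

Because the earliest and latest years are read from the same in-memory list, a concurrent change to the edition set cannot leave one end undefined.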
The api `widget` view treated any non-"featured", non-10/13-char token
as a work id and called safe_get_work() without catching
Work.DoesNotExist, so /api/widget/<bad-id>/ returned 500 instead of the
existing empty-widget response. Two related defects fixed at the same
time:
- convert_10_to_13() returns None for an invalid ISBN-10, after which
`len(isbn)` raised TypeError. Guard with `if isbn and len(isbn)==13`.
- widget.html renders "...ISBN {{isbn}}..." but the work=None paths did
not pass `isbn`. Pass it into every render path.
Wrap safe_get_work() in try/except Work.DoesNotExist -> render empty
widget, matching the existing Identifier.DoesNotExist handling.
Adds api ApiTests regression coverage: non-numeric token, numeric
unknown id, and invalid ISBN-10 all return 200.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
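The three guards can be illustrated without Django. Everything below is a stand-in written for this example: `WorkDoesNotExist` plays the role of Work.DoesNotExist, the two dicts replace the database, and the bodies of `convert_10_to_13` and `safe_get_work` are assumptions, not the project's helpers. The "featured" branch is omitted.

```python
class WorkDoesNotExist(Exception):
    """Stand-in for Work.DoesNotExist."""


def convert_10_to_13(isbn10):
    """Stand-in helper: return the ISBN-13, or None for an invalid ISBN-10."""
    body, check = isbn10[:9], isbn10[9:].upper()
    if len(isbn10) != 10 or not body.isdigit() or check not in "0123456789X":
        return None
    digits = [int(c) for c in body] + [10 if check == "X" else int(check)]
    if sum((10 - i) * d for i, d in enumerate(digits)) % 11 != 0:
        return None
    stem = "978" + body
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(stem))
    return stem + str((10 - total % 10) % 10)


def safe_get_work(token, works_by_id):
    """Stand-in lookup: raise WorkDoesNotExist for unknown or non-numeric ids."""
    try:
        return works_by_id[int(token)]
    except (ValueError, KeyError):
        raise WorkDoesNotExist(token)


def widget_context(token, works_by_id, works_by_isbn):
    """Return the template context; `isbn` is present on every path."""
    work, isbn = None, ""
    if len(token) in (10, 13):
        isbn = convert_10_to_13(token) if len(token) == 10 else token
        if isbn and len(isbn) == 13:  # guard: isbn may be None
            work = works_by_isbn.get(isbn)
    else:
        try:
            work = safe_get_work(token, works_by_id)
        except WorkDoesNotExist:
            work = None  # render the empty widget instead of a 500
    return {"work": work, "isbn": isbn or ""}
```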
Keeps the Python 'if date' guard as belt-and-suspenders: the structural invariant (years contains only truthy strings) stays enforced locally, independent of the queryset. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…into dj42.unglue.it
…to dj42.unglue.it
Fix #1155: TypeError in Work.publication_date from two-query race
Fix #1156: widget endpoint 500s on unknown/non-numeric/invalid ids
Staging boxes restored from a prod snapshot keep prod's Site row (domain='unglue.it'), so every emailed link (password-reset, notices, etc.) points at prod instead of the staging box's own host. This command updates Site.objects.get_current() (the SITE_ID row) to the supplied domain (and optional name). It is idempotent: if the row already matches, it prints a no-op message and exits cleanly. Used by the provisioning repo's post-deploy Ansible task to localise the Site to the box's own server_name on every deploy. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mechanical, no-meaning-change corrections to the FAQ page surfaced by the CC+Codex copy review on #1165. Typos, a broken URL, proper-noun/acronym fixes, subject-verb agreement, and site-name casing; nothing factual.

- "that why" → "that's why"; "an non-profit" → "a non-profit"
- "do well be selling" → "by selling"; "the the copyright" → "the copyright"
- "right holder tools" → "rights holder tools"; "some interested" → "some interest"
- broken Facebook URL "facebook/com" → "facebook.com"
- "Wikisources/Hathi Trust/Github" → "Wikisource/HathiTrust/GitHub"
- "page.You'll" → "page. You'll"; mid-sentence "Let" → "let"
- "cannot not be obtained" → "cannot be obtained"; "They does" → "They do"
- "Authors' Guild" → "Authors Guild"; CC license styling "NoDerivatives, NonCommercial"
- site-name casing unglue.it → Unglue.it; "Thanks for Ungluing" → "Thanks-for-Ungluing"

Factual/staleness/voice items (fees, payouts, sender email, campaign retirement, etc.) are handled separately in the judgment-call PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FAQ copy: objective typo & grammar fixes (#1165)
Mechanical, no-meaning-change corrections to the FAQ page surfaced by the CC+Codex copy review on #1165. Typos, a broken URL, proper-noun/acronym fixes, subject-verb agreement, and site-name casing; nothing factual.

- "that why" → "that's why"; "an non-profit" → "a non-profit"
- "do well be selling" → "by selling"; "the the copyright" → "the copyright"
- "right holder tools" → "rights holder tools"; "some interested" → "some interest"
- broken Facebook URL "facebook/com" → "facebook.com"
- "Wikisources/Hathi Trust/Github" → "Wikisource/HathiTrust/GitHub"
- "page.You'll" → "page. You'll"; mid-sentence "Let" → "let"
- "cannot not be obtained" → "cannot be obtained"; "They does" → "They do"
- "Authors' Guild" → "Authors Guild"; CC license styling "NoDerivatives, NonCommercial"
- site-name casing unglue.it → Unglue.it; "Thanks for Ungluing" → "Thanks-for-Ungluing"

Factual/staleness/voice items (fees, payouts, sender email, campaign retirement, etc.) are handled separately in the judgment-call PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 0c43233)
… not a hard fail Codex review of #1164 fix: a fresh/scrubbed DB without a Site row for SITE_ID would crash the post-deploy task with Site.DoesNotExist. get_or_create makes it self-healing while staying idempotent on existing rows. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add set_site_domain management command (fix #1164)
libraryauth defined its AppConfig (with ready() -> `from . import signals`) in __init__.py and relied on `default_app_config`. Django 4.1 REMOVED default_app_config, and an AppConfig in __init__.py is not auto-discovered (Django only scans <app>/apps.py). So since the 2026-06-17 Django 4.2 cutover, LibraryAuthConfig.ready() never ran, signals.py was never imported, and the `@receiver(user_activated) handle_same_email_account` (same-email account dedup on registration activation) was silently disconnected in production.

Fix: move LibraryAuthConfig to libraryauth/apps.py (auto-discovered) and drop the dead default_app_config. Backward-compatible (valid on 4.2 and 5.2).

Proven empirically (minimal repro): with the config in __init__.py + no apps.py, ready() does NOT fire on either Django 4.2.21 or 5.2.15; adding apps.py restores it on both.

Scope: swept all first-party apps. Only `core` and `libraryauth` define ready(); core already has apps.py (fine). libraryauth was the sole casualty.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
Per Codex review of #1176: assert LibraryAuthConfig is the active app config (so ready() runs) and that handle_same_email_account is connected to the user_activated signal. Guards against the config drifting back out of apps.py. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
Fix libraryauth signals: move AppConfig to apps.py so ready() fires
recommended_user (frontend/views/__init__.py:637) is a QuerySet (User.objects.filter(...)). Passing it to the exact related lookup wishlists__user=recommended_user raised, since Django 4.x: ValueError: The QuerySet value for an exact lookup must be limited to one result using slicing. Django 1.11 tolerated a QuerySet here, so /lists/recommended has returned HTTP 500 since the 1.11->4.2 cutover (2026-06-17). Fix: wishlists__user__in=recommended_user. Behavior-preserving for the intended single 'unglueit' user, and degrades gracefully (empty result, not 500) if that user is absent. Valid on both Django 4.2 and 5.2. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
…et-lookup Fix /lists/recommended 500: use __in for QuerySet-valued lookup (fixes #1179)
…lean_email in django-registration 3.x) RegistrationFormNoDisposableEmail.clean_email called super().clean_email(), but django-registration 3.x removed clean_email from RegistrationForm/RegistrationFormUniqueEmail (unique-email check is now a field validator added in __init__). So every POST to /accounts/register/ raised: AttributeError: 'super' object has no attribute 'clean_email' Registration has been fully broken since the 1.11->4.2 cutover. Fix: read self.cleaned_data['email'] directly. Django's _clean_fields populates cleaned_data[name] (running field validators, incl. the unique and confusable-email checks) BEFORE calling clean_<name>, so the disposable check still runs on the already-validated value. Behavior-preserving; valid on django-registration 3.4 under both Django 4.2 and 5.2. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
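The ordering this fix relies on (field validators populate cleaned_data before clean_<name> runs) can be shown with a small Django-free stand-in. `MiniForm`, `require_at_sign`, the local `ValidationError`, and the blocklist contents are hypothetical and exist only for this sketch; the real form subclasses django-registration's classes.

```python
class ValidationError(Exception):
    """Stand-in for django.core.exceptions.ValidationError."""


class MiniForm:
    """Tiny stand-in for a Django form: validators run before clean_<name>()."""
    validators = {}

    def __init__(self, data):
        self.data = data
        self.cleaned_data = {}
        self.errors = {}

    def is_valid(self):
        for name, value in self.data.items():
            try:
                for validate in self.validators.get(name, []):
                    validate(value)
                # The validated value is stored before the per-field hook runs.
                self.cleaned_data[name] = value
                hook = getattr(self, "clean_" + name, None)
                if hook is not None:
                    self.cleaned_data[name] = hook()
            except ValidationError as err:
                self.errors[name] = str(err)
                self.cleaned_data.pop(name, None)
        return not self.errors


def require_at_sign(value):
    """Stand-in for the field-level email validators."""
    if "@" not in value:
        raise ValidationError("Enter a valid email address.")


class RegistrationFormNoDisposableEmail(MiniForm):
    validators = {"email": [require_at_sign]}
    disposable_domains = {"mailinator.example"}  # hypothetical blocklist

    def clean_email(self):
        # No super().clean_email(): read the already-validated value directly.
        email = self.cleaned_data["email"]
        if email.rsplit("@", 1)[-1].lower() in self.disposable_domains:
            raise ValidationError("Please use a non-disposable email address.")
        return email
```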
…-super Fix /accounts/register/ 500: read cleaned_data['email'] (django-registration 3.x) — fixes #1182
…1185)

Acq/Campaign/UserProfile .objects.get(<int>) raised 'TypeError: cannot unpack non-iterable int object' (verified live) on every call, and the except DoesNotExist could not catch it. These tasks are actively invoked, so the features failed silently:

- watermark_acq -> ebook watermarking on borrow/acquire
- process_ebfs -> rights-holder ebook processing
- ml_subscribe_task -> mailing-list subscribe

Fix: .get(id=...). Pre-existing (not cutover-specific); surfaced by the post-cutover sweep.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
…(refs #1185)

A1 MsgForm.full_clean: (1) used bare ValidationError, not imported in this module -> NameError -> 500; AND (2) it raised from an overridden full_clean(), which propagates out of is_valid() as a 500 even with a proper ValidationError (verified empirically). Fixed by using self.add_error(None, ...) so the form is marked invalid cleanly, and by catching ValueError/TypeError so a non-numeric supporter/work id from POST doesn't crash the int lookup. Triggers on the 'message a supporter' POST (frontend/views:1722) with missing/invalid supporter or work.

A2 EbookForm.set_provider: read self.cleaned_data['url'] unconditionally; when clean_url() raises (e.g. duplicate URL) that key is removed -> KeyError -> 500 on ebook add/edit with a duplicate URL. Use .get('url') and skip provider inference when url is absent so the field error is reported normally.

Behavior-preserving for valid input; valid on Django 4.2 and 5.2.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
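The A2 guard amounts to treating a missing `url` key as "nothing to infer". A rough sketch under stated assumptions: `infer_provider` and the `PROVIDERS` table are hypothetical names for illustration, and the real set_provider is a method on EbookForm rather than a free function.

```python
from urllib.parse import urlparse

# Hypothetical host-to-provider table, for illustration only.
PROVIDERS = {
    "gutenberg.org": "Project Gutenberg",
    "archive.org": "Internet Archive",
}


def infer_provider(cleaned_data):
    """Return a provider name, or None when the url field failed validation."""
    # .get() instead of ['url']: a failed clean_url() removes the key entirely.
    url = cleaned_data.get("url")
    if not url:
        return None  # skip inference; the field error is reported as usual
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return PROVIDERS.get(host)
```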
core/tasks.py: positional .get() -> get(id=...) (3 tasks) — refs #1185
frontend/forms: stop 500s on invalid input (MsgForm, EbookForm) — refs #1185
No description provided.