11"""
2- URL shortener for .htaccess redirects
2+ # URL shortener for .htaccess redirects
3+
4+ This script reads a `.htaccess` file and a plain text file with
5+ URLs (the target URLs).
6+
7+ It outputs a list of target URLs and their corresponding short URLs,
8+ made from paths in the FPY.LI domain like `/2d`, `/2e`, etc.
9+ This list is used to replace the target URLs with short URLs
10+ in the `.adoc` files where the target URLs are used.
11+
12+ If a target URL is not in the `.htaccess` file,
13+ the script generates a new short URL
14+ and appends a new `RedirectTemp` directive to the `.htaccess` file.
15+
16+
17+ ## `.htaccess` file
18+
19+ A file named `.htaccess` in this format is deployed to the web server
20+ at FPY.LI to redirect short URLs to target URLs (the longer ones).
21+
22+ ```
23+ # added: 2025-05-26 16:01:24
24+ RedirectTemp /2d https://mitpress.mit.edu/9780262111584/the-art-of-the-metaobject-protocol/
25+ RedirectTemp /2e https://dabeaz.com/per.html
26+ RedirectTemp /2f https://pythonfluente.com/2/#iter_closer_look
27+
28+ ```
29+
30+ When a user agent requests a URL like `https://fpy.li/2d`,
31+ the web server responds with a 302 redirect to the longer URL
32+ `https://mitpress.mit.edu/9780262111584/the-art-of-the-metaobject-protocol/`.
33+
34+ A temporary redirect (code 302)
35+ tells user agents to come back to the same URL at FPY.LI later,
36+ and not update their bookmark.
37+ This allows me to update the target URL if needed.
38+
39+ ## Redirects in memory
40+
41+ The `redirects` dict maps short paths to target URLs.
42+ It's loaded from data in the `.htaccess` file.
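
As a rough illustration of how such a dict could be loaded, here is a minimal sketch. The function name `load_redirects` is hypothetical, and storing the path without its leading slash is an assumption borrowed from how `parse_htaccess` yields paths.

```python
def load_redirects(text: str) -> dict[str, str]:
    """Map short paths (leading slash removed) to target URLs."""
    redirects: dict[str, str] = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip comments, blank lines, and any directive other than RedirectTemp.
        if len(fields) < 3 or fields[0] != 'RedirectTemp':
            continue
        redirects[fields[1].removeprefix('/')] = fields[2]
    return redirects


sample = """\
# added: 2025-05-26 16:01:24
RedirectTemp /2d https://mitpress.mit.edu/9780262111584/the-art-of-the-metaobject-protocol/
RedirectTemp /2e https://dabeaz.com/per.html
"""
redirects = load_redirects(sample)
```

With this sample, `redirects['2e']` is `'https://dabeaz.com/per.html'`.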
43+
44+ ## Targets in memory
45+
46+ The `targets` dict maps target URLs to short paths.
47+ It's also loaded from data in the `.htaccess` file,
48+ but the algorithm is more complicated.
49+
50+ The same target URL can be mapped to multiple short paths
51+ due to past mistakes when updating the `.htaccess` file.
52+
53+ When loading the `.htaccess` file,
54+ if a target URL is already in the `targets` dict,
55+ we compare the existing short path with the new one
56+ and save the shorter one in the `targets` dict.
57+
58+ That way, we ensure that the shortest path is used for each target URL
59+ in the list of replacements we output to apply to the `.adoc` files.
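
The rule above could be sketched as follows. The helper names `shorter` and `load_targets` are hypothetical, the `example.com` URLs are invented sample data, and breaking ties between paths of equal length by alphabetical order is an assumption, since the text only says the shorter path wins.

```python
from collections.abc import Iterable


def shorter(a: str, b: str) -> str:
    """Return the shorter path; on equal length, the alphabetically first (assumed)."""
    return min(a, b, key=lambda p: (len(p), p))


def load_targets(pairs: Iterable[tuple[str, str]]) -> dict[str, str]:
    """Map each target URL to the shortest path that redirects to it."""
    targets: dict[str, str] = {}
    for path, url in pairs:
        if url in targets:
            # Duplicate target: keep whichever path is shorter.
            targets[url] = shorter(targets[url], path)
        else:
            targets[url] = path
    return targets


pairs = [
    ('1a', 'https://example.com/a'),  # invented sample data
    ('4', 'https://example.com/a'),   # same target, shorter path
    ('2e', 'https://dabeaz.com/per.html'),
]
targets = load_targets(pairs)
```

Here `targets['https://example.com/a']` ends up as `'4'`, regardless of the order in which the two duplicate lines are read.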
60+
61+
62+ ## Shortening URLs
63+
64+ The `targets` dict maps target URLs to short paths.
65+
66+ To shorten a target URL, find it in the `targets` dict.
67+ If the target URL is found:
68+     use the existing path.
69+ If the target URL is not found:
70+     generate a new short path;
71+     store target and path in both `targets` and `redirects` dicts;
72+     collect new short path and target URL in a `new_redirects` list
73+     to be appended to the `.htaccess` file later.
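
The steps above might look like this in code. This is a sketch under assumptions, not the script's actual implementation: `shorten`, `unused_paths`, and `format_block` are hypothetical names, the `n1`, `n2`, ... path scheme is invented for illustration, and writing an `# added:` comment before the new directives is inferred from the sample `.htaccess` block.

```python
from collections.abc import Iterator
from datetime import datetime
from itertools import count


def unused_paths(redirects: dict[str, str]) -> Iterator[str]:
    """Hypothetical path generator: yield 'n1', 'n2', ... skipping used paths."""
    for n in count(1):
        path = f'n{n}'
        if path not in redirects:
            yield path


def shorten(url: str, targets: dict[str, str], redirects: dict[str, str],
            new_redirects: list[tuple[str, str]], paths: Iterator[str]) -> str:
    """Return the short path for url, creating and recording one if needed."""
    if url in targets:
        return targets[url]  # found: reuse the existing path
    path = next(paths)  # not found: generate a new short path
    targets[url] = path
    redirects[path] = url
    new_redirects.append((path, url))  # to be appended to .htaccess later
    return path


def format_block(new_redirects: list[tuple[str, str]], now: datetime) -> str:
    """Format new directives in the layout of the sample .htaccess block."""
    lines = [f'# added: {now:%Y-%m-%d %H:%M:%S}']
    lines += [f'RedirectTemp /{path} {url}' for path, url in new_redirects]
    return '\n'.join(lines) + '\n'
```

Because `shorten` updates `targets` as soon as it creates a path, asking for the same new URL twice in one run returns the same path and records only one directive.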
74+ Targets in memory
75+
76+ To avoid generating a new short URL for a target URL,
77+
78+
79+
80+ the `shortener` module provides a way to generate new short URLs
81+
382
483Procedure:
584
@@ -19,11 +98,11 @@ def parse_htaccess(text: str) -> Iterator[tuple[str, str]]:
1998         fields = line.split()
2099         if len(fields) < 3 or fields[0] != 'RedirectTemp':
21100             continue
22-         key = fields[1]
23-         assert key[0] == '/'
24-         key = key[1:]
25-         assert len(key) > 0
26-         yield (key, fields[2])
101+         path = fields[1]
102+         assert path[0] == '/', f'Missing /: {path!r}'
103+         path = path[1:]  # Remove leading slash
104+         assert len(path) > 0, f'Root path in line {line!r}'
105+         yield (path, fields[2])
27106
28107
29108def choose(a: str, b: str) -> str: