-
Notifications
You must be signed in to change notification settings - Fork 324
fix(utils.py): specify utf-8 encoding when reading file #1268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Can you provide an example of when this fails today and the complete traceback? |
Yes, this is an error that occurs in Japanese environments. Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "C:\Program Files\Python311\Scripts\twine.exe\__main__.py", line 7, in <module>
File "C:\Users\msi-z\AppData\Roaming\Python\Python311\site-packages\twine\__main__.py", line 33, in main
error = cli.dispatch(sys.argv[1:])
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\msi-z\AppData\Roaming\Python\Python311\site-packages\twine\cli.py", line 139, in dispatch
return main(args.args)
^^^^^^^^^^^^^^^
File "C:\Users\msi-z\AppData\Roaming\Python\Python311\site-packages\twine\commands\upload.py", line 250, in main
upload_settings = settings.Settings.from_argparse(parsed_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\msi-z\AppData\Roaming\Python\Python311\site-packages\twine\settings.py", line 288, in from_argparse
return cls(**settings)
^^^^^^^^^^^^^^^
File "C:\Users\msi-z\AppData\Roaming\Python\Python311\site-packages\twine\settings.py", line 116, in __init__
self._handle_repository_options(
File "C:\Users\msi-z\AppData\Roaming\Python\Python311\site-packages\twine\settings.py", line 304, in _handle_repository_options
self.repository_config = utils.get_repository_from_config(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\msi-z\AppData\Roaming\Python\Python311\site-packages\twine\utils.py", line 154, in get_repository_from_config
config = get_config(config_file)[repository]
^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\msi-z\AppData\Roaming\Python\Python311\site-packages\twine\utils.py", line 66, in get_config
parser.read_file(f)
File "C:\Program Files\Python311\Lib\configparser.py", line 734, in read_file
self._read(f, source)
File "C:\Program Files\Python311\Lib\configparser.py", line 1037, in _read
for lineno, line in enumerate(fp, start=1):
UnicodeDecodeError: 'cp932' codec can't decode byte 0x88 in position 418: illegal multibyte sequence``` |
I made minimal changes based on this error. |
Specifically, it is as follows.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @harumaki4649! It's a bummer that this kind of system codec stuff still causes issues on Windows hosts, but I see no problem with explicitly requiring UTF-8 in the config here based on your explanation.
I've only been hesitant as I expect there may be some person somewhere using utf-16/utf-32 as rare as that is for whom this would break |
Ah yeah, that's a good point. Maybe what we could do here is attempt the read twice: once as the native codepage and then again with UTF-8 if the first fails? (Or do the read once, as bytes, and attempt decoding from bytes twice.) |
Yeah, that just feels not great in both cases |
@woodruffw Should we change it to use UTF-8 as a fallback when errors occur? |
@harumaki4649 yes. I'd catch a specific exception and then retry with utf-8 |
I’ve applied the improvements based on your review comments. |
Refactor configuration file parsing to use helper functions for better error handling and readability.
Refactor error handling for configuration file parsing.
Based on the feedback we received again, we have made further revisions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now all we could use are some tests to trip this.
@harumaki4649 really all you need to do is provide an ini file that has a character outside of |
tests: add UTF-8 fallback tests for parsing config
Update logging level for caplog in tests to specify the logger.
I ran the tests in my local environment. I'm not entirely sure whether these are the tests you expected, but would this be acceptable? This was my first time performing this type of task, so I apologize in advance if I made any mistakes. Is there anything else you would like me to check? |
Summary
Fixed file reading error in utils.py by explicitly specifying UTF-8 encoding.
Details
encoding="utf-8"
to file open functionImpact