-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v2.4.13: 跟进网页端的正则表达式,完善【多图合并pdf】插件的一些细节。 #187
Conversation
WalkthroughThe recent updates across various modules of the jmcomic project involve minor version increment, enhancements in client interface handling, directory management, plugin functionality, and toolkit adjustments. These changes improve the software's robustness in image processing and file management, refine environmental variable handling, and adjust HTML parsing for album views. Changes
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChat with CodeRabbit Bot (
|
To prevent beginners from mistakenly submitting PRs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 7
Configuration used: CodeRabbit UI
Files selected for processing (7)
- src/jmcomic/init.py (1 hunks)
- src/jmcomic/jm_client_interface.py (1 hunks)
- src/jmcomic/jm_option.py (4 hunks)
- src/jmcomic/jm_plugin.py (8 hunks)
- src/jmcomic/jm_toolkit.py (2 hunks)
- usage/workflow_download.py (1 hunks)
- usage/workflow_export_favorites.py (1 hunks)
Files skipped from review due to trivial changes (2)
- src/jmcomic/init.py
- usage/workflow_download.py
Additional comments: 9
usage/workflow_export_favorites.py (1)
- 4-12: The addition of the
env
function withinprepare_actions_input_and_secrets
is a good use of nested functions for encapsulation. Ensure that the environment variable substitution logic is tested thoroughly, especially the fallback mechanism toIN_{name}
when the primary environment variable is not set.src/jmcomic/jm_client_interface.py (1)
- 71-74: The modification to strip query parameters from
img_url
in thetransfer_to
method is a good practice to ensure that the URL is processed correctly. Verify that this change does not affect any other parts of the system that may rely on the full URL, including query parameters.Verification successful
The previous script contained an error in the usage of the
rg
command. I will correct the script to search for thetransfer_to
method in Python files using the correct file type identifier.
The search results indicate that the
transfer_to
method is defined and called within thejm_client_interface.py
file. There are no other occurrences of this method being called in other Python files within the codebase, which suggests that the change to strip query parameters fromimg_url
is localized to this file and method.Based on the information provided:
- The
transfer_to
method is defined injm_client_interface.py
.- The same file contains a call to
transfer_to
, passingimg_url
as an argument.Since there are no other usages of
transfer_to
found in the codebase, it appears that the change to strip query parameters fromimg_url
does not affect other parts of the system. However, without a broader context of howimg_url
is used throughout the application, it's not possible to definitively conclude that no other parts of the system rely on the full URL, including query parameters.Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Verify that the change to strip query parameters from img_url does not affect other system parts. # This script should search for other usages of the transfer_to method and ensure they expect the modified behavior. rg --type python "transfer_to"Length of output: 61
Script:
#!/bin/bash # Search for other usages of the transfer_to method in Python files. rg "transfer_to" --type pyLength of output: 198
src/jmcomic/jm_plugin.py (5)
361-370: The
after_zip
method now callsexecute_deletion
to remove image paths and directories. Ensure that the deletion order is correct and that no necessary files are deleted.682-686: The
delete_original_file
attribute is set based on the parameter, and thecheck_image_suffix_is_valid
method is called to validate the image suffix. This is a good use of the new method to enforce consistency.702-711: The command generation in
generate_cmd
uses string formatting, which could be vulnerable to command injection if user input is not properly sanitized. Ensure that all inputs used in the command are sanitized or come from trusted sources.726-736: The deletion logic in
invoke
method ofConvertJpgToPdfPlugin
should be verified to ensure that it only deletes the intended files and that the paths are correctly determined.582-582: The
execute_deletion
method is used to delete files after zipping. Ensure that this is the intended behavior and that no necessary files are deleted as a result.src/jmcomic/jm_toolkit.py (2)
54-54: The regular expression
pattern_html_album_views
has been updated to include an additional<span>
tag. This change affects how album views are parsed from HTML. Ensure that this change aligns with the expected HTML structure for album views.294-299: The
tokenize
method now uses thefind_right_pair
function to find the matching closing character for brackets. This is a logical change that should improve the accuracy of tokenization. Ensure that theExceptionTool.require_true
call on line 295 is sufficient for error handling whenfind_right_pair
returns -1, indicating an unmatched character.
def decide_image_save_dir(self, photo, ensure_exists=True) -> str: | ||
# 使用 self.dir_rule 决定 save_dir | ||
save_dir = self.dir_rule.decide_image_save_dir( | ||
photo.from_album, | ||
photo | ||
) | ||
|
||
mkdir_if_not_exists(save_dir) | ||
if ensure_exists: | ||
mkdir_if_not_exists(save_dir) | ||
|
||
return save_dir |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The addition of the ensure_exists
parameter to decide_image_save_dir
method is a good enhancement for ensuring the existence of the directory before saving files. However, the mkdir_if_not_exists
function is called without any error handling. If directory creation fails, it could cause unhandled exceptions during runtime.
- mkdir_if_not_exists(save_dir)
+ try:
+ mkdir_if_not_exists(save_dir)
+ except OSError as e:
+ # Handle exception, possibly log it, and decide on further action
+ # (e.g., re-raise, return a default value, etc.)
Committable suggestion
❗ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
def decide_image_save_dir(self, photo, ensure_exists=True) -> str: | |
# 使用 self.dir_rule 决定 save_dir | |
save_dir = self.dir_rule.decide_image_save_dir( | |
photo.from_album, | |
photo | |
) | |
mkdir_if_not_exists(save_dir) | |
if ensure_exists: | |
mkdir_if_not_exists(save_dir) | |
return save_dir | |
def decide_image_save_dir(self, photo, ensure_exists=True) -> str: | |
# 使用 self.dir_rule 决定 save_dir | |
save_dir = self.dir_rule.decide_image_save_dir( | |
photo.from_album, | |
photo | |
) | |
if ensure_exists: | |
try: | |
mkdir_if_not_exists(save_dir) | |
except OSError as e: | |
# Handle exception, possibly log it, and decide on further action | |
# (e.g., re-raise, return a default value, etc.) | |
return save_dir |
def fix_kwargs(self, kwargs: Optional[Dict]) -> Dict[str, Any]: | ||
""" | ||
kwargs将来要传给方法参数,这要求kwargs的key是str类型, | ||
该方法检查kwargs的key的类型,如果不是str,尝试转为str,不行则抛异常。 | ||
""" | ||
ExceptionTool.require_true( | ||
isinstance(kwargs, dict), | ||
f'插件的kwargs参数必须为dict类型,而不能是类型: {type(kwargs)}' | ||
) | ||
if kwargs is None: | ||
kwargs = {} | ||
else: | ||
ExceptionTool.require_true( | ||
isinstance(kwargs, dict), | ||
f'插件的kwargs参数必须为dict类型,而不能是类型: {type(kwargs)}' | ||
) | ||
|
||
kwargs: dict | ||
new_kwargs: Dict[str, Any] = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix_kwargs
method now initializes kwargs
to an empty dictionary if it is None
. This change ensures that the method can handle None
values gracefully. The type check for kwargs
being a dictionary is also a good practice to avoid runtime type errors. However, the method could be simplified by using a default value for kwargs
in the method signature.
- def fix_kwargs(self, kwargs: Optional[Dict]) -> Dict[str, Any]:
+ def fix_kwargs(self, kwargs: Optional[Dict] = {}) -> Dict[str, Any]:
- if kwargs is None:
- kwargs = {}
Committable suggestion
❗ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
def fix_kwargs(self, kwargs: Optional[Dict]) -> Dict[str, Any]: | |
""" | |
kwargs将来要传给方法参数,这要求kwargs的key是str类型, | |
该方法检查kwargs的key的类型,如果不是str,尝试转为str,不行则抛异常。 | |
""" | |
ExceptionTool.require_true( | |
isinstance(kwargs, dict), | |
f'插件的kwargs参数必须为dict类型,而不能是类型: {type(kwargs)}' | |
) | |
if kwargs is None: | |
kwargs = {} | |
else: | |
ExceptionTool.require_true( | |
isinstance(kwargs, dict), | |
f'插件的kwargs参数必须为dict类型,而不能是类型: {type(kwargs)}' | |
) | |
kwargs: dict | |
new_kwargs: Dict[str, Any] = {} | |
def fix_kwargs(self, kwargs: Optional[Dict] = {}) -> Dict[str, Any]: | |
""" | |
kwargs将来要传给方法参数,这要求kwargs的key是str类型, | |
该方法检查kwargs的key的类型,如果不是str,尝试转为str,不行则抛异常。 | |
""" | |
else: | |
ExceptionTool.require_true( | |
isinstance(kwargs, dict), | |
f'插件的kwargs参数必须为dict类型,而不能是类型: {type(kwargs)}' | |
) | |
kwargs: dict | |
new_kwargs: Dict[str, Any] = {} |
def find_right_pair(left_pair, i): | ||
stack = [left_pair] | ||
j = i + 1 | ||
|
||
while j < length and len(stack) != 0: | ||
c = title[j] | ||
if c in bracket_map: | ||
stack.append(c) | ||
elif c == bracket_map[stack[-1]]: | ||
stack.pop() | ||
|
||
j += 1 | ||
|
||
if len(stack) == 0: | ||
return j | ||
else: | ||
return -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The find_right_pair
function has been introduced to locate the correct closing character for a given opening character. This function uses a stack to match pairs of characters, which is a common and effective algorithm for this purpose. However, there is no error handling if the closing character is not found (i.e., find_right_pair
returns -1). It would be beneficial to add error handling to provide more informative feedback in such cases.
def find_right_pair(left_pair, i):
stack = [left_pair]
j = i + 1
while j < length and len(stack) != 0:
c = title[j]
if c in bracket_map:
stack.append(c)
elif c == bracket_map[stack[-1]]:
stack.pop()
j += 1
if len(stack) == 0:
return j
+ else:
+ raise ValueError(f"Unmatched character {left_pair} at position {i}")
Committable suggestion
❗ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
def find_right_pair(left_pair, i): | |
stack = [left_pair] | |
j = i + 1 | |
while j < length and len(stack) != 0: | |
c = title[j] | |
if c in bracket_map: | |
stack.append(c) | |
elif c == bracket_map[stack[-1]]: | |
stack.pop() | |
j += 1 | |
if len(stack) == 0: | |
return j | |
else: | |
return -1 | |
def find_right_pair(left_pair, i): | |
stack = [left_pair] | |
j = i + 1 | |
while j < length and len(stack) != 0: | |
c = title[j] | |
if c in bracket_map: | |
stack.append(c) | |
elif c == bracket_map[stack[-1]]: | |
stack.pop() | |
j += 1 | |
if len(stack) == 0: | |
return j | |
else: | |
raise ValueError(f"Unmatched character {left_pair} at position {i}") |
src/jmcomic/jm_plugin.py
Outdated
@@ -18,6 +18,7 @@ class JmOptionPlugin: | |||
def __init__(self, option: JmOption): | |||
self.option = option | |||
self.log_enable = True | |||
self.delete_original_file = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The delete_original_file
attribute is initialized to None
, which is unconventional for a boolean flag. Consider initializing it to False
for clarity.
- self.delete_original_file = None
+ self.delete_original_file = False
Committable suggestion
❗ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
self.delete_original_file = None | |
self.delete_original_file = False |
src/jmcomic/jm_plugin.py
Outdated
def execute_deletion(self, paths: List[str]): | ||
if self.delete_original_file is not True: | ||
return | ||
|
||
for p in paths: | ||
if file_not_exists(p): | ||
continue | ||
|
||
if os.path.isdir(p): | ||
os.remove(p) | ||
self.log(f'删除文件夹: {p}', 'remove') | ||
else: | ||
os.rmdir(p) | ||
self.log(f'删除原文件: {p}', 'remove') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The execute_deletion
method uses os.remove
to delete directories and os.rmdir
to delete files, which is incorrect. The correct function to remove directories is os.rmdir
and for files, it's os.remove
.
- if os.path.isdir(p):
- os.remove(p)
+ if os.path.isfile(p):
+ os.remove(p)
- else:
- os.rmdir(p)
+ elif os.path.isdir(p):
+ os.rmdir(p)
Committable suggestion
❗ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
def execute_deletion(self, paths: List[str]): | |
if self.delete_original_file is not True: | |
return | |
for p in paths: | |
if file_not_exists(p): | |
continue | |
if os.path.isdir(p): | |
os.remove(p) | |
self.log(f'删除文件夹: {p}', 'remove') | |
else: | |
os.rmdir(p) | |
self.log(f'删除原文件: {p}', 'remove') | |
def execute_deletion(self, paths: List[str]): | |
if self.delete_original_file is not True: | |
return | |
for p in paths: | |
if file_not_exists(p): | |
continue | |
if os.path.isfile(p): | |
os.remove(p) | |
self.log(f'删除原文件: {p}', 'remove') | |
elif os.path.isdir(p): | |
os.rmdir(p) | |
self.log(f'删除文件夹: {p}', 'remove') |
def check_image_suffix_is_valid(self, std_suffix): | ||
""" | ||
检查option配置的图片后缀转换,目前限制使用Magick时只能搭配jpg | ||
暂不探究Magick是否支持更多图片格式 | ||
""" | ||
cur_suffix: Optional[str] = self.option.download.image.suffix | ||
|
||
ExceptionTool.require_true( | ||
cur_suffix is not None and cur_suffix.endswith(std_suffix), | ||
'请把图片的后缀转换配置为jpg,不然无法使用Magick!' | ||
f'(当前配置是[{cur_suffix}])\n' | ||
f'配置模板如下: \n' | ||
f'```\n' | ||
f'download:\n' | ||
f' image:\n' | ||
f' suffix: {std_suffix} # 当前配置是{cur_suffix}\n' | ||
f'```' | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The check_image_suffix_is_valid
method throws an exception if the image suffix is not set to .jpg
. This is a restrictive design choice that may not be flexible for future requirements. Consider if this restriction is necessary or if it could be made more flexible.
src/jmcomic/jm_plugin.py
Outdated
cmd = f'7z a "{self.zip_filepath}" "{self.save_dir}" -p{self.zip_password} -mhe=on > output_7z.txt' | ||
self.require_true( | ||
0 == os.system(cmd), | ||
'加密压缩文件失败' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using os.system
to execute a command can be a security risk if the command includes user-supplied data. Consider using the subprocess
module with a list of arguments to mitigate the risk of shell injection.
- os.system(cmd)
+ import subprocess
+ subprocess.run(cmd, shell=True, check=True)
Committable suggestion
❗ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
cmd = f'7z a "{self.zip_filepath}" "{self.save_dir}" -p{self.zip_password} -mhe=on > output_7z.txt' | |
self.require_true( | |
0 == os.system(cmd), | |
'加密压缩文件失败' | |
import subprocess | |
cmd = f'7z a "{self.zip_filepath}" "{self.save_dir}" -p{self.zip_password} -mhe=on > output_7z.txt' | |
self.require_true( | |
0 == subprocess.run(cmd, shell=True, check=True), | |
'加密压缩文件失败' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review Status
Actionable comments generated: 0
Configuration used: CodeRabbit UI
Files selected for processing (1)
- src/jmcomic/jm_plugin.py (7 hunks)
Files skipped from review as they are similar to previous changes (1)
- src/jmcomic/jm_plugin.py
Summary by CodeRabbit
New Features
Improvements
Bug Fixes
Refactor
Documentation