feat(file_utils): robust path handling and safe directory listing #1195
Open
dive2tech wants to merge 8 commits into eigent-ai:main
- Add safe path utilities: safe_join_path, is_safe_path, safe_resolve_path to prevent path traversal and enforce base confinement
- Add normalize_working_path for validated working dir (length, existence)
- Add safe_list_directory with base confinement, max_entries, skip filters
- Add safe_read_file / safe_write_file with encoding fallback and size limit
- Add create_temp_dir; platform max path length constants
- get_working_directory now uses normalize_working_path for safety
- chat_service: use safe_list_directory in format_task_context, collect_previous_task_context, and build_conversation_context
Robustness: path traversal prevention, encoding fallbacks, path length limits.
Edge cases: None/empty paths, symlinks, non-existent dirs, oversized reads.
Co-authored-by: Cursor <cursoragent@cursor.com>
Resolve user-provided dir_path via safe_resolve_path under base (or cwd) before using in os.path.isdir and os.walk. Use only validated_dir for I/O to satisfy CodeQL 'Uncontrolled data used in path expression' (High). Co-authored-by: Cursor <cursoragent@cursor.com>
- Use collections.abc.Callable instead of typing.Callable
- Break long lines for ruff format; remove redundant 'r' in open()
- Satisfies pre-commit ruff and ruff-format hooks
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…deQL Reconstruct path_for_walk from trusted base_real and names from os.listdir only; do not pass user-derived path to os.path.isdir/os.walk to satisfy CodeQL 'Uncontrolled data used in path expression' (High). Co-authored-by: Cursor <cursoragent@cursor.com>
Do not use validated_dir (user-derived) in any path expression. Validate dir_path under base via safe_resolve_path, then use only base_real for os.path.isdir and os.walk. When base equals dir_path (as in chat_service), listing base is correct. Co-authored-by: Cursor <cursoragent@cursor.com>
Paths in file_utils are validated by safe_resolve_path (under base) before use; CodeQL does not recognize this as a sanitizer. Add codeql-config.yml with query-filters to exclude py/path-injection and use it in the workflow. Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Summary
Adds robust file system utilities and path safety to prevent traversal, handle edge cases, and confine directory listing to a base path. Integrates safe listing into chat service context builders.
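As a rough illustration of the base confinement described above, here is a minimal sketch of one common way to reject paths that escape a base directory. The names `is_within_base` and `resolve_under_base` are hypothetical and are not the PR's `is_safe_path` / `safe_resolve_path`; the actual implementations may differ (for example, they also enforce path length limits).

```python
import os
from typing import Optional


def is_within_base(path: str, base: str) -> bool:
    """Return True if `path` resolves to `base` or to a location beneath it."""
    # realpath follows symlinks; normcase makes the comparison
    # case-insensitive on Windows and is a no-op on POSIX.
    base_real = os.path.normcase(os.path.realpath(base))
    path_real = os.path.normcase(os.path.realpath(path))
    try:
        # commonpath compares whole components, so "/data2" is not under "/data".
        return os.path.commonpath([base_real, path_real]) == base_real
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


def resolve_under_base(user_path: str, base: str) -> Optional[str]:
    """Join `user_path` onto `base`; return None if the result escapes `base`."""
    if not user_path:
        return None
    # An absolute `user_path` discards `base` in os.path.join, which the
    # containment check below then rejects.
    candidate = os.path.realpath(os.path.join(base, user_path))
    return candidate if is_within_base(candidate, base) else None
```

With a base of `/srv/work`, a call such as `resolve_under_base("../etc/passwd", "/srv/work")` returns `None`, while a relative path that stays inside the base is returned in resolved form.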
Motivation
Changes
backend/app/utils/file_utils.py
- safe_join_path, is_safe_path, safe_resolve_path: confine paths under a base, reject .. escape, enforce platform max path length (Windows 260 / Unix 4096).
- normalize_working_path: normalize and validate; handle None/empty, length, non-existent; fallback to home.
- safe_list_directory: list files under a dir with optional base confinement, max_entries, skip_dirs / skip_extensions / skip_prefix, optional path_filter.
- safe_read_file (size limit, encoding fallback: utf-8, utf-8-sig, latin-1, cp1252), safe_write_file (optional base confinement, create_dirs).
- create_temp_dir (prefix, base).
- get_working_directory: now uses normalize_working_path(raw) so the returned path is validated.

backend/app/service/chat_service.py
- format_task_context: uses safe_list_directory(working_directory, base=...) instead of raw os.walk (path confined, same skip rules).
- collect_previous_task_context: same, safe_list_directory instead of os.walk.
- build_conversation_context: same for "Generated Files from Previous Tasks"; safe_list_directory per working_directory, results merged into a set.
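To make the listing and reading behavior above more concrete, here is a small sketch of a bounded directory walk with skip filters and a size-limited read with encoding fallback. The function names, default skip values, and the 1 MB limit are assumptions for illustration only; they are not taken from the PR's `safe_list_directory` or `safe_read_file`, and base confinement is omitted here.

```python
import os
from typing import Iterable, List, Optional

# Same fallback order as listed in the description above.
FALLBACK_ENCODINGS = ("utf-8", "utf-8-sig", "latin-1", "cp1252")


def list_files_bounded(
    root: str,
    max_entries: int = 1000,  # assumed default
    skip_dirs: Iterable[str] = ("node_modules", "__pycache__"),  # assumed
    skip_extensions: Iterable[str] = (".pyc",),  # assumed
    skip_prefix: str = ".",  # assumed: skip hidden files and directories
) -> List[str]:
    """Walk `root` and return at most `max_entries` file paths."""
    skip_dir_set = set(skip_dirs)
    skip_exts = tuple(skip_extensions)
    found: List[str] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = sorted(
            d for d in dirnames
            if d not in skip_dir_set
            and not (skip_prefix and d.startswith(skip_prefix))
        )
        for name in sorted(filenames):
            if skip_prefix and name.startswith(skip_prefix):
                continue
            if name.endswith(skip_exts):
                continue
            found.append(os.path.join(dirpath, name))
            if len(found) >= max_entries:
                return found
    return found


def read_text_with_fallback(path: str, max_bytes: int = 1_000_000) -> Optional[str]:
    """Return decoded text, or None if missing, oversized, or undecodable."""
    try:
        # Read one byte past the limit so oversized files are detected
        # without loading them fully.
        with open(path, "rb") as fh:
            raw = fh.read(max_bytes + 1)
    except OSError:
        return None
    if len(raw) > max_bytes:
        return None
    for encoding in FALLBACK_ENCODINGS:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    return None
```

One property of any chain in this order: latin-1 can decode every byte sequence, so an encoding listed after it (here cp1252) is never reached.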