
Conversation

@aymuos15

Summary

  • Replace algo_to_pickle() with algo_to_json() using compact JSON + MONAI's existing _target_ ConfigParser pattern for truly pickle-free algo metadata serialization
  • Add _make_json_serializable() helper to handle numpy arrays, tensors, Path objects
  • Backward compatible: algo_from_json() can still load legacy .pkl files (with deprecation warning)
  • Keep algo_to_pickle() / algo_from_pickle() as deprecated aliases

Note: Model weights still use torch.save separately—this PR focuses on the algo object serialization.
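For orientation, a minimal sketch of the resulting algo_object.json layout (keys follow the _target_/_state_ scheme shown in the review diff below; the class path and values are illustrative placeholders, not output from a real run):

import json

# Illustrative only: keys mirror the algo_to_json layout quoted later in this thread.
example_payload = {
    "_target_": "monai.apps.auto3dseg.bundle_gen.BundleAlgo",  # dotted path used for class reconstruction
    "_state_": {"name": "segresnet_0", "output_path": "./work_dir/segresnet_0", "best_metric": 0.87},
    "template_path": "./algorithm_templates",
    "best_metric": 0.87,  # extra **algo_meta_data entries are stored alongside the state
}
print(json.dumps(example_payload, separators=(",", ":")))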

Test plan

  • Linting passes (./runtests.sh --codeformat, ./runtests.sh --ruff)
  • Unit tests for _make_json_serializable() and _add_path_with_parent() helpers
  • Integration test with tests/apps/test_auto3dseg_bundlegen.py (requires optional deps)

Fixes #8586

@coderabbitai
Contributor

coderabbitai bot commented Jan 13, 2026

📝 Walkthrough

The PR replaces pickle-based Algo serialization in Auto3DSeg with JSON-based serialization. It adds algo_to_json and algo_from_json, JSON helpers for serializing state and targets, and legacy pickle support with deprecation wrappers (algo_to_pickle, algo_from_pickle). Import/export of bundle algorithm history now prefers algo_object.json with fallback to .pkl. The public exports in monai.auto3dseg were updated to include the new JSON utilities; history filtering logic was slightly simplified.
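For context, a minimal standalone sketch of the _target_ ConfigParser pattern the new loader relies on (torch.nn.ReLU is just a stand-in target for illustration; the reviewed code uses the serialized algo class instead):

from monai.bundle import ConfigParser

# ConfigParser resolves the dotted path in "_target_" and instantiates the class,
# which is how algo_from_json rebuilds the Algo object from algo_object.json.
parser = ConfigParser({"_target_": "torch.nn.ReLU"})
obj = parser.get_parsed_content()
print(type(obj))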

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ 5 passed

  • Title check ✅ Passed: the title accurately describes the primary change, replacing pickle serialization with JSON for Auto3DSeg algo objects.
  • Description check ✅ Passed: the description covers key changes, backward compatibility, test status, and the issue reference, but lacks the "Type of changes" checkboxes from the template.
  • Linked Issues check ✅ Passed: the PR fully addresses the #8586 objectives; it replaces pickle with JSON serialization, maintains backward compatibility with legacy files, and preserves class reconstruction.
  • Out of Scope Changes check ✅ Passed: all changes directly support the stated objective; helper functions, deprecated aliases, and JSON serialization are within scope, and model weight serialization is explicitly deferred.
  • Docstring Coverage ✅ Passed: docstring coverage is 84.62%, above the required threshold of 80.00%.


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @monai/auto3dseg/utils.py:
- Around line 474-494: The loop over paths_to_try manipulates sys.path
non-atomically (sys.path.insert/remove) around ConfigParser/
parser.get_parsed_content which can race if concurrent; fix by introducing a
module-level lock (e.g., SYS_PATH_LOCK) and acquire it before any sys.path
modification and release it after cleanup, wrapping the path insert, parser
instantiation (ConfigParser and parser.get_parsed_content) and the removal in a
try/finally so removal always happens and the lock is released; reference the
paths_to_try loop, sys.path.insert/sys.path.remove, ConfigParser,
parser.get_parsed_content, and used_template_path when applying the lock.
🧹 Nitpick comments (10)
monai/apps/auto3dseg/utils.py (1)

67-70: Minor readability nit: condition can be simplified.

The or not only_trained is a double negative. Consider inverting for clarity, though current logic is correct.

monai/auto3dseg/utils.py (9)

281-304: LGTM with minor observation.

Good coverage of common types. The fallback to str(value) on line 304 silently converts unknown types—consider logging a warning for unexpected types to aid debugging.
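A possible shape for that suggestion (hypothetical helper name, not the reviewed code):

import logging

logger = logging.getLogger(__name__)

def _to_str_with_warning(value):
    # Log before falling back to str() so unexpected types are visible when
    # debugging serialized algo metadata.
    logger.warning("Falling back to str() for non-JSON-serializable type %s", type(value).__name__)
    return str(value)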


307-312: Missing docstring parameter/return documentation.

Per coding guidelines, docstrings should describe parameters and return values.

Suggested docstring
 def _add_path_with_parent(paths: list[str], path: str | None) -> None:
-    """Add a path and its parent directory to the list if the path is a valid directory."""
+    """
+    Add a path and its parent directory to the list if the path is a valid directory.
+
+    Args:
+        paths: List to append paths to (modified in-place).
+        path: Directory path to add; skipped if None or not a valid directory.
+    """

327-333: Hardcoded attribute list may become stale.

Consider defining SERIALIZABLE_ATTRS as a module-level constant or documenting why these specific attributes are serialized. Easier to maintain and extend.
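A sketch of that refactor (the constant name is hypothetical; the attribute names are copied from the algo_to_json diff quoted later in this thread):

SERIALIZABLE_ATTRS: tuple[str, ...] = (
    "template_path",
    "data_stats_files",
    "data_list_file",
    "mlflow_tracking_uri",
    "mlflow_experiment_name",
    "output_path",
    "name",
    "best_metric",
    "fill_records",
    "device_setting",
)

# inside algo_to_json, the state dict could then be built as:
# state = {a: _make_json_serializable(getattr(algo, a)) for a in SERIALIZABLE_ATTRS if hasattr(algo, a)}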


346-348: Missing encoding parameter for file write.

Explicit encoding="utf-8" is recommended for cross-platform consistency.

Fix
-    with open(json_filename, "w") as f:
+    with open(json_filename, "w", encoding="utf-8") as f:

414-414: Unused kwargs parameter.

Static analysis (ARG001) notes kwargs is unused. Docstring says "reserved for future use"—acceptable, but consider using _ prefix convention or **_kwargs to suppress linter warnings.


446-447: Missing encoding parameter for file read.

Same as write—explicit encoding="utf-8" recommended.

Fix
-    with open(filename) as f:
+    with open(filename, encoding="utf-8") as f:

449-450: Consider TypeError for type validation.

Static analysis (TRY004) suggests TypeError over ValueError for invalid types. Minor pedantic issue.


479-481: Simplify dictionary access.

Static analysis (RUF019) notes unnecessary key check. Use state.get("template_path") instead.

Fix
             algo_config: dict[str, Any] = {"_target_": target}
-            if "template_path" in state and state["template_path"]:
-                algo_config["template_path"] = state["template_path"]
+            if state.get("template_path"):
+                algo_config["template_path"] = state["template_path"]

499-502: State restoration silently skips unknown attributes.

If the JSON contains an attribute not present on the algo object, it's silently ignored. This is likely intentional for forward compatibility, but logging a debug message would help troubleshoot version mismatches.
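One way the suggestion could look (hypothetical helper, not the reviewed code):

import logging

logger = logging.getLogger(__name__)

def _restore_state(algo, state):
    # Apply known attributes and log the keys the current Algo class does not
    # define, to surface version mismatches during troubleshooting.
    for attr, value in state.items():
        if hasattr(algo, attr):
            setattr(algo, attr, value)
        else:
            logger.debug("algo_object.json key %r has no matching attribute on %s", attr, type(algo).__name__)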

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 57fdd59 and 23433a7.

📒 Files selected for processing (3)
  • monai/apps/auto3dseg/utils.py
  • monai/auto3dseg/__init__.py
  • monai/auto3dseg/utils.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

⚙️ CodeRabbit configuration file

Review the Python code for quality and correctness. Ensure variable names adhere to the PEP8 style guide and are sensible and informative with regard to their function, though simple names are permitted for loop and comprehension variables. Ensure routine names are meaningful with regard to their function and use verbs, adjectives, and nouns in a semantically appropriate way. Docstrings should be present for all definitions and should describe each variable, return value, and raised exception in the appropriate section of a Google-style docstring. Examine code for logical errors or inconsistencies, and suggest what may be changed to address these. Suggest any enhancements that improve efficiency, maintainability, comprehensibility, and correctness. Ensure new or modified definitions will be covered by existing or new unit tests.

Files:

  • monai/apps/auto3dseg/utils.py
  • monai/auto3dseg/__init__.py
  • monai/auto3dseg/utils.py
🧬 Code graph analysis (2)
monai/apps/auto3dseg/utils.py (2)
monai/auto3dseg/utils.py (2)
  • algo_from_json (414-510)
  • algo_to_json (315-350)
monai/utils/enums.py (1)
  • AlgoKeys (687-699)
monai/auto3dseg/__init__.py (1)
monai/auto3dseg/utils.py (3)
  • algo_from_json (414-510)
  • algo_from_pickle (669-670)
  • algo_to_json (315-350)
🪛 Ruff (0.14.10)
monai/auto3dseg/utils.py

384-384: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)


391-391: pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue

(S301)


414-414: Unused function argument: kwargs

(ARG001)


450-450: Prefer TypeError exception for invalid type

(TRY004)


450-450: Avoid specifying long messages outside the exception class

(TRY003)


455-455: Avoid specifying long messages outside the exception class

(TRY003)


480-480: Unnecessary key check before dictionary access

Replace with dict.get

(RUF019)


497-497: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: min-dep-py3 (3.12)
  • GitHub Check: min-dep-os (macOS-latest)
  • GitHub Check: min-dep-pytorch (2.7.1)
  • GitHub Check: min-dep-py3 (3.11)
  • GitHub Check: min-dep-pytorch (2.8.0)
  • GitHub Check: min-dep-pytorch (2.6.0)
  • GitHub Check: min-dep-pytorch (2.5.1)
  • GitHub Check: min-dep-py3 (3.10)
  • GitHub Check: min-dep-os (ubuntu-latest)
  • GitHub Check: min-dep-py3 (3.9)
  • GitHub Check: min-dep-os (windows-latest)
  • GitHub Check: flake8-py3 (mypy)
  • GitHub Check: quick-py3 (windows-latest)
  • GitHub Check: quick-py3 (macOS-latest)
  • GitHub Check: quick-py3 (ubuntu-latest)
  • GitHub Check: build-docs
  • GitHub Check: packaging
  • GitHub Check: flake8-py3 (pytype)
  • GitHub Check: flake8-py3 (codeformat)
🔇 Additional comments (7)
monai/apps/auto3dseg/utils.py (3)

17-17: LGTM!

Import updated to use new JSON-based functions.


45-56: LGTM!

Clean preference logic: JSON first, pickle fallback. Good migration strategy.
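Illustrative sketch of that preference order (hypothetical helper, not the code under review):

from __future__ import annotations

import os

def _find_algo_file(output_dir: str) -> str | None:
    # Prefer algo_object.json when present; otherwise fall back to the legacy pickle.
    for name in ("algo_object.json", "algo_object.pkl"):
        candidate = os.path.join(output_dir, name)
        if os.path.isfile(candidate):
            return candidate
    return None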


75-84: LGTM!

Docstring and function body updated to use JSON serialization.

monai/auto3dseg/utils.py (3)

353-411: Legacy pickle loader retained for backward compatibility.

Static analysis flags pickle security (S301). Acceptable here since it's deprecated legacy support with clear warnings. No action required beyond eventual removal.


504-504: used_template_path overwrites state-restored template_path.

Line 501-502 restores template_path from state, but line 504 immediately overwrites it with used_template_path (which could be different). Verify this is the intended behavior.


662-669: LGTM!

Clean deprecated aliases delegating to new JSON functions.

monai/auto3dseg/__init__.py (1)

27-39: LGTM!

New JSON functions properly exported alongside deprecated pickle aliases.

Comment on lines 474 to 500
    for path in paths_to_try:
        try:
            if path and path not in sys.path:
                sys.path.insert(0, path)

            algo_config: dict[str, Any] = {"_target_": target}
            if "template_path" in state and state["template_path"]:
                algo_config["template_path"] = state["template_path"]

            parser = ConfigParser(algo_config)
            algo = parser.get_parsed_content()
            used_template_path = path

            if path and path in sys.path:
                sys.path.remove(path)
            break
        except (ModuleNotFoundError, ImportError) as e:
            if path and path in sys.path:
                sys.path.remove(path)
            logging.debug(f"Failed to instantiate {target} with path {path}: {e}")
            continue
Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, let's see the function containing this code
sed -n '450,500p' monai/auto3dseg/utils.py

Repository: Project-MONAI/MONAI

Length of output: 2103


🏁 Script executed:

# Check what function this is part of
grep -n "^def\|^class" monai/auto3dseg/utils.py | grep -B1 "474"

Repository: Project-MONAI/MONAI

Length of output: 45


🏁 Script executed:

# Search for calls to algo_from_json or algo_from_pickle
rg "algo_from_json|algo_from_pickle" --type py -C 2

Repository: Project-MONAI/MONAI

Length of output: 4135


🏁 Script executed:

# Check if there's any threading or async pattern in this file
rg "Thread|threading|asyncio|concurrent|Executor" monai/auto3dseg/utils.py

Repository: Project-MONAI/MONAI

Length of output: 45


🏁 Script executed:

# Check if the function is exposed in __init__.py or used in tests
rg "algo_from_json|algo_from_pickle" --type py | head -20

Repository: Project-MONAI/MONAI

Length of output: 1560


🏁 Script executed:

# Look for tests of algo_from_json
find . -path ./build -prune -o -name "*test*.py" -type f -exec grep -l "algo_from_json\|algo_from_pickle" {} \;

Repository: Project-MONAI/MONAI

Length of output: 45


🏁 Script executed:

# Check the context of calls in hpo_gen.py to see if concurrent
sed -n '1,100p' monai/apps/auto3dseg/hpo_gen.py | head -60

Repository: Project-MONAI/MONAI

Length of output: 2408


🏁 Script executed:

# Get the full context of algo_from_json and algo_from_pickle function signatures
sed -n '420,510p' monai/auto3dseg/utils.py

Repository: Project-MONAI/MONAI

Length of output: 3791


sys.path manipulation lacks thread-safety guarantees, though current usage is sequential.

While sys.path.insert() and sys.path.remove() are individually atomic due to the GIL, the paired operations aren't atomic together and could race if called concurrently. Current callers invoke this sequentially during initialization (in hpo_gen.py.__init__ and utils.py iteration), so practical risk is low. Consider a lock or context manager if concurrent calls are anticipated.
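A sketch of the lock plus context-manager shape suggested above (helper and lock names are hypothetical):

from __future__ import annotations

import sys
import threading
from contextlib import contextmanager

_SYS_PATH_LOCK = threading.Lock()  # hypothetical module-level lock

@contextmanager
def _prepended_sys_path(path: str | None):
    # Serialize sys.path edits and guarantee cleanup even if instantiation fails.
    with _SYS_PATH_LOCK:
        inserted = False
        try:
            if path and path not in sys.path:
                sys.path.insert(0, path)
                inserted = True
            yield
        finally:
            if inserted:
                sys.path.remove(path)

# usage sketch:
# with _prepended_sys_path(template_path):
#     algo = ConfigParser({"_target_": target}).get_parsed_content()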

🧰 Tools
🪛 Ruff (0.14.10)

480-480: Unnecessary key check before dictionary access

Replace with dict.get

(RUF019)

🤖 Prompt for AI Agents
In @monai/auto3dseg/utils.py around lines 474 - 494, The loop over paths_to_try
manipulates sys.path non-atomically (sys.path.insert/remove) around
ConfigParser/ parser.get_parsed_content which can race if concurrent; fix by
introducing a module-level lock (e.g., SYS_PATH_LOCK) and acquire it before any
sys.path modification and release it after cleanup, wrapping the path insert,
parser instantiation (ConfigParser and parser.get_parsed_content) and the
removal in a try/finally so removal always happens and the lock is released;
reference the paths_to_try loop, sys.path.insert/sys.path.remove, ConfigParser,
parser.get_parsed_content, and used_template_path when applying the lock.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@monai/auto3dseg/utils.py`:
- Around line 482-501: The current loop that tries paths_to_try inserts a path
into sys.path but only removes it on ImportError; change the logic in the for
loop around the insertion and parser.get_parsed_content() to guarantee cleanup
by using a try/finally: set a local inserted flag when you do sys.path.insert(0,
path), perform parser = ConfigParser(algo_config) and algo =
parser.get_parsed_content() in the try block, and in the finally block remove
the path from sys.path if inserted (and set used_template_path only after
success); ensure exceptions are either re-raised or handled the same way as the
previous ImportError branch so sys.path is never left polluted.
- Around line 395-407: The loop over template_paths_candidates currently inserts
p into sys.path but only removes it on ModuleNotFoundError or after successful
load; to guarantee cleanup wrap the insertion and pickle.loads(algo_bytes) in a
try/finally: set an inserted flag when you add p to sys.path, call
pickle.loads(algo_bytes) in the try block, and in the finally block remove p
from sys.path if inserted; keep the existing ModuleNotFoundError handling logic
but avoid leaving sys.path polluted for any other exception and only break the
loop after a successful load (i.e., after pickle.loads completes).
- Around line 315-358: The algo_meta_data passed into algo_to_json may contain
non-JSON-native types (e.g., numpy types) and must be converted before dumping;
update algo_to_json to apply _make_json_serializable to each value in
algo_meta_data (similar to how state is built) and use the sanitized values when
constructing the data dict (keep the existing keys: "_target_", "_state_",
"template_path", plus the serialized algo_meta_data). Ensure you do not change
the keys, only transform the values via _make_json_serializable before creating
json_filename and calling json.dump.
- Around line 281-313: The dict branch in _make_json_serializable fails when
keys are non-strings; change the dict comprehension to coerce keys to strings
(e.g., {str(k): _make_json_serializable(v) for k, v in value.items()}) so
json.dump() won't error on tuple/Path keys, and update both
_make_json_serializable and _add_path_with_parent docstrings to full
Google-style sections (Args, Returns, and Raises if applicable) describing
parameter types, return values, and any exceptions or side effects; keep the
existing behavior for arrays, numpy types, torch.Tensor, Path-like objects, and
the fallback to str(value).
🧹 Nitpick comments (1)
monai/auto3dseg/utils.py (1)

670-678: Add docstrings for deprecated alias functions.

These aliases lack docstrings. Add short Google-style docstrings for API completeness. As per coding guidelines, please add docstrings with Args/Returns.

💡 Proposed fix
 @deprecated(since="1.6", msg_suffix="Use algo_to_json instead.")
 def algo_to_pickle(algo: Algo, template_path: PathLike | None = None, **algo_meta_data: Any) -> str:
+    """Deprecated alias for algo_to_json."""
     return algo_to_json(algo, template_path, **algo_meta_data)

@@
 @deprecated(since="1.6", msg_suffix="Use algo_from_json instead.")
 def algo_from_pickle(filename: str, template_path: PathLike | None = None, **kwargs: Any) -> Any:
+    """Deprecated alias for algo_from_json."""
     return algo_from_json(filename, template_path, **kwargs)

Comment on lines +281 to +313
def _make_json_serializable(value: Any) -> Any:
    """
    Convert a value to a JSON-serializable type.

    Handles numpy arrays, Path objects, torch tensors, and other common types.
    """
    if value is None:
        return None
    if isinstance(value, (str, int, float, bool)):
        return value
    if isinstance(value, (list, tuple)):
        return [_make_json_serializable(v) for v in value]
    if isinstance(value, dict):
        return {k: _make_json_serializable(v) for k, v in value.items()}
    if hasattr(value, "__fspath__"):  # Path-like objects
        return str(value)
    if isinstance(value, np.ndarray):
        return value.tolist()
    if isinstance(value, (np.integer, np.floating)):
        return value.item()
    if isinstance(value, torch.Tensor):
        return value.detach().cpu().numpy().tolist()
    # Fallback to string representation
    return str(value)


def _add_path_with_parent(paths: list[str], path: str | None) -> None:
    """Add a path and its parent directory to the list if the path is a valid directory."""
    if path and os.path.isdir(str(path)):
        abs_path = os.path.abspath(str(path))
        paths.append(abs_path)
        paths.append(os.path.abspath(os.path.join(abs_path, "..")))

Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, locate and read the file
fd -t f "utils.py" | grep auto3dseg

Repository: Project-MONAI/MONAI

Length of output: 117


🏁 Script executed:

# Read the specific lines from the file
sed -n '281,313p' monai/auto3dseg/utils.py

Repository: Project-MONAI/MONAI

Length of output: 1315


🏁 Script executed:

# Also check surrounding context to understand the full functions
sed -n '270,325p' monai/auto3dseg/utils.py

Repository: Project-MONAI/MONAI

Length of output: 2155


🏁 Script executed:

# Check if there are any tests for these functions
fd -t f test | xargs grep -l "_make_json_serializable\|_add_path_with_parent" 2>/dev/null | head -5

Repository: Project-MONAI/MONAI

Length of output: 45


🏁 Script executed:

# Let's also verify the imports at the top of the file
head -30 monai/auto3dseg/utils.py

Repository: Project-MONAI/MONAI

Length of output: 1031


🏁 Script executed:

# Let me verify json.dump behavior with non-string keys by testing
python3 << 'EOF'
import json

# Test with non-string keys
test_dict = {("a", "b"): "value", 1: "another"}
try:
    json.dumps(test_dict)
    print("No error with non-string keys")
except TypeError as e:
    print(f"Error with non-string keys: {e}")
    
# Test with converted keys
converted_dict = {str(k): "value" for k in test_dict.keys()}
try:
    result = json.dumps(converted_dict)
    print(f"Success with converted keys: {result}")
except TypeError as e:
    print(f"Error even with converted keys: {e}")
EOF

Repository: Project-MONAI/MONAI

Length of output: 149


🏁 Script executed:

# Check if _make_json_serializable is used with json.dump/dumps anywhere
rg -A 5 "_make_json_serializable" monai/auto3dseg/utils.py | head -40

Repository: Project-MONAI/MONAI

Length of output: 834


🏁 Script executed:

# Check the broader context of how this function is used
rg -B 3 -A 3 "json.dump" monai/auto3dseg/utils.py

Repository: Project-MONAI/MONAI

Length of output: 257


🏁 Script executed:

# Check function signatures to understand what arguments they expect
rg -B 2 "def algo_to_json" monai/auto3dseg/utils.py -A 10

Repository: Project-MONAI/MONAI

Length of output: 564


🏁 Script executed:

# Verify json.dump behavior with non-string dict keys
python3 << 'EOF'
import json

# Test case: dict with non-string key
test_data = {("path", "key"): "value", "normal": "data"}
try:
    json.dumps(test_data)
    print("JSON serialization succeeded")
except TypeError as e:
    print(f"JSON serialization failed: {e}")

# Test the fix
converted = {str(k): v for k, v in test_data.items()}
try:
    result = json.dumps(converted)
    print(f"Fixed version works: {result}")
except TypeError as e:
    print(f"Fixed version failed: {e}")
EOF

Repository: Project-MONAI/MONAI

Length of output: 211


🏁 Script executed:

# Search for tests related to these functions
find . -name "*test*" -type f -exec grep -l "_make_json_serializable\|algo_to_json" {} \; 2>/dev/null | head -5

Repository: Project-MONAI/MONAI

Length of output: 45


🏁 Script executed:

# Check if there are any existing tests in the auto3dseg module
fd -t f "test.*\.py" | grep auto3dseg | head -5

Repository: Project-MONAI/MONAI

Length of output: 215


🏁 Script executed:

# Look for imports used in algo_to_json function to understand full context
sed -n '330,380p' monai/auto3dseg/utils.py

Repository: Project-MONAI/MONAI

Length of output: 1636


Handle non-string dict keys and expand docstrings.

json.dump() fails with non-string dict keys (e.g., tuple, Path). Convert keys to str in the dict comprehension. Additionally, add full Google-style Args/Returns (and Raises if applicable) sections to both _make_json_serializable and _add_path_with_parent per coding guidelines.

💡 Proposed fix
-    if isinstance(value, dict):
-        return {k: _make_json_serializable(v) for k, v in value.items()}
+    if isinstance(value, dict):
+        return {str(k): _make_json_serializable(v) for k, v in value.items()}
🤖 Prompt for AI Agents
In `@monai/auto3dseg/utils.py` around lines 281 - 313, The dict branch in
_make_json_serializable fails when keys are non-strings; change the dict
comprehension to coerce keys to strings (e.g., {str(k):
_make_json_serializable(v) for k, v in value.items()}) so json.dump() won't
error on tuple/Path keys, and update both _make_json_serializable and
_add_path_with_parent docstrings to full Google-style sections (Args, Returns,
and Raises if applicable) describing parameter types, return values, and any
exceptions or side effects; keep the existing behavior for arrays, numpy types,
torch.Tensor, Path-like objects, and the fallback to str(value).

Comment on lines +315 to +358
def algo_to_json(algo: Algo, template_path: PathLike | None = None, **algo_meta_data: Any) -> str:
    """
    Export the Algo object to a JSON file (pickle-free serialization).

    Args:
        algo: Algo-like object (typically BundleAlgo or subclass).
        template_path: path needed for sys.path setup when loading custom Algo classes.
        algo_meta_data: additional metadata to save (e.g., best_metric, score).

    Returns:
        Filename of the saved Algo object (algo_object.json).
    """
    state = {}
    for attr in [
        "template_path",
        "data_stats_files",
        "data_list_file",
        "mlflow_tracking_uri",
        "mlflow_experiment_name",
        "output_path",
        "name",
        "best_metric",
        "fill_records",
        "device_setting",
    ]:
        if hasattr(algo, attr):
            state[attr] = _make_json_serializable(getattr(algo, attr))

    # Build target string for dynamic class instantiation
    cls = algo.__class__
    target = f"{cls.__module__}.{cls.__name__}"

    data: dict[str, Any] = {
        "_target_": target,
        "_state_": state,
        "template_path": str(template_path) if template_path else None,
        **algo_meta_data,
    }

    json_filename = os.path.join(algo.get_output_path(), "algo_object.json")
    with open(json_filename, "w") as f:
        json.dump(data, f, separators=(",", ":"))

    return json_filename
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's locate and examine the utils.py file to understand the _make_json_serializable function
fd "utils.py" monai/auto3dseg --type f

Repository: Project-MONAI/MONAI

Length of output: 182


🏁 Script executed:

# Search for the _make_json_serializable function
rg "_make_json_serializable" monai/auto3dseg/

Repository: Project-MONAI/MONAI

Length of output: 413


🏁 Script executed:

# Let's examine the algo_to_json function and surrounding context
cat -n monai/auto3dseg/utils.py | sed -n '300,380p'

Repository: Project-MONAI/MONAI

Length of output: 3306


🏁 Script executed:

# Let's see the full _make_json_serializable function
cat -n monai/auto3dseg/utils.py | sed -n '280,305p'

Repository: Project-MONAI/MONAI

Length of output: 1146


🏁 Script executed:

# Check if there are any test cases for algo_to_json that might use numpy data
rg "algo_to_json" monai/auto3dseg/ -A 5 -B 2

Repository: Project-MONAI/MONAI

Length of output: 2616


🏁 Script executed:

# Let's check if json.dump would fail with numpy scalars/tensors
python3 << 'EOF'
import json
import numpy as np

# Test 1: numpy scalar
try:
    data = {"value": np.float32(1.5)}
    json.dumps(data)
    print("Test 1 PASS: numpy scalar serialized directly")
except TypeError as e:
    print(f"Test 1 FAIL: numpy scalar causes TypeError: {e}")

# Test 2: torch tensor
try:
    import torch
    data = {"value": torch.tensor(1.5)}
    json.dumps(data)
    print("Test 2 PASS: torch tensor serialized directly")
except TypeError as e:
    print(f"Test 2 FAIL: torch tensor causes TypeError: {e}")

# Test 3: nested structure with numpy
try:
    data = {"nested": {"value": np.int64(42)}}
    json.dumps(data)
    print("Test 3 PASS: nested numpy scalar serialized directly")
except TypeError as e:
    print(f"Test 3 FAIL: nested numpy scalar causes TypeError: {e}")
EOF

Repository: Project-MONAI/MONAI

Length of output: 149


Serialize algo_meta_data before dumping to JSON.

algo_meta_data can contain numpy scalars/tensors and will raise TypeError when passed to json.dump. Apply _make_json_serializable to each value, matching the treatment of state attributes above.

Proposed fix
-    data: dict[str, Any] = {
+    meta_data = {k: _make_json_serializable(v) for k, v in algo_meta_data.items()}
+    data: dict[str, Any] = {
         "_target_": target,
         "_state_": state,
         "template_path": str(template_path) if template_path else None,
-        **algo_meta_data,
+        **meta_data,
     }
🤖 Prompt for AI Agents
In `@monai/auto3dseg/utils.py` around lines 315 - 358, The algo_meta_data passed
into algo_to_json may contain non-JSON-native types (e.g., numpy types) and must
be converted before dumping; update algo_to_json to apply
_make_json_serializable to each value in algo_meta_data (similar to how state is
built) and use the sanitized values when constructing the data dict (keep the
existing keys: "_target_", "_state_", "template_path", plus the serialized
algo_meta_data). Ensure you do not change the keys, only transform the values
via _make_json_serializable before creating json_filename and calling json.dump.

@aymuos15 force-pushed the 8586-auto3dseg-json-serialization branch from b106e24 to 5d22044 on January 17, 2026 14:32
- Rename algo_to_pickle() to algo_to_json():
  - Serialize algo state as JSON with _target_ for class reconstruction
  - Uses MONAI's ConfigParser pattern for dynamic instantiation
  - Truly pickle-free serialization (for algo metadata; model weights still use torch.save)
  - Add _make_json_serializable() to handle numpy arrays, tensors, Path objects

- Rename algo_from_pickle() to algo_from_json()

- Add deprecated aliases for backward compatibility

- Update import_bundle_algo_history() to prefer .json files

Fixes Project-MONAI#8586

Signed-off-by: Soumya Snigdha Kundu <[email protected]>
@aymuos15 force-pushed the 8586-auto3dseg-json-serialization branch from 5d22044 to 5d39780 on January 18, 2026 00:02

Development

Successfully merging this pull request may close these issues.

Update Auto3DSeg
