Update python setup logic #111

Merged
merged 5 commits into from
Jul 14, 2025
53 changes: 33 additions & 20 deletions Zarr.m
@@ -17,30 +17,43 @@
isRemote
end


methods(Static)
function pySetup
% Load the Python library
% Set up Python path

% Python module setup and bootstrapping to MATLAB
fullPath = mfilename('fullpath');
zarrDirectory = fileparts(fullPath);
modpath = fullfile(zarrDirectory, 'PythonModule');
% Add the current folder to the Python search path
if count(py.sys.path,modpath) == 0
insert(py.sys.path,int32(0),modpath);
end

% Check if the ZarrPy module is loaded already. If not, load
% it.
sys = py.importlib.import_module('sys');
loadedModules = dictionary(sys.modules);
if ~loadedModules.isKey("ZarrPy")
zarrModule = py.importlib.import_module('ZarrPy');
py.importlib.reload(zarrModule);
zarrPyPath = fullfile(zarrDirectory, 'PythonModule');
% Add ZarrPy to the Python search path if it is not there
% already
if count(py.sys.path,zarrPyPath) == 0
insert(py.sys.path,int32(0),zarrPyPath);
end
end

function zarrPy = ZarrPy()
% Get ZarrPy Python module

% Python will compile and cache the module after the first call
% to import_module, so there is no harm in making this call
% multiple times.
zarrPy = py.importlib.import_module('ZarrPy');
end

function pyReloadInProcess()
Member

Where is this called?

Member

Never mind, I missed the tests. Is it only used by the tests?

Member Author

This is for development convenience. If Python is in-process and we modify the Python code, this method can be used after `clear classes` to reload the Python module and pick up the code changes.

Member

Ahhhh, you could do something like having a constant hidden property that cleans up and executes that function, so `clear classes` would do this automagically.

Member Author

I've discussed different approaches with the external interface team and didn't arrive at anything much cleaner, so I time-boxed it to this solution. But I will follow up with you offline to see if we can improve it.
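
For context on the thread above, here is a minimal Python-side sketch of the behavior under discussion. It assumes a throwaway module named `demo_zarrpy_stub` with a `VERSION` attribute; both are hypothetical stand-ins and not part of ZarrPy. It shows that `importlib.import_module` hands back the module already cached in `sys.modules`, so repeated calls do not see source edits, while `importlib.reload` re-executes the source.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Do not write .pyc files, so the edited source below is always recompiled.
sys.dont_write_bytecode = True

# Hypothetical stand-in for the PythonModule folder and ZarrPy.py.
module_dir = Path(tempfile.mkdtemp())
module_file = module_dir / "demo_zarrpy_stub.py"
module_file.write_text("VERSION = 1\n")

sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

# Repeated imports return the same cached module object.
first = importlib.import_module("demo_zarrpy_stub")
second = importlib.import_module("demo_zarrpy_stub")
same_object = first is second

# Editing the source is not picked up by another import_module call.
module_file.write_text("VERSION = 2  # edited after first import\n")
stale_version = importlib.import_module("demo_zarrpy_stub").VERSION

# reload re-executes the source in the existing module object.
reloaded = importlib.reload(first)
fresh_version = reloaded.VERSION
```

After this runs, `same_object` is true, `stale_version` is still 1, and `fresh_version` is 2. That is consistent with the comments in the diff: calling an accessor such as `Zarr.ZarrPy` repeatedly is cheap, and an explicit reload is needed after editing the Python source when the interpreter stays alive in-process.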

% Reload ZarrPy module after it has been modified (for
% In-Process Python only). Need to do `clear classes` before
% this call. For Out-of-Process Python, can just use
% `terminate(pyenv)` instead.

% make sure the python module is on the path
Zarr.pySetup()

% reload
py.importlib.reload(Zarr.ZarrPy);
end

function isZarray = isZarrArray(path)
% Given a path, determine if it is a Zarr array
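
The path check in `pySetup` above can be sketched on the Python side as follows. This is only an illustration of the "insert once" idea, not the code the PR runs; the helper name `ensure_on_sys_path` and the directory used below are hypothetical.

```python
import sys


def ensure_on_sys_path(directory: str) -> bool:
    """Put directory at the front of sys.path unless it is already listed.

    Returns True if the entry was inserted, False if it was already present.
    """
    if directory in sys.path:
        return False
    sys.path.insert(0, directory)
    return True


# Hypothetical location; the directory does not need to exist for this demo.
demo_dir = "/hypothetical/Zarr/PythonModule"
inserted_first_time = ensure_on_sys_path(demo_dir)
inserted_second_time = ensure_on_sys_path(demo_dir)
```

Because the second call is a no-op, a setup routine written this way can be called from every entry point without growing `sys.path`.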

@@ -262,7 +275,7 @@ function makeZarrGroups(existingParentPath, newGroupsPath)
% Extract the S3 bucket name and path
[bucketName, objectPath] = Zarr.extractS3BucketNameAndPath(obj.Path);
% Create a Python dictionary for the KV store driver
- obj.KVStoreSchema = py.ZarrPy.createKVStore(obj.isRemote, objectPath, bucketName);
+ obj.KVStoreSchema = Zarr.ZarrPy.createKVStore(obj.isRemote, objectPath, bucketName);

else % Local file
% Use full path
@@ -272,7 +285,7 @@ function makeZarrGroups(existingParentPath, newGroupsPath)
error("MATLAB:Zarr:invalidPath",...
"Unable to access path ""%s"".", path)
end
- obj.KVStoreSchema = py.ZarrPy.createKVStore(obj.isRemote, obj.Path);
+ obj.KVStoreSchema = Zarr.ZarrPy.createKVStore(obj.isRemote, obj.Path);
end
end

@@ -310,7 +323,7 @@ function makeZarrGroups(existingParentPath, newGroupsPath)
endInds = start + stride.*count;

% Read the data
- ndArrayData = py.ZarrPy.readZarr(obj.KVStoreSchema,...
+ ndArrayData = Zarr.ZarrPy.readZarr(obj.KVStoreSchema,...
start, endInds, stride);

% Store the datatype
@@ -363,7 +376,7 @@ function create(obj, dtype, data_size, chunk_size, fillvalue, compression)

% The Python function returns the Tensorstore schema, but we
% do not use it for anything at the moment.
- obj.TensorstoreSchema = py.ZarrPy.createZarr(obj.KVStoreSchema, py.numpy.array(obj.DsetSize),...
+ obj.TensorstoreSchema = Zarr.ZarrPy.createZarr(obj.KVStoreSchema, py.numpy.array(obj.DsetSize),...
py.numpy.array(obj.ChunkSize), obj.Datatype.TensorstoreType, ...
obj.Datatype.ZarrType, obj.Compression, obj.FillValue);
%py.ZarrPy.temp(py.numpy.array([1, 1]), py.numpy.array([2, 2]))
@@ -400,7 +413,7 @@ function write(obj, data)
"Unable to write data. Size of the data to be written must match size of the array.");
end

- py.ZarrPy.writeZarr(obj.KVStoreSchema, data);
+ Zarr.ZarrPy.writeZarr(obj.KVStoreSchema, data);
end

end
40 changes: 40 additions & 0 deletions test/tZarr.m
@@ -0,0 +1,40 @@
classdef tZarr < SharedZarrTestSetup
% Tests for Zarr class methods

% Copyright 2025 The MathWorks, Inc.

methods(Test)

function verifySupportedCloudPatterns(testcase)
% Verify that the bucket name and the array path can be
% extracted successfully if a cloud path is used as an input.

% This list contains path pattern currently supported by Zarr
% in MATLAB. Any invalid path not matching any of these
% patterns will result in an error.
inpPath = {'https://mybucket.s3.us-west-2.amazonaws.com/path/to/myZarrFile', ...
'https://mybucket.s3.amazonaws.com/path/to/myZarrFile', ...
'https://mybucket.s3.custom-endpoint.org/path/to/myZarrFile', ...
'https://s3.amazonaws.com/mybucket/path/to/myZarrFile', ...
'https://s3.eu-central-1.example.edu/mybucket/path/to/myZarrFile', ...
's3://mybucket/path/to/myZarrFile'};

for i = 1:length(inpPath)
[bucketName, objectPath] = Zarr.extractS3BucketNameAndPath(inpPath{i});
testcase.verifyEqual(bucketName, 'mybucket', ['Bucket name extraction failed for ' inpPath{i}]);
testcase.verifyEqual(objectPath, 'path/to/myZarrFile', ['Object path extraction failed for ' inpPath{i}]);
end
end

function verifyReload(testcase)
% Verify that calling reload method does not cause any issues

Zarr.pyReloadInProcess()
zarrPyModule = Zarr.ZarrPy;
testcase.verifyTrue(isa(zarrPyModule, 'py.module'))

end


end
end
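
The implementation of `Zarr.extractS3BucketNameAndPath` is not part of this diff, so the following is only a hedged sketch of how the six URL shapes listed in `verifySupportedCloudPatterns` could be split into a bucket name and an object path. The function name `split_s3_url` and its matching rules are assumptions made for illustration; the real MATLAB method may work differently.

```python
from typing import Tuple
from urllib.parse import urlsplit


def split_s3_url(url: str) -> Tuple[str, str]:
    """Return (bucket, object_path) for the URL shapes listed in the test."""
    parts = urlsplit(url)
    host = parts.netloc
    path = parts.path.lstrip("/")

    if parts.scheme == "s3":
        # s3://bucket/path/to/object
        bucket, object_path = host, path
    elif parts.scheme in ("http", "https") and host.startswith("s3."):
        # Path-style: https://s3.<endpoint>/bucket/path/to/object
        bucket, _, object_path = path.partition("/")
    elif parts.scheme in ("http", "https") and ".s3." in host:
        # Virtual-hosted style: https://bucket.s3.<endpoint>/path/to/object
        bucket = host.split(".s3.", 1)[0]
        object_path = path
    else:
        raise ValueError(f"Unsupported S3 URL pattern: {url}")

    if not bucket or not object_path:
        raise ValueError(f"Missing bucket name or object path: {url}")
    return bucket, object_path


print(split_s3_url("s3://mybucket/path/to/myZarrFile"))
# ('mybucket', 'path/to/myZarrFile')
```

Checking the path-style host before the virtual-hosted one keeps a host such as `s3.eu-central-1.example.edu` from being misread as a bucket prefix.
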
21 changes: 0 additions & 21 deletions test/tZarrCreate.m
@@ -62,27 +62,6 @@ function createArrayRelativePath(testcase)
testcase.verifyEqual(arrInfo.node_type,'array','Unexpected Zarr array node type');
end

function verifySupportedCloudPatterns(testcase)
% Verify that the bucket name and the array path can be
% extracted successfully if a cloud path is used as an input.

% This list contains path pattern currently supported by Zarr
% in MATLAB. Any invalid path not matching any of these
% patterns will result in an error.
inpPath = {'https://mybucket.s3.us-west-2.amazonaws.com/path/to/myZarrFile', ...
'https://mybucket.s3.amazonaws.com/path/to/myZarrFile', ...
'https://mybucket.s3.custom-endpoint.org/path/to/myZarrFile', ...
'https://s3.amazonaws.com/mybucket/path/to/myZarrFile', ...
'https://s3.eu-central-1.example.edu/mybucket/path/to/myZarrFile', ...
's3://mybucket/path/to/myZarrFile'};

for i = 1:length(inpPath)
[bucketName, objectPath] = Zarr.extractS3BucketNameAndPath(inpPath{i});
testcase.verifyEqual(bucketName, 'mybucket', ['Bucket name extraction failed for ' inpPath{i}]);
testcase.verifyEqual(objectPath, 'path/to/myZarrFile', ['Object path extraction failed for ' inpPath{i}]);
end
end

function invalidFilePath(testcase)
% Verify error when an invalid file path is used as an input to
% zarrcreate function.
2 changes: 1 addition & 1 deletion test/tZarrWrite.m
@@ -128,7 +128,7 @@ function invalidFilePath(testcase)
function dataDatatypeMismatch(testcase)
% Verify error for mismatch between datatype value and datatype
% of data to be written with zarrwrite.
- errID = 'MATLAB:Python:PyExceptionWithNDArrayInfoAndMsg';
+ errID = 'MATLAB:Python:PyException';
zarrcreate(testcase.ArrPathWrite,testcase.ArrSize,"Datatype",'int8');
data = ones(testcase.ArrSize);
testcase.verifyError(@()zarrwrite(testcase.ArrPathWrite,data),errID);