Problem adapting finetuning notebook to multi-channel fluorescence imaging data #701
Hi @rodrigo-pena,

Could you share with us the entire error trace from the finetuning notebook? In addition, could you attach a screenshot of the finetuning GUI widget in napari, where you initialize the parameters for finetuning?
Hi @anwai98, here's the code cell I run:
And here's the corresponding full error traceback:
Thanks for sharing the trace and the GUI screenshot.

Re: notebook error trace: Could you also provide information about the paths you pass as inputs (that would help me understand the nature of the inputs you are trying to pass), i.e.:

Re: GUI finetuning screenshot: I think one thing you are missing is passing values to
Here's the cell where I define
Indeed,
Regarding your second bullet point, I don't know if we can interpret the images as 3D. They indeed have 3 channels (one for the nuclei, one for the cytoplasm, and a dummy one), but each channel carries very different information. So these images are different from what one would get in, e.g., z-stacks, which I would consider 3D data.

As for the GUI screenshot, thanks for catching the missing ".tif" keys. However, I tried again (this time properly setting ".tif" under "Image data key" and "Label data key") and the error remains the same.
Okay, I can reproduce the error and I think I know what's causing this. It's the way we are fetching the multi-channel inputs, which are getting stacked together and hence causing the issue. Let's fix the issues one by one.

For the notebook-based finetuning on your provided data:

OPTION 1: We are currently using

```python
import os
from glob import glob
from pathlib import Path

import imageio.v3 as imageio

image_paths = sorted(glob(os.path.join(image_dir, "*.tif")))
for image_path in image_paths:
    pxpath = Path(image_path)
    # Store the channels-first copies in a new sub-directory next to the originals.
    target_dir = os.path.join(pxpath.parent, "image_preprocessed_dir")
    os.makedirs(target_dir, exist_ok=True)
    target_path = os.path.join(target_dir, pxpath.name)

    image = imageio.imread(image_path)
    image = image.transpose(2, 0, 1)  # move the channel axis to the front
    imageio.imwrite(target_path, image)
```

Next, we need to update the structure in which we pass the inputs to the dataloader:

```python
import torch_em

image_dir = ...  # NOTE: this needs to point to "image_preprocessed_dir" now

# Fetch all inputs in their respective directories.
image_paths = sorted(glob(os.path.join(image_dir, "*.tif")))
label_paths = sorted(glob(os.path.join(labels_dir, "*.tif")))

# Make valid splits for the inputs.
n_images = len(image_paths)
n_train = int(0.85 * n_images)

# Let's split the input paths.
train_image_paths, train_label_paths = image_paths[:n_train], label_paths[:n_train]
val_image_paths, val_label_paths = image_paths[n_train:], label_paths[n_train:]

# Create our dataloaders.
train_loader = torch_em.default_segmentation_loader(
    raw_paths=train_image_paths,
    raw_key=None,
    label_paths=train_label_paths,
    label_key=None,
    batch_size=1,
    patch_shape=(512, 512),
    ndim=2,
    is_seg_dataset=True,
    with_channels=True,
)
val_loader = ...  # same snippet as above with "val"-related paths
```

OPTION 2: We can choose to leave the inputs as they are and use another

```python
image_dir = ...  # NOTE: this needs to point to the old image directory here

# Fetch all inputs in their respective directories.
image_paths = sorted(glob(os.path.join(image_dir, "*.tif")))
label_paths = sorted(glob(os.path.join(labels_dir, "*.tif")))

# Make valid splits for the inputs.
n_images = len(image_paths)
n_train = int(0.85 * n_images)

# Let's split the input paths.
train_image_paths, train_label_paths = image_paths[:n_train], label_paths[:n_train]
val_image_paths, val_label_paths = image_paths[n_train:], label_paths[n_train:]

# Create our dataloaders.
train_loader = torch_em.default_segmentation_loader(
    raw_paths=train_image_paths,
    raw_key=None,
    label_paths=train_label_paths,
    label_key=None,
    batch_size=1,
    patch_shape=(512, 512),
    ndim=2,
    is_seg_dataset=False,
    n_samples=50,  # This oversamples the inputs, i.e. reruns the dataset object over the inputs (once it has gone through all the images) in order to fetch more patches.
)
val_loader = ...  # same snippet as above with "val"-related paths
```

PS. I haven't tested the code written here, but in theory it should work out. Let us know how it goes.
Ah, and regarding the issue with GUI-based finetuning: I see that the only check we do in the code is to verify whether the provided path exists. Could you double-check two things:
Thanks for the options, I'll try them tomorrow and report the results here. Does the napari plugin "Annotator 2D" use a different dataloader under the hood? After all, I was able to compute embeddings and run the pre-trained

About the points to double-check re: fine-tuning GUI:
The "Annotator 2d" just takes the input path for the image and opens it up the array in a layer. It does not use the dataloader schema.
Yes, it's because of the aforementioned reason. The dataloaders are a bit specific to the finetuning schema, but with the recommended options to adapt, they should work now. (I will look into increasing support for different image types in our dataloader tutorial notebook)
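For intuition, here is a rough illustration (not micro-sam's actual code) of the difference: the annotator essentially just reads the image and shows it as a napari layer, while finetuning goes through the torch_em dataloader, which is stricter about shapes and channels. The path below is a placeholder.

```python
import imageio.v3 as imageio
import napari

# Hypothetical path; a multi-channel image like the ones discussed here, e.g. of shape (1040, 1392, 3).
image = imageio.imread("path/to/image.tif")

# The annotator-style view: the array is simply shown as a layer, with no patching or channel handling.
viewer = napari.Viewer()
viewer.add_image(image)
napari.run()
```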
Thanks for checking this out. Okay, it seems that the directories are visible to the model. Hmm, that's strange. Can you confirm whether you made the installation from source or from conda-forge?
I have created a separate mamba environment for
Hi @anwai98, I've tried Options 1 and 2 that you gave me above, and both seem to work for creating the dataloaders. But it is still not clear to me why in one option I have to specify the flag

Regardless, there are some differences in the created dataloaders: when using

Moving on, when I run the following training code
I get the error traceback
I think this might be because some of my annotation masks are all zeros, meaning that no object should be segmented in that image. These all-zero masks were obtained by "skipping" the image in "Image Series Annotator" (i.e., pressing the Next Image button). If the error is because of that, does it mean that I have to manually discard from my dataset all images whose corresponding segmentation masks are all zeros?
Thanks for confirming that the dataloaders now work.
It's the heuristic of how we create and handle dataloaders in

We have introductory documentation for
My suspicion is that this happens because matplotlib does not support channels-first plotting of inputs. I can reproduce the error in both cases (it's because the inputs come out as channels-first tensors in either of the two recommended options), but as long as your images visually look fine, the warning is not relevant. You can check whether the loader outputs from both options look as expected:

```python
loader = ...
inputs = next(iter(loader))
print(inputs[0].shape)  # the inputs should look like: torch.Size([1, 3, 512, 512])
```
Yes, exactly. The error notifies you that the provided labels do not have any valid objects to perform finetuning on. You can use a sampler for this case, supported by

```python
# The provided code, with its default values, checks if your labels have at least one valid object to segment.
from torch_em.data.sampler import MinInstanceSampler

loader = torch_em.default_segmentation_loader(
    ...,  # all other arguments
    sampler=MinInstanceSampler(),
)
```

Let us know if this makes the finetuning work.
Re: warnings in training: I would say you can ignore them, they aren't anything critical.

Re: error in training: Hmm, it seems like we are still getting samples with no foreground objects. That's strange. Can you visually validate from the dataloader outputs whether that's the case, and report if you still see images without valid paired labels (remember to use

You can visualize the loaders locally using:

```python
train_loader = ...
val_loader = ...

from torch_em.util.debug import check_loader

# NOTE: I choose a large number of samples below to visualize all possible inputs.
check_loader(train_loader, 50)
check_loader(val_loader, 50)
```

Re: cluster access to the internet: I have a similar setup on my end. To tackle this, what I do is briefly run the scripts on the login node (which does have access to the internet), and once I see the training proceed (after one or two iterations), I just stop the job and submit it to the cluster. This would be my recommendation, as it ensures all the necessary automatic downloads happen (i.e. model checkpoints), and then you do not need to hard-code our model paths. (Plus, this would also make sure the decoder weights are loaded.)
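As a sketch of that login-node step (this assumes that fetching the model via micro_sam.util.get_sam_model caches the same checkpoint the finetuning code would download; adapt as needed):

```python
# Run once on the login node (with internet access) so the pretrained weights are
# downloaded and cached; the offline cluster job can then reuse the cached files.
from micro_sam.util import get_sam_model

predictor = get_sam_model(model_type="vit_l_lm")  # the model used earlier in this thread
print("Checkpoint downloaded and cached.")
```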
Following your suggestion, I've checked the dataloader outputs in napari. For Option 1, I saw that initially there were some rare image patches that were cropped from areas with no instance segmentation. So I used

For Option 2 (which already had

There is one thing that may still be an issue: only two of the channels in the images are informative (two fluorescence wavelengths). The third is all zeros, put there by the creators of the dataset so that the images could perhaps be read as RGB. Could that be causing the issue?
Actually, I just ran the
Could you please double-check this for Option 2 with the sampler? (I would recommend sticking with Option 2, as this works and is a bit simpler compared to changing axes around.) If it still causes issues, we might need to investigate this and fix it in

In addition, could you report the output of the following:
Good morning, @anwai98. By the way, I appreciate this level of support. It's great to see and really motivates me to use the code base y'all developed.

Focusing on Option 2, I re-ran the training code today and this time it trained for 1.25 epochs before crashing with the same

So once again I inspected the image/label pairs coming from the training and validation dataloaders in napari through the

(screenshot: an image/label pair from the dataloader; the segmentation mask appears empty)

As you can see, it seems that the segmentation mask is all zeros. Weirdly enough, though, if I split the stack into three channels, two of the channels in the segmentation mask are all ones:

(screenshot: the segmentation mask split into its three channels)

I zoomed into the extracted image patch

(screenshot: zoomed-in image patch)

to search for the original image from which it had been extracted. I ended up finding it (red square, rotated 180 degrees):

(screenshot: original image with the extracted patch marked by a red square)

As you can see, it is indeed a patch with no annotated segmentation instances. So I wonder how the

This behavior seems to be uncommon enough that sometimes the trainer will go through a whole epoch without seeing such bad pairs, as stated in the beginning. But it will still happen at some point during training, leading to a crash.
Hi @rodrigo-pena,

Thanks for getting back with the detailed feedback.

Re: the sampler failing to ignore empty labels for one patch: I remember we recently fixed this in
This command returns
Hmm, you do have the version after we made the fix (I double-checked, and we already made the changes in

(And I will also come back with a bit more theoretical feedback, e.g. using dense labels for training automatic instance segmentation, the question of how to use the channels in your case, etc., later, once we take care of the above technical issue.)
I tried to reproduce the mentioned effect with sparse labels, and I have a few suspicions which might potentially get rid of this issue (in short: there might be tiny objects of only a few pixels being considered valid, which are passed to the model and might somehow be causing the issue). Could you try again by adding an additional argument to the sampler we pass to the loader (a sketch is given below):

NOTE: Remember to use the sampler in both the training and validation loaders!

EDIT: I forgot to mention something. For this, you need to install
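A minimal sketch of this kind of sampler adjustment; the argument name (min_size) and its value are my assumptions for illustration:

```python
from torch_em.data.sampler import MinInstanceSampler

# Reject patches whose labelled objects are only a few pixels in size.
sampler = MinInstanceSampler(min_size=25)  # illustrative value

# Pass this sampler to BOTH the training and the validation loader.
```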
Ok, that did the trick. Training has finished and the resulting AIS predictions look promising:

However, during the process of re-installing some packages, something broke in
After these 3 steps, whenever I called
Yayy, that's great to hear that the training works now.

Re: broken
This should not be responsible for breaking an existing dependency, because the idea of using
And a few more fundamental points to share (which in theory are good to work with
PS. We are happy to have all this feedback from you. We now have a better idea of the documentation and notebook improvements we want to make to improve the user experience. Thanks for your patience and interest in

Let us know if there's something else you would like to discuss.
For a summary, then, the solution that seems to work requires:
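For concreteness, here is a rough, untested consolidation of the setup discussed in this thread (the Option 2 loader plus the stricter sampler); the directory paths and the min_size value are placeholders:

```python
import os
from glob import glob

import torch_em
from torch_em.data.sampler import MinInstanceSampler

image_dir = "..."   # original multi-channel .tif images
labels_dir = "..."  # single-channel annotation masks

image_paths = sorted(glob(os.path.join(image_dir, "*.tif")))
label_paths = sorted(glob(os.path.join(labels_dir, "*.tif")))

# Split the paths into training and validation sets.
n_train = int(0.85 * len(image_paths))
train_image_paths, val_image_paths = image_paths[:n_train], image_paths[n_train:]
train_label_paths, val_label_paths = label_paths[:n_train], label_paths[n_train:]

# Reject patches without sufficiently large labelled objects.
sampler = MinInstanceSampler(min_size=25)  # illustrative value

train_loader = torch_em.default_segmentation_loader(
    raw_paths=train_image_paths, raw_key=None,
    label_paths=train_label_paths, label_key=None,
    batch_size=1, patch_shape=(512, 512), ndim=2,
    is_seg_dataset=False, n_samples=50, sampler=sampler,
)
val_loader = torch_em.default_segmentation_loader(
    raw_paths=val_image_paths, raw_key=None,
    label_paths=val_label_paths, label_key=None,
    batch_size=1, patch_shape=(512, 512), ndim=2,
    is_seg_dataset=False, sampler=sampler,
)
```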
I just wanted to make sure I understand the concept of "dense annotations": does it mean annotating everything that should be segmented in the image, or does it mean having the segmented instances occupy a fraction of the image similar to that of the background?

Where do you plan to add the improvements to the documentation? On the example finetuning notebooks of
Thanks for the nice summary. I'll add a few remarks:
That's correct. To add to this, we are planning to make a release rather soon which would merge point
This is the reason we use

And re: dataloaders:
This should be set to the minimum number of pixels an object should have in order to be considered valid for finetuning. Also, this should match the
I would like to broaden the dataloader tutorials in

EDIT:
Yes. I would recommend annotating the desired objects as fully as possible (ideally all objects) to get the best outcome with our automatic instance segmentation method.
Hi @anwai98,
This does not seem to be the case. I just tried running my training code on the full dataset, including the images with blank labels, and I get the following error:
Then, when I train only with the "valid" image-label pairs, everything goes smoothly. It seems to me that the
Hi @rodrigo-pena,
The reported issue arises when the sampler cannot find a valid sample after "n" attempts. This can be fixed by increasing the number of sampling attempts for fetching a valid sample after creating the dataloaders, as follows:
Let us know if this works for you.
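For illustration, a sketch of this kind of adjustment; I'm assuming the underlying torch_em dataset exposes a max_sampling_attempts attribute, and the value is only an example:

```python
# Increase how many times the sampler may retry before giving up on a patch.
train_loader.dataset.max_sampling_attempts = 5000
val_loader.dataset.max_sampling_attempts = 5000
```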
Training seems to be working now for the full dataset after setting
Happy to hear that the training works now.
My guess is that when there are no empty labelled sample pairs, the sampler manages to get a valid sample in under 500 attempts.
I'm also working with RGB data, but I'm running into a slightly different problem. I have a folder with RGB images in .tif format (reshaped so that the RGB channel is first). I have set up the dataloader using both Options 1 and 2. Here is my version of Option 2:
When I run the training script, I'm getting the following error:
When I check the shape of the dataloader output, I'm getting 'torch.Size([1, 3, 512, 512])' and 'torch.Size([1, 1, 512, 512])' for the image and label, respectively. I've also checked several times whether it is pulling any labels that have no object, and there is always an object.
@rodrigo-pena: thanks for all your feedback here. This helped us better understand corner cases of training with empty label images, and we will incorporate this into the next release to improve the handling of these cases.

@jalexs82: I have created a new issue #716 in order to keep the discussion of your problem separate. This makes it easier to keep the discussion focused.
I'm working on fine-tuning micro-sam on some fluorescence imaging of myotubes. For prototyping my pipeline, I've downloaded the image data associated with the MyoCount paper [Murphy et al. 2019]. These images are .tif files of shape (1040, 1392, 3), with one channel for nuclei, one for cytoplasm, and one dummy channel full of zeros.

Next, I hand-annotated those images myself using micro-sam's napari plugin "Annotator 2D". The saved annotation masks are single-channel masks of shape (1040, 1392).

I then proceeded to adapt the sample fine-tuning notebook to make the vit_l_lm model fit my desired annotations better. However, when it came to defining the data loaders, I reached an error which I cannot solve. It seems that under the hood torch_em.default_segmentation_loader is trying to assert that the images and annotation masks are the same size (they are not, as explained above):

Since torch_em.default_segmentation_loader loads the images and labels directly from their directory, is the solution to create two dummy channels on the annotation masks so that they are the same size as their respective images? Are there other issues I should expect when working with multi-channel fluorescence imaging (as opposed to single-channel phase contrast or EM)? (A quick way to check the shape mismatch is sketched at the end of this post.)

As a side note, I also tried fine-tuning via the provided napari plugin, but I stumble upon a different error:

"The path to the raw data is missing or does not exist. The path to label data is missing or does not exist."

I wonder what the actual underlying error is, because the path obviously exists, as I can load and display the images and annotation masks in the fine-tuning notebook.
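For reference, a quick way to see the shape mismatch described above (the paths are placeholders):

```python
import imageio.v3 as imageio

image = imageio.imread("path/to/image.tif")  # placeholder path
label = imageio.imread("path/to/label.tif")  # placeholder path
print(image.shape, label.shape)  # e.g. (1040, 1392, 3) and (1040, 1392)
```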