
General feedback after teaching #21

Open · tiagosousagarcia opened this issue Feb 21, 2024 · 4 comments

@tiagosousagarcia

We ran a workshop with this lesson yesterday at Newcastle University, and I thought it would be a good idea to share some feedback from our experience.

First of all, many thanks for all your hard work on the lesson; it is a great resource and it proved very popular with learners. It goes without saying, but what follows is meant as helpful feedback rather than harsh criticism. With that in mind, here are some of the things I noticed:

  • Timings should be more generous -- although the amount of code is relatively small and problem-free, we had lots of questions from the audience from the very start. On average, I think each chapter took about one hour, with the exception of the very first one for which 30 minutes was adequate;
  • In general, more exercise suggestions would be useful, at least one per episode;
  • Instructors might need a bit more support in the later episodes, particularly in episode 4, on the reasons for implementing that particular model architecture (i.e., why those numbers of layers, why those kernel sizes, is this a starting point to be improved, borrowed from a popular model, completely original...);
  • In general, places where models can be tweaked to improve (or simply alter) their performance could be signposted better -- that was a common question from learners;
  • Although the term 'normalization' is used in episode 2 (under image pre-processing) and is generally used for this stage in ML, it was pointed out during the lesson that the actual operation being performed is 'standardization', so a note could be added about that distinction; some further explanation on why we downsample and normalise image data (and what effects they could have on model performance) could also be helpful for instructors;
  • Reusing variable names should be avoided, particularly when transforming the structure or shape of the dataset. An eagle-eyed helper caught a particularly problematic case: in episode 4, we add a new dimension to the dataset
dataset = dataset[..., np.newaxis]
labels = labels[..., np.newaxis]

If that cell needs to be re-run for any reason, the resulting dataset will gain more and more dimensions, which might not be picked up until much later in the lesson (in our case, during model training).
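
To illustrate, here is a minimal sketch of the failure mode and one re-run-safe alternative (the array shape is hypothetical, not the lesson's actual data):

import numpy as np

dataset = np.zeros((700, 256, 256))  # hypothetical shape, for illustration only

# Each re-run of this line appends another axis:
# (700, 256, 256) -> (700, 256, 256, 1) -> (700, 256, 256, 1, 1) -> ...
dataset = dataset[..., np.newaxis]

# A defensive alternative that is safe to re-run: only add the channel
# axis when it is not already present
if dataset.ndim == 3:
    dataset = dataset[..., np.newaxis]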

Finally, a couple of specific problems/questions with code:

  • Link to data downloads a corrupted file (but you already know that: chest_xrays.zip cannot be downloaded from the setup page #8);
  • In the data augmentation section: shouldn't the options for ImageDataGenerator have different values? I.e., having everything at zero will simply create duplicates of existing images, no?
  • In the data augmentation section: shouldn't this
augmented_images = [val_generator[0][0][0] for i in range(batch_size)]
be
augmented_images = [val_generator[i][0][0] for i in range(batch_size)]
otherwise we plot 5 identical images.
  • In the model architecture section there's a discrepancy between the comment explanation and the function used:
# Global max pooling reduces dimensions back to the input size
x = GlobalAveragePooling2D()(x)

That's it! Hope this is useful, and thanks again for your great work!

@tompollard (Collaborator)

Thanks for this very helpful feedback @tiagosousagarcia (and I'm sorry that it has taken so long to respond). I'm working on addressing the points that you have raised, and will post updates in this thread.

@tompollard (Collaborator)

Some quick updates:

> In general, more exercise suggestions would be useful, at least one per episode;

I have added several exercises (e.g. in #35). More to come.

> Although the term 'normalization' is used in episode 2 (under image pre-processing) and is generally used for this stage in ML, it was pointed out during the lesson that the actual operation being performed is 'standardization', so a note could be added about that distinction; some further explanation on why we downsample and normalise image data (and what effects they could have on model performance) could also be helpful for instructors;

Good catch! I have replaced "normalization" with "standardization" and added some explanatory text. I'll add more on this later.
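
For anyone following along, the distinction in a minimal NumPy sketch (values are illustrative only, not the lesson's code):

import numpy as np

images = np.random.rand(10, 256, 256) * 255  # hypothetical pixel data

# Normalization (min-max scaling): rescales values into the range [0, 1]
normalized = (images - images.min()) / (images.max() - images.min())

# Standardization (z-scoring): rescales values to zero mean and unit variance
standardized = (images - images.mean()) / images.std()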

> In the data augmentation section: shouldn't the options for ImageDataGenerator have different values? I.e., having everything at zero will simply create duplicates of existing images, no?

I have added an exercise to make it clear that the participants are expected to tweak the augmentation parameters themselves.
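
For example, the tweaked settings might look something like this (parameter values here are illustrative, not the lesson's actual settings):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Non-zero arguments produce genuinely transformed images rather than duplicates
datagen = ImageDataGenerator(
    rotation_range=10,       # random rotations of up to 10 degrees
    width_shift_range=0.1,   # horizontal shifts of up to 10% of image width
    height_shift_range=0.1,  # vertical shifts of up to 10% of image height
    zoom_range=0.1,          # random zoom of up to 10%
    horizontal_flip=True,    # randomly mirror images left-right
)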

> In the data augmentation section: shouldn't this
> augmented_images = [val_generator[0][0][0] for i in range(batch_size)]
> be
> augmented_images = [val_generator[i][0][0] for i in range(batch_size)]
> otherwise we plot 5 identical images.

It looks strange, but I think the current code is correct. We want to call val_generator on the same image each time, so we can see the effect on that image (rather than plotting the effect of val_generator on 5 different images).
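
Annotated for clarity (assuming val_generator and batch_size are defined as in the lesson):

# Each access to val_generator[0] re-applies the random transforms, so this
# collects batch_size different augmentations of the same source image:
augmented_images = [val_generator[0][0][0] for i in range(batch_size)]

# By contrast, indexing with i would give one augmentation each of
# batch_size different source images:
# augmented_images = [val_generator[i][0][0] for i in range(batch_size)]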

> In the model architecture section there's a discrepancy between the comment explanation and the function used:
> # Global max pooling reduces dimensions back to the input size
> x = GlobalAveragePooling2D()(x)

Good catch, now fixed.
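
Presumably the corrected pairing reads along these lines (a sketch with a hypothetical input shape; the lesson's exact code may differ):

from tensorflow.keras import Input
from tensorflow.keras.layers import Conv2D, GlobalAveragePooling2D

inputs = Input(shape=(256, 256, 1))  # hypothetical input shape
x = Conv2D(8, kernel_size=3, activation="relu")(inputs)

# Global average pooling averages each feature map down to a single value,
# reducing (height, width, channels) to (channels,)
x = GlobalAveragePooling2D()(x)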

Commits referencing this issue (pushed May 21, 2025):

f7d0ad6 (main): tidying. avoid reuse of variable name. ref #21.
912f92b (main): More generous timing. Ref #21

@tompollard
Copy link
Collaborator

> link to data downloads a corrupted file (but you already know that #8)

This was fixed in #32, though I have kept an issue open (#31) because I think there is a better solution than the one implemented.

> Timings should be more generous -- although the amount of code is relatively small and problem-free, we had lots of questions from the audience from the very start. On average, I think each chapter took about one hour, with the exception of the very first one for which 30 minutes was adequate;

Good point, I think I'd just left the default times in the template. I have updated the sections to allow for more time (now mostly 40min teaching, 20min exercises).

> Reusing variable names should be avoided, particularly when transforming the structure or shape of the dataset. An eagle-eyed helper caught a particularly problematic case: in episode 4, we add a new dimension to the dataset
> dataset = dataset[..., np.newaxis]
> labels = labels[..., np.newaxis]
> If that cell needs to be re-run for any reason, the resulting dataset will gain more and more dimensions, which might not be picked up until much later in the lesson (in our case, during model training).

I've been caught by this before! Fixed in f7d0ad6.
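
For readers of the thread, the safe pattern is to assign the result to a new name so the cell can be re-run without side effects (a sketch, not the exact committed code):

# dataset and labels as loaded earlier in the lesson
dataset_expanded = dataset[..., np.newaxis]
labels_expanded = labels[..., np.newaxis]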

Commits referencing this issue (pushed May 21, 2025):

5fa007b (main): Add explanation for architecture. Ref #21.
8575443 (main): Add guidance on where models can be tweaked to improve or alter their performance. Ref #21.

@tompollard (Collaborator)

@tiagosousagarcia Thanks again for this feedback - extremely helpful! I think I have addressed most of your points. If you have an opportunity, please could you take a look at the updated materials and let me know if further work is needed?
