dataset -- The main folder, either the Training or the Testing folder (default = Training)
+
+    =================================================
+    Example:
+    train_folders = eda.folders(train_dataset)
+    test_folders = eda.folders(test_dataset)
+    '''
+    #sets the path to the Testing and training folders
     path = f'../Images/{dataset}'
-    images_dir = [x for x in os.listdir(path)]
+    #lists the classification folders within the dataset
     folders = os.listdir(path)
     return folders
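
Review note: `os.listdir` returns every entry in the directory, so a stray file next to the class folders would be reported as a class. A minimal sketch of a directory-only variant follows. The name `list_class_folders` and the `root` parameter are hypothetical and not part of this commit; the demonstration builds a throwaway tree instead of touching `../Images`.

```python
import os
import tempfile


def list_class_folders(root, dataset="Training"):
    """Return the sorted names of the sub-directories of root/dataset."""
    path = os.path.join(root, dataset)
    # Keep directories only, so stray files are not treated as classes.
    return sorted(
        name
        for name in os.listdir(path)
        if os.path.isdir(os.path.join(path, name))
    )


# Tiny demonstration on a temporary directory tree (hypothetical layout).
with tempfile.TemporaryDirectory() as root:
    for name in ("glioma", "notumor"):
        os.makedirs(os.path.join(root, "Training", name))
    # A stray file that should not be reported as a class folder.
    open(os.path.join(root, "Training", "notes.txt"), "w").close()
    demo_folders = list_class_folders(root)
```

Sorting is a design choice here: `os.listdir` makes no promise about order, and a stable order keeps class lists comparable between the training and testing sets.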
-def image_len(dataset, folders):
+
+'''
+
+The concept of this code was learned from General Assembly's Data Science Immersive's Excel lab exercise during the 2023-03 to 2023-06 cohort.
+The concepts have been adapted to identify file paths and ways to work with the data
+'''
+def image_len(folders, dataset='Training'):
     '''
-    This code takes in the list directory of the folder containing the classification folders. And the dataset.
-    this code was heavily inspired by this project: https://github.com/DerikVo/DSI_project_4_plant_disease/blob/main/notebooks/01_Potato_PlantVillageEDA.ipynb
-    Has since been adapted to work with a jupyter notebook
-    TODO:convert all image eda into a class/method script
+
+    Lists the subfolders within the main folder containing the classification folders for each image set, and shows a random image from the classification
#shows the image for the classification for reference
     plt.title(f'{image_name}')
     plt.imshow(image)
     plt.axis('off')
     plt.show()
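
Review note: the part of `image_len` that counts images and picks a random one per class is not visible in this hunk, so the following is only a sketch of how that step could look, not the author's implementation. The helper name `sample_images`, the `root` and `seed` parameters, and the extension list are all assumptions made for illustration; plotting is left out so the example runs without matplotlib.

```python
import os
import random
import tempfile

# Assumed set of image extensions; the real dataset may use others.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def sample_images(root, folders, dataset="Training", seed=None):
    """Map each class folder to (image count, one randomly chosen file name)."""
    rng = random.Random(seed)
    summary = {}
    for folder in folders:
        class_dir = os.path.join(root, dataset, folder)
        files = sorted(
            f for f in os.listdir(class_dir)
            if f.lower().endswith(IMAGE_EXTENSIONS)
        )
        # An empty class folder yields None instead of raising IndexError.
        pick = rng.choice(files) if files else None
        summary[folder] = (len(files), pick)
    return summary


# Demonstration on a temporary tree with empty placeholder files.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "Training", "glioma"))
    os.makedirs(os.path.join(root, "Training", "notumor"))
    for name in ("a.jpg", "b.png", "notes.txt"):
        open(os.path.join(root, "Training", "glioma", name), "w").close()
    summary = sample_images(root, ["glioma", "notumor"], seed=0)
```

Passing a `seed` makes the picked image reproducible, which helps when the same notebook cell is rerun for a report.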
-class Image:
-    def __init__(self, dataset, sub_folder):
-        #learned i didnt need a comma because that creates a tuple: https://stackoverflow.com/questions/39192261/class-init-takes-parameters-but-turns-them-into-tuples-for-some-reason
-        self.dataset = dataset
-        self.sub_folder = sub_folder
-
-    def avg_images(self):
-        '''
-        This function takes two arguments the dataset: training or testing, and the sub_folder for the type of tumor e.g. ['glioma', 'meningioma', 'notumor', 'pituitary']
-        This function is used to find the average pixel values of each class
-        The purpose is to find if there is a difference in each class
-        '''
-        #assign the path in the function for readability and understanding
-        #assign the sub folder (class name) that was passed to the function
-        path = (f'../Images/{self.dataset}')
-        class_name = self.sub_folder
-        batch_size = 32 # Modify this to suit your needs
-        #instantiate ImageDataGenerator
-        datagen = ImageDataGenerator(rescale=1./255) # normalize pixel values to [0,1]
-        #get the images from the directory
-        generator = datagen.flow_from_directory(path,
-                                                classes=[class_name],
-                                                class_mode=None,
-                                                color_mode='grayscale',
-                                                target_size=(256, 256),
-                                                batch_size=batch_size)
-        n_samples = generator.samples
-        average_image = np.zeros((256, 256, 1))
+'''
+This portion uses code from a previous project from this [notebook](https://github.com/DerikVo/DSI_project_4_plant_disease/blob/main/notebooks/01_Potato_PlantVillageEDA.ipynb).
+The code was originally developed by chat GPT 4 with the prompt: "I have an image data set that I want to do EDA on. How can I average out the pixel values of all the images in a class. python keras."

-        for i in range(n_samples//batch_size): # Integer division to avoid partial batches
-            images = next(generator)
-            average_image += np.sum(images, axis=0)
+This function takes two arguments: the dataset (training or testing) and the sub_folder for the type of tumor, e.g. ['glioma', 'meningioma', 'notumor', 'pituitary']
+This function is used to find the average pixel values of each class
+The purpose is to find if there is a difference in each class

-        average_image /= n_samples
-        return average_image
+'''
+def avg_images(folders, dataset='Training'):
+    '''
+
+    This function is used to find the average pixel value of each class

+    Users will need to assign the images to a variable.
folders -- The sub folder containing the classification for tumor type ( 'glioma', 'meningioma', 'notumor', 'pituitary' )
+    dataset -- The main folder, either the Training or the Testing folder (default = Training)
+    '''
+    #sets the path to the Testing and training folders
+    path = (f'../Images/{dataset}')
+
+    class_name = folders
+    batch_size = 32 # Modify this to suit your needs
+    #instantiate ImageDataGenerator
+    datagen = ImageDataGenerator(rescale=1./255) # normalize pixel values to [0,1]
+    #get the images from the directory
+    generator = datagen.flow_from_directory(path,
+                                            classes=[class_name],
+                                            class_mode=None,
+                                            color_mode='grayscale',
+                                            target_size=(256, 256),
+                                            batch_size=batch_size)
+    n_samples = generator.samples
+    average_image = np.zeros((256, 256, 1))
+
+    for i in range(n_samples//batch_size): # Integer division to avoid partial batches
+        images = next(generator)
+        average_image += np.sum(images, axis=0)
+
+    average_image /= n_samples
+    return average_image
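
Review note: the loop above sums only `n_samples // batch_size` full batches but divides by `n_samples`, so whenever `n_samples` is not a multiple of `batch_size` the images in the final partial batch are skipped and the average comes out slightly too dark. A minimal sketch of one way around this, dividing by the number of images actually summed, is shown below. `mean_image` is a hypothetical helper that accepts any iterable of arrays shaped `(batch, height, width, channels)`; it is demonstrated on plain NumPy arrays, and using it with the Keras generator from this commit is an assumption that has not been verified here.

```python
import math

import numpy as np


def mean_image(batches, n_samples, batch_size, shape=(256, 256, 1)):
    """Average all images from `batches`, including the last partial batch."""
    total = np.zeros(shape, dtype=np.float64)
    seen = 0
    # ceil() covers the final, smaller batch that integer division would skip.
    n_batches = math.ceil(n_samples / batch_size)
    # zip() stops after n_batches, so an endless generator is not over-read.
    for _, images in zip(range(n_batches), batches):
        total += images.sum(axis=0)
        seen += images.shape[0]
    if seen == 0:
        raise ValueError("no images were read")
    # Divide by the count actually summed rather than a separately tracked total.
    return total / seen


# Demonstration: 5 tiny "images" in batches of 2, 2 and 1.
data = np.arange(5 * 2 * 2 * 1, dtype=np.float64).reshape(5, 2, 2, 1)
batches = iter([data[0:2], data[2:4], data[4:5]])
result = mean_image(batches, n_samples=5, batch_size=2, shape=(2, 2, 1))
expected = data.mean(axis=0)
```

In the demonstration, `result` matches `data.mean(axis=0)`, whereas summing only the two full batches and dividing by 5 would not.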
+
+'''
+This portion uses code from a previous project from this [notebook](https://github.com/DerikVo/DSI_project_4_plant_disease/blob/main/notebooks/01_Potato_PlantVillageEDA.ipynb). The concept was originally developed by [Yasser Siddiqui]([email protected]) and has been adapted to use with this notebook.
+
+This function is used to find the differences of the average pixel value between each class 'glioma', 'meningioma', and 'pituitary' compared to 'notumor'.
+These different characteristics can help us understand how the classes are unique when compared to not having a tumor.
+If there are significant differences we can better interpret our model.
+'''
 def image_contrast(comparision, base_image):
+    '''
+    This function finds the differences between the pixel averages of two classes to identify how the model can differentiate classes
+    Users will need to have run the avg_images function for each class before running the image_contrast function.
+
+    Users will need to assign the images to a variable.
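
Review note: the body of `image_contrast` is not shown in this diff, so the following is a sketch under the assumption that "difference" means a signed per-pixel subtraction of the base class average from the comparison class average. The function name `contrast` and the tiny 2x2 arrays are hypothetical stand-ins for the outputs of `avg_images`.

```python
import numpy as np


def contrast(comparison, base_image):
    """Return the per-pixel difference comparison - base_image."""
    comparison = np.asarray(comparison, dtype=np.float64)
    base_image = np.asarray(base_image, dtype=np.float64)
    if comparison.shape != base_image.shape:
        raise ValueError("both average images must have the same shape")
    # Positive values: brighter on average in the comparison class.
    # Negative values: brighter on average in the base class.
    return comparison - base_image


# Stand-ins for two class averages (values already scaled to [0, 1]).
base = np.full((2, 2, 1), 0.25)
tumor = base.copy()
tumor[0, 0, 0] = 0.75
diff = contrast(tumor, base)
```

Keeping the sign, rather than taking the absolute value, preserves which class is brighter at each pixel; `np.abs(diff)` can be applied afterwards if only the magnitude matters.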