@@ -28,6 +28,7 @@ tensorflow 2's ``tf.data.Dataset``.
 It provides a simple solution to oversampling / stratification, weighted
 sampling, and finally converting to a ``torch.utils.data.DataLoader``.
 
+
 Install
 =======
 
@@ -41,6 +42,7 @@ Or, for the old-timers:
 
     pip install pytorch-datastream
 
+
 Usage
 =====
 
@@ -72,6 +74,45 @@ a more extensive list on API and usage.
         .state_dict
         .load_state_dict
 
+
+Simple image dataset example
+----------------------------
+Here's a basic example of loading images from a directory:
+
+.. code-block:: python
+
+    from datastream import Dataset
+    from pathlib import Path
+    from PIL import Image
+
+    # Assuming images are in a directory structure like:
+    # images/
+    #   class1/
+    #     image1.jpg
+    #     image2.jpg
+    #   class2/
+    #     image3.jpg
+    #     image4.jpg
+
+    image_dir = Path("images")
+    image_paths = list(image_dir.glob("**/*.jpg"))
+
+    dataset = (
+        Dataset.from_paths(image_paths, pattern=r".*/(?P<class_name>\w+)/(?P<image_name>\w+).jpg")
+        .map(lambda row: dict(
+            image=Image.open(row["path"]),
+            class_name=row["class_name"],
+            image_name=row["image_name"],
+        ))
+    )
+
+    # Access an item from the dataset
+    first_item = dataset[0]
+    print(f"Class: {first_item['class_name']}, Image name: {first_item['image_name']}")
+
+This example demonstrates how to create a dataset from image files, extracting class names and image names from the file paths, and loading the images when accessed.
+
+
 Merge / stratify / oversample datastreams
 -----------------------------------------
 The fruit datastreams given below repeatedly yields the string of its fruit
@@ -87,6 +128,7 @@
     >>> next(iter(datastream.data_loader(batch_size=8)))
     ['apple', 'apple', 'pear', 'banana', 'apple', 'apple', 'pear', 'banana']
 
+
 Zip independently sampled datastreams
 -------------------------------------
 The fruit datastreams given below repeatedly yields the string of its fruit
@@ -101,12 +143,8 @@ type.
     >>> next(iter(datastream.data_loader(batch_size=4)))
     [('apple', 'pear'), ('apple', 'banana'), ('apple', 'pear'), ('apple', 'banana')]
 
+
 More usage examples
 -------------------
 See the `documentation <https://pytorch-datastream.readthedocs.io/en/latest/>`_
 for more usage examples.
-
-Install from source
-===================
-
-.. pip install -e .
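
Note on the ``pattern`` argument in the added "Simple image dataset example": the regex relies on named groups to pull ``class_name`` and ``image_name`` out of each path. The sketch below reproduces only that extraction step with the standard library, so it can be checked without ``datastream`` or any image files. The helper ``parse_image_path`` is hypothetical, and using ``fullmatch`` is an assumption; how ``Dataset.from_paths`` applies the pattern internally is not shown in this diff.

```python
import re

# Same pattern string as in the README example.
PATTERN = re.compile(r".*/(?P<class_name>\w+)/(?P<image_name>\w+).jpg")


def parse_image_path(path):
    """Hypothetical helper: return the named groups as a dict, or None on no match."""
    match = PATTERN.fullmatch(str(path))
    return match.groupdict() if match else None


# A path with a root directory and a class directory matches.
print(parse_image_path("images/class1/image1.jpg"))

# A file without two parent directories does not match, because the
# pattern requires ".../<class_name>/<image_name>.jpg".
print(parse_image_path("image1.jpg"))
```

Under these assumptions, files that sit directly in the root directory, or whose directory or file names contain characters outside ``\w`` (for example a hyphen), would not produce a ``class_name``, which may be worth stating in the README.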