Commit 2ab4910

V0.0
0 parents  commit 2ab4910

30 files changed (+6609, -0 lines)

.DS_Store

6 KB
Binary file not shown.

README.md

Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
# Pytorch-UNet-Flood-Segmentation

This repository contains a PyTorch implementation of a U-Net model for segmenting water areas (flood and permanent water) in Sentinel-1 satellite images. The U-Net architecture is a popular choice for image segmentation tasks, particularly for biomedical and satellite imagery.

The model was trained and evaluated on the Cloud to Street - Microsoft flood dataset, which consists of Sentinel-1 image chips with corresponding water labels.

### Repository Structure
The repository is structured as follows:

* data/: contains the datasets for model training
* models/: contains the trained U-Net models
* predictions/: contains the predictions of the trained U-Net models
* runs/: contains the TensorBoard log files
* make-dataset.py: script that extracts the Sentinel-1 VV, VH, and water label chips and tiles them into 256 × 256 pixel tiles
* gee-dem-data.py: script that downloads the DEM tiles (NASADEM, a reprocessing of SRTM) from Google Earth Engine
* gee-pwater-data.py: script that downloads the JRC permanent water tiles from Google Earth Engine
* helpers.py: helper functions for gee-dem-data.py, gee-pwater-data.py, and make-dataset.py
* make-features.ipynb: Jupyter notebook that arranges the data before model training
* models.py: the U-Net model
* model-train.ipynb: Jupyter notebook that trains the models
* model-inference.ipynb: Jupyter notebook that runs inference with the trained models on the test set
* requirements.txt: lists the Python packages required to run the scripts

### Preparation of data
#### Step 1
* Download the Sentinel-1 data (*c2smsfloods_v1_source_s1.tar.gz*) and the Sentinel-1 water labels (*c2smsfloods_v1_labels_s1_water.tar.gz*) of the Cloud to Street - Microsoft flood dataset from https://mlhub.earth/data/c2smsfloods_v1.

#### Step 2
* Run the following script:
```bash
python make-dataset.py --proj_dir path/to/project-directory --chips path/to/c2smsfloods_v1_source_s1.tar.gz --labels path/to/c2smsfloods_v1_labels_s1_water.tar.gz
```
* The make-dataset.py script creates the data, data/chips/VV, data/chips/VH, and data/labels directories. (All tiles are 256 × 256 pixels at a resolution of 10 m.)

* A Google Earth Engine cloud project ID is needed to run the scripts below.
```bash
python gee-dem-data.py --cld_projid ee-xxxxxxx --in_dir data/chips/VV --out_dir data
```
```bash
python gee-pwater-data.py --cld_projid ee-xxxxxxx --in_dir data/chips/VV --out_dir data
```
* The gee-dem-data.py and gee-pwater-data.py scripts create the data/dem and data/pwater directories, respectively.

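After these steps, every VV chip should have a matching file in each of the other directories. The check below is a minimal sketch, assuming that matching tiles share the same file name across `data/chips/VV`, `data/chips/VH`, `data/labels`, `data/dem`, and `data/pwater` (as the paths in `data/chip_test.csv` suggest). `find_missing_tiles` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

def find_missing_tiles(data_dir, reference="chips/VV",
                       others=("chips/VH", "labels", "dem", "pwater")):
    """Return {subdir: [tile names found in `reference` but absent from subdir]}."""
    data_dir = Path(data_dir)
    # Tile names in the reference directory define what every other directory should hold.
    names = sorted(p.name for p in (data_dir / reference).glob("*.tif"))
    missing = {}
    for sub in others:
        absent = [n for n in names if not (data_dir / sub / n).is_file()]
        if absent:
            missing[sub] = absent
    return missing
```

Calling `find_missing_tiles("data")` from the project directory returns an empty dictionary when every directory is complete.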
### Model information
The dataset was first divided into two parts based on the water percentage of the Sentinel-1 water label chips (>=30% and <30%). A separate U-Net model was then trained on each part, and the final prediction was obtained by combining the predictions of both models.

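The split by water percentage can be sketched with the standard `csv` module. This is an illustration only, assuming a chip table with a `water_percent` column like the one in `data/chip_test.csv`; `split_by_water_percent` is a hypothetical name, and the notebooks in the repository may perform the split differently.

```python
import csv
import io

def split_by_water_percent(rows, threshold=30.0):
    """Split chip records into (high, low): water_percent >= threshold and < threshold."""
    high, low = [], []
    for row in rows:
        (high if float(row["water_percent"]) >= threshold else low).append(row)
    return high, low

# Two rows taken from data/chip_test.csv, reduced to the columns used here.
sample = io.StringIO(
    "label,water_percent\n"
    "data\\labels\\172_10.tif,29.30755615\n"
    "data\\labels\\317_10.tif,30.6060791\n"
)
high, low = split_by_water_percent(csv.DictReader(sample))
print(len(high), len(low))  # 1 1
```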
The combined prediction achieves an average Intersection over Union (IoU) of 0.877 on the test set.

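For reference, the IoU of a binary water mask is the number of pixels marked as water in both the prediction and the label, divided by the number marked as water in either. A minimal pure-Python sketch follows; it is not the repository's evaluation code, and scoring two empty masks as 1.0 is an assumption made here.

```python
def iou(pred, label):
    """Intersection over Union for two binary masks given as nested lists of 0/1."""
    intersection = union = 0
    for pred_row, label_row in zip(pred, label):
        for p, t in zip(pred_row, label_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumption: two empty masks count as a perfect match.
    return intersection / union if union else 1.0

pred = [[1, 1, 0],
        [0, 1, 0]]
label = [[1, 0, 0],
         [0, 1, 1]]
print(iou(pred, label))  # 2 shared water pixels / 4 water pixels in either mask = 0.5
```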
<p align="center">
<img src="predictions/preds.gif" width="500" height="250" />
</p>

#### Acknowledgments
The implementation is based on the original U-Net paper: https://arxiv.org/abs/1505.04597.

This is a basic U-Net segmentation workflow, and the models still need further improvement.

data/chip_info.csv

Lines changed: 3601 additions & 0 deletions
Large diffs are not rendered by default.

data/chip_set1.csv

Lines changed: 446 additions & 0 deletions
Large diffs are not rendered by default.

data/chip_set2.csv

Lines changed: 1132 additions & 0 deletions
Large diffs are not rendered by default.

data/chip_test.csv

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
vv,vh,dem,pwater,label,water_percent
data\chips\VV\542_10.tif,data\chips\VH\542_10.tif,data\dem\542_10.tif,data\pwater\542_10.tif,data\labels\542_10.tif,1.271057129
data\chips\VV\767_10.tif,data\chips\VH\767_10.tif,data\dem\767_10.tif,data\pwater\767_10.tif,data\labels\767_10.tif,2.137756348
data\chips\VV\212_10.tif,data\chips\VH\212_10.tif,data\dem\212_10.tif,data\pwater\212_10.tif,data\labels\212_10.tif,6.518554688
data\chips\VV\628_11.tif,data\chips\VH\628_11.tif,data\dem\628_11.tif,data\pwater\628_11.tif,data\labels\628_11.tif,6.608581543
data\chips\VV\393_00.tif,data\chips\VH\393_00.tif,data\dem\393_00.tif,data\pwater\393_00.tif,data\labels\393_00.tif,10.46600342
data\chips\VV\255_11.tif,data\chips\VH\255_11.tif,data\dem\255_11.tif,data\pwater\255_11.tif,data\labels\255_11.tif,12.84484863
data\chips\VV\023_11.tif,data\chips\VH\023_11.tif,data\dem\023_11.tif,data\pwater\023_11.tif,data\labels\023_11.tif,15.48614502
data\chips\VV\050_01.tif,data\chips\VH\050_01.tif,data\dem\050_01.tif,data\pwater\050_01.tif,data\labels\050_01.tif,18.87969971
data\chips\VV\586_01.tif,data\chips\VH\586_01.tif,data\dem\586_01.tif,data\pwater\586_01.tif,data\labels\586_01.tif,19.18792725
data\chips\VV\172_10.tif,data\chips\VH\172_10.tif,data\dem\172_10.tif,data\pwater\172_10.tif,data\labels\172_10.tif,29.30755615
data\chips\VV\317_10.tif,data\chips\VH\317_10.tif,data\dem\317_10.tif,data\pwater\317_10.tif,data\labels\317_10.tif,30.6060791
data\chips\VV\216_00.tif,data\chips\VH\216_00.tif,data\dem\216_00.tif,data\pwater\216_00.tif,data\labels\216_00.tif,30.63812256
data\chips\VV\003_10.tif,data\chips\VH\003_10.tif,data\dem\003_10.tif,data\pwater\003_10.tif,data\labels\003_10.tif,38.21105957
data\chips\VV\049_10.tif,data\chips\VH\049_10.tif,data\dem\049_10.tif,data\pwater\049_10.tif,data\labels\049_10.tif,45.8984375
data\chips\VV\733_10.tif,data\chips\VH\733_10.tif,data\dem\733_10.tif,data\pwater\733_10.tif,data\labels\733_10.tif,46.51947021
data\chips\VV\045_01.tif,data\chips\VH\045_01.tif,data\dem\045_01.tif,data\pwater\045_01.tif,data\labels\045_01.tif,48.24523926
data\chips\VV\622_10.tif,data\chips\VH\622_10.tif,data\dem\622_10.tif,data\pwater\622_10.tif,data\labels\622_10.tif,49.96948242
data\chips\VV\668_10.tif,data\chips\VH\668_10.tif,data\dem\668_10.tif,data\pwater\668_10.tif,data\labels\668_10.tif,51.18560791
data\chips\VV\776_10.tif,data\chips\VH\776_10.tif,data\dem\776_10.tif,data\pwater\776_10.tif,data\labels\776_10.tif,53.50341797
data\chips\VV\100_00.tif,data\chips\VH\100_00.tif,data\dem\100_00.tif,data\pwater\100_00.tif,data\labels\100_00.tif,54.81262207
data\chips\VV\391_11.tif,data\chips\VH\391_11.tif,data\dem\391_11.tif,data\pwater\391_11.tif,data\labels\391_11.tif,62.42370605
data\chips\VV\161_01.tif,data\chips\VH\161_01.tif,data\dem\161_01.tif,data\pwater\161_01.tif,data\labels\161_01.tif,63.17443848
data\chips\VV\380_10.tif,data\chips\VH\380_10.tif,data\dem\380_10.tif,data\pwater\380_10.tif,data\labels\380_10.tif,65.66467285
data\chips\VV\016_00.tif,data\chips\VH\016_00.tif,data\dem\016_00.tif,data\pwater\016_00.tif,data\labels\016_00.tif,67.06390381
data\chips\VV\019_00.tif,data\chips\VH\019_00.tif,data\dem\019_00.tif,data\pwater\019_00.tif,data\labels\019_00.tif,67.14782715
data\chips\VV\634_11.tif,data\chips\VH\634_11.tif,data\dem\634_11.tif,data\pwater\634_11.tif,data\labels\634_11.tif,67.67120361
data\chips\VV\795_01.tif,data\chips\VH\795_01.tif,data\dem\795_01.tif,data\pwater\795_01.tif,data\labels\795_01.tif,70.97473145
data\chips\VV\161_11.tif,data\chips\VH\161_11.tif,data\dem\161_11.tif,data\pwater\161_11.tif,data\labels\161_11.tif,73.01330566
data\chips\VV\822_00.tif,data\chips\VH\822_00.tif,data\dem\822_00.tif,data\pwater\822_00.tif,data\labels\822_00.tif,81.56280518
data\chips\VV\126_01.tif,data\chips\VH\126_01.tif,data\dem\126_01.tif,data\pwater\126_01.tif,data\labels\126_01.tif,81.84356689
data\chips\VV\820_11.tif,data\chips\VH\820_11.tif,data\dem\820_11.tif,data\pwater\820_11.tif,data\labels\820_11.tif,82.4005127
data\chips\VV\832_10.tif,data\chips\VH\832_10.tif,data\dem\832_10.tif,data\pwater\832_10.tif,data\labels\832_10.tif,85.61706543
data\chips\VV\165_11.tif,data\chips\VH\165_11.tif,data\dem\165_11.tif,data\pwater\165_11.tif,data\labels\165_11.tif,88.02947998
data\chips\VV\621_11.tif,data\chips\VH\621_11.tif,data\dem\621_11.tif,data\pwater\621_11.tif,data\labels\621_11.tif,91.5222168
data\chips\VV\219_10.tif,data\chips\VH\219_10.tif,data\dem\219_10.tif,data\pwater\219_10.tif,data\labels\219_10.tif,92.97943115
data\chips\VV\824_11.tif,data\chips\VH\824_11.tif,data\dem\824_11.tif,data\pwater\824_11.tif,data\labels\824_11.tif,96.04492188
data\chips\VV\171_00.tif,data\chips\VH\171_00.tif,data\dem\171_00.tif,data\pwater\171_00.tif,data\labels\171_00.tif,96.85516357
data\chips\VV\657_01.tif,data\chips\VH\657_01.tif,data\dem\657_01.tif,data\pwater\657_01.tif,data\labels\657_01.tif,99.83520508
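The paths in this file use Windows separators (`\`), so they will not resolve as written on Linux or macOS. One way to handle this when reading the table is sketched below using only the standard library. `to_native_path` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path, PureWindowsPath

def to_native_path(win_path, root="."):
    """Convert a backslash-separated relative path into a path for the current OS."""
    # PureWindowsPath splits on backslashes regardless of the platform running the code.
    return Path(root).joinpath(*PureWindowsPath(win_path).parts)

print(to_native_path(r"data\chips\VV\542_10.tif").as_posix())
# data/chips/VV/542_10.tif
```

The optional `root` argument lets the table be used from outside the project directory.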

gee-dem-data.py

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
import argparse
import os
import glob
import concurrent.futures
from google.auth.transport.requests import AuthorizedSession
import ee
import restee as ree
from tqdm import tqdm
from helpers import restgee_data

def main(args):

    # Create the DEM output directory
    os.makedirs(os.path.join(args.out_dir,'dem'), exist_ok=True)

    ee.Authenticate(auth_mode='notebook')
    session = AuthorizedSession(ee.data.get_persistent_credentials())

    # Creating a restee session
    class EESessionContainer(ree.EESession):
        def __init__(self, project, session):
            self._PROJECT = project
            self._SESSION = session

    # Create an EESession object with the correct permissions
    ee_session = EESessionContainer(args.cld_projid, session)

    # Authenticate EE with the session credentials
    ee.Initialize(ee_session.session.credentials, project=args.cld_projid)

    # NASADEM elevation data (a reprocessing of SRTM)
    elevation = ee.Image('NASA/NASADEM_HGT/001').select('elevation')

    # Chips of one of the two polarizations (VV or VH) define the tile extents
    chip_list = sorted(glob.glob(os.path.join(args.in_dir,'*.tif'), recursive = True))
    with tqdm(total=len(chip_list),position=0, leave=True, desc="GEE data request progress: SRTM DEM") as pbar:
        with concurrent.futures.ThreadPoolExecutor(max_workers=15) as executor:
            # Start the load operations and mark each future with its chip
            future_to_chip = {executor.submit(restgee_data,chip,elevation, 'elevation', os.path.join(args.out_dir,'dem'),ee_session): chip for chip in chip_list}
            for future in concurrent.futures.as_completed(future_to_chip):
                chip = future_to_chip[future]
                pbar.update(n=1)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Downloading corresponding SRTM DEM tile from GEE for a given raster chip')
    parser.add_argument(
        "--cld_projid",
        required=True,
        type=str,
        help="Cloud project id",
    )
    parser.add_argument(
        "--in_dir",
        type=str,
        required=True,
        help="Input tile directory",
    )
    parser.add_argument(
        "--out_dir",
        type=str,
        required=True,
        help="Output folder for corresponding dem tiles",
    )
    args = parser.parse_args()

    main(args)
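Note that the `as_completed` loop in this script only advances the progress bar. Because `future.result()` is never called, an exception raised inside a worker would go unnoticed. The following is a minimal sketch of collecting such failures with the same `concurrent.futures` pattern. `fetch` is a hypothetical stand-in for `restgee_data`, and `run_all` is not part of the repository.

```python
import concurrent.futures

def fetch(chip):
    """Stand-in for a download task; fails for one chip to show the error handling."""
    if chip == "bad.tif":
        raise ValueError(f"could not fetch {chip}")
    return chip

def run_all(chips, worker, max_workers=4):
    """Run worker over chips in a thread pool and return {chip: exception} for failures."""
    failures = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_chip = {executor.submit(worker, chip): chip for chip in chips}
        for future in concurrent.futures.as_completed(future_to_chip):
            chip = future_to_chip[future]
            try:
                future.result()  # re-raises any exception from the worker
            except Exception as exc:
                failures[chip] = exc
    return failures

failures = run_all(["a.tif", "bad.tif", "b.tif"], fetch)
print(sorted(failures))  # ['bad.tif']
```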

gee-pwater-data.py

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
import argparse
import os
import glob
import concurrent.futures
from google.auth.transport.requests import AuthorizedSession
import ee
import restee as ree
from tqdm import tqdm
from helpers import restgee_data

def main(args):

    # Create the permanent water output directory
    os.makedirs(os.path.join(args.out_dir,'pwater'), exist_ok=True)

    ee.Authenticate(auth_mode='notebook')
    session = AuthorizedSession(ee.data.get_persistent_credentials())

    # Creating a restee session
    class EESessionContainer(ree.EESession):
        def __init__(self, project, session):
            self._PROJECT = project
            self._SESSION = session

    # Create an EESession object with the correct permissions
    ee_session = EESessionContainer(args.cld_projid, session)

    # Authenticate EE with the session credentials
    ee.Initialize(ee_session.session.credentials, project=args.cld_projid)

    # JRC water data
    occurrence = ee.Image('JRC/GSW1_4/GlobalSurfaceWater').select('occurrence')

    # Chips of one of the two polarizations (VV or VH) define the tile extents
    chip_list = sorted(glob.glob(os.path.join(args.in_dir,'*.tif'), recursive = True))
    with tqdm(total=len(chip_list),position=0, leave=True, desc="GEE data request progress: JRC Water") as pbar:
        with concurrent.futures.ThreadPoolExecutor(max_workers=15) as executor:
            # Start the load operations and mark each future with its chip
            future_to_chip = {executor.submit(restgee_data,chip,occurrence, 'occurrence', os.path.join(args.out_dir,'pwater'),ee_session): chip for chip in chip_list}
            for future in concurrent.futures.as_completed(future_to_chip):
                chip = future_to_chip[future]
                pbar.update(n=1)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Downloading corresponding JRC permanent water tile from GEE for a given raster chip')
    parser.add_argument(
        "--cld_projid",
        required=True,
        type=str,
        help="Cloud project id",
    )
    parser.add_argument(
        "--in_dir",
        type=str,
        required=True,
        help="Input tile directory",
    )
    parser.add_argument(
        "--out_dir",
        type=str,
        required=True,
        help="Output folder for corresponding permanent water tiles",
    )
    args = parser.parse_args()

    main(args)

helpers.py

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
import numpy as np
import os
import rasterio
from rasterio.warp import calculate_default_transform, reproject, Resampling,transform_bounds,transform_geom
from rasterio.transform import Affine
import restee as ree

def tile256(raster_path,output_dir):
    """
    Create raster tiles of 256 by 256
    Arguments:
        raster_path: input raster path
        output_dir: output folder path
    Returns:
        None; the 256 by 256 raster tiles are written to output_dir
    """
    # Create the output directory if it does not exist
    os.makedirs(output_dir, exist_ok=True)

    with rasterio.open(raster_path) as src:
        width = src.width
        height = src.height
        res = src.transform[0]
        UL_c,UL_r = src.transform * (0, 0)
        band = src.read(1)
        kwargs = src.meta.copy()

        # np.round replaces np.round_, which was removed in NumPy 2.0
        for col in range(int(np.round(width/256))):
            for row in range(int(np.round(height/256))):
                tf_tile = Affine(res, 0.0,UL_c+(256*res*col),0.0,-1*(res),UL_r+(256*(-1*(res))*row))
                # The chip id is the numeric suffix of the parent directory name
                chip_id = int(raster_path.split(os.sep)[-2].split('_')[-1])
                tile_name = f'{chip_id:03d}_{row:01d}{col:01d}.tif'
                # Edge tiles smaller than 256 by 256 are zero-padded
                np_arr = np.zeros((256, 256), dtype='float32')
                tile_band = band[256*row:256*row+256,256*col:256*col+256]
                np_arr[:tile_band.shape[0], :tile_band.shape[1]] = tile_band

                kwargs.update({
                    'transform': tf_tile,
                    'width': 256,
                    'height': 256,
                    'compress': 'lzw',
                    'dtype':'float32',})

                with rasterio.open(os.path.join(output_dir,tile_name), "w", **kwargs) as dst:
                    dst.write(np_arr,1)

                np_arr = None
                tile_band = None
        band = None

def restgee_data(input_tile,ee_img,ee_img_band,output_dir,restee_session):
    """
    Download GEE objects corresponding to a raster tile extent using the EE REST API (restee package)
    Arguments:
        input_tile: Input calibrated raster tile
        ee_img: GEE object
        ee_img_band: GEE object band name
        output_dir: output directory path
        restee_session: restee session
    Returns:
        None; the GEE data is written to output_dir as a tile matching the calibrated raster tile
    """
    with rasterio.open(input_tile) as src:
        kwargs = src.meta.copy()
        tile_tranform = src.transform
        tile_bounds = src.bounds
        tile_crs = src.crs
        gee_domain = ree.Domain((tile_bounds[0],
                                 tile_bounds[1],
                                 tile_bounds[2],
                                 tile_bounds[3]),
                                resolution= src.transform[0],
                                crs =str(tile_crs))

        band_utm = np.int32(ree.img_to_ndarray(restee_session, gee_domain, image=ee_img, bands=ee_img_band))
        # Zero out pixels that are masked (no data) in the input tile
        gdal_mask = src.dataset_mask()
        gdal_mask = np.int32(np.where(gdal_mask == 0, gdal_mask, 1))
        band_utm = band_utm*gdal_mask

        dst_tranform = rasterio.transform.from_bounds(tile_bounds.left,
                                                      tile_bounds.bottom,
                                                      tile_bounds.right,
                                                      tile_bounds.top,
                                                      band_utm.shape[1],
                                                      band_utm.shape[0])
        out_file_path = os.path.join(output_dir,os.path.basename(input_tile))
        with rasterio.open(out_file_path, "w", **kwargs) as dst:
            reproject(
                source=band_utm,
                destination=rasterio.band(dst,1),
                src_transform=dst_tranform,
                src_crs=tile_crs,
                dst_transform=tile_tranform,
                dst_crs=tile_crs,
                resampling=Resampling.nearest)
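To make the naming scheme concrete: `tile256` names each tile `<chip id>_<row><col>.tif`, which is where names such as `542_10.tif` in `data/chip_test.csv` come from. The sketch below reproduces only the grid and naming logic in plain Python, without reading any raster. `tile_grid` is a hypothetical helper, and the 512 by 512 chip size in the example is an assumption.

```python
def tile_grid(chip_id, width, height, size=256):
    """Return (tile_name, row_offset, col_offset) for each tile, named as in tile256."""
    tiles = []
    # Same loop order as tile256: columns outside, rows inside.
    for col in range(round(width / size)):
        for row in range(round(height / size)):
            name = f"{chip_id:03d}_{row:01d}{col:01d}.tif"
            tiles.append((name, size * row, size * col))
    return tiles

for name, row_off, col_off in tile_grid(542, 512, 512):
    print(name, row_off, col_off)
# 542_00.tif 0 0
# 542_10.tif 256 0
# 542_01.tif 0 256
# 542_11.tif 256 256
```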
