diff --git a/README.md b/README.md index cab97f9..965ba9d 100644 --- a/README.md +++ b/README.md @@ -1,26 +1,27 @@ # CoastSat -[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.2779294.svg)](https://doi.org/10.5281/zenodo.2779294) - CoastSat is an open-source software toolkit written in Python that enables users to obtain time-series of shoreline position at any coastline worldwide from 30+ years (and growing) of publicly available satellite imagery. ![Alt text](https://github.com/kvos/CoastSat/blob/development/examples/doc/example.gif) -The underlying approach and application of the CoastSat toolkit are described in detail in: +The underlying approach of the CoastSat toolkit is described in detail in: + +* Vos K., Splinter K.D., Harley M.D., Simmons J.A., Turner I.L. (2019). CoastSat: a Google Earth Engine-enabled Python toolkit to extract shorelines from publicly available satellite imagery. *Environmental Modelling and Software*. 122, 104528. https://doi.org/10.1016/j.envsoft.2019.104528 -*Vos K., Splinter K.D., Harley M.D., Simmons J.A., Turner I.L. (submitted). CoastSat: a Google Earth Engine-enabled Python toolkit to extract shorelines from publicly available satellite imagery, Environmental Modelling and Software*. +Example applications and accuracy of the resulting satellite-derived shorelines are discussed in: +* Vos K., Harley M.D., Splinter K.D., Simmons J.A., Turner I.L. (2019). Sub-annual to multi-decadal shoreline variability from publicly available satellite imagery. *Coastal Engineering*. 150, 160–174. https://doi.org/10.1016/j.coastaleng.2019.04.004 + +### Description -There are three main steps: +Satellite remote sensing can provide low-cost long-term shoreline data capable of resolving the temporal scales of interest to coastal scientists and engineers at sites where no in-situ field measurements are available. CoastSat enables the non-expert user to extract shorelines from Landsat 5, Landsat 7, Landsat 8 and Sentinel-2 images. +The shoreline detection algorithm implemented in CoastSat is optimised for sandy beach coastlines. It combines a sub-pixel border segmentation and an image classification component, which refines the segmentation into four distinct categories such that the shoreline detection is specific to the sand/water interface. + +The toolbox has three main functionalities: - assisted retrieval from Google Earth Engine of all available satellite images spanning the user-defined region of interest and time period - automated extraction of shorelines from all the selected images using a sub-pixel resolution technique - intersection of the 2D shorelines with user-defined shore-normal transects - -### Description - -Satellite remote sensing can provide low-cost long-term shoreline data capable of resolving the temporal scales of interest to coastal scientists and engineers at sites where no in-situ field measurements are available. Satellite imagery spanning the last 30 years with constant revisit periods is publicly available and suitable to extract repeated measurements of the shoreline position. -CoastSat enables the non-expert user to extract shorelines from Landsat 5, Landsat 7, Landsat 8 and Sentinel-2 images. -The shoreline detection algorithm implemented in CoastSat is optimised for sandy beach coastlines. It combines a sub-pixel border segmentation and an image classification component, which refines the segmentation into four distinct categories such that the shoreline detection is specific to the sand/water interface.
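In practice, these three functionalities correspond to three top-level calls. The sketch below is a minimal, hedged illustration of that workflow: `SDS_download.retrieve_images` and `SDS_shoreline.extract_shorelines` are the functions used by this toolkit, while the `SDS_transects.compute_intersection` call, the polygon coordinates and the settings values are illustrative assumptions. See `example.py` for the authoritative usage.

```
# Hedged sketch of the three-step CoastSat workflow (all values are placeholders)
from coastsat import SDS_download, SDS_shoreline, SDS_transects

# 1. assisted retrieval of satellite images from Google Earth Engine
inputs = {'polygon': [[[151.301, -33.700], [151.311, -33.700],
                       [151.311, -33.710], [151.301, -33.710]]],  # ROI (lon, lat)
          'dates': ['2017-12-01', '2018-01-01'],                  # time period
          'sat_list': ['S2'],                                     # satellite missions
          'sitename': 'EXAMPLE',                                  # site folder name
          'filepath': 'data'}                                     # data folder
metadata = SDS_download.retrieve_images(inputs)

# 2. automated sub-pixel shoreline extraction (settings detailed in section 2.2)
settings = {'inputs': inputs, 'cloud_thresh': 0.5, 'output_epsg': 28356,
            'check_detection': True, 'save_figure': True,
            'min_beach_area': 4500, 'buffer_size': 150, 'min_length_sl': 200,
            'cloud_mask_issue': False, 'sand_color': 'default'}
output = SDS_shoreline.extract_shorelines(metadata, settings)

# 3. intersection of the 2D shorelines with user-defined shore-normal transects,
#    where `transects` maps a name to a 2x2 array of transect end points
# cross_distance = SDS_transects.compute_intersection(output, transects, settings)
```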
+**If you like the repo, put a star on it!** ## 1. Installation @@ -44,21 +45,17 @@ conda activate coastsat To confirm that you have successfully activated CoastSat, your terminal command line prompt should now start with (coastsat). -**In case errors are raised:**: you should create a new environment and manually install the required packages, which are listed in the environment.yml file. The following [link](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) shows how to create and manage an environment with Anaconda. +**In case errors are raised:** open the **Anaconda Navigator**; in the *Environments* tab, click on *Import* and select the *environment.yml* file. For more details, the following [link](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) shows how to create and manage an environment with Anaconda. ### 1.2 Activate Google Earth Engine Python API -Go to https://earthengine.google.com and sign up to Google Earth Engine (GEE). - -![gee_capture](https://user-images.githubusercontent.com/7217258/49348457-a9271300-f6f9-11e8-8c0b-407383940e94.jpg) - -Once you have created a Google Earth Engine account, go back to the Anaconda prompt and link your GEE credentials to the Python API: +With the `coastsat` environment activated, run the following command on the Anaconda Prompt to link your environment to the GEE server: ``` earthengine authenticate ``` -A web browser will open, login with your GEE credential and accept the terms and conditions. Then copy the authorization code into the Anaconda terminal. +A web browser will open; log in with a Gmail account and accept the terms and conditions. Then copy the authorization code into the Anaconda terminal. Now you are ready to start using the CoastSat toolbox! @@ -78,9 +75,9 @@ The following sections guide the reader through the different functionalities of If using `example.py` on **Spyder**, make sure that the Graphics Backend is set to **Automatic** and not **Inline** (as this mode doesn't allow to interact with the figures). To change this setting go under Preferences>IPython console>Graphics. -To run a Jupyter Notebook, place your cursor inside one of the code sections and then click on the `run` button up in the top menu to run that section and progress forward (as shown in the animation below). +A Jupyter Notebook combines formatted text and code. To run the code, place your cursor inside one of the code sections, click on the `run cell` button (shown below) to execute that section, and progress forward. -![example_jupyter](https://user-images.githubusercontent.com/7217258/49705486-8dc88480-fc72-11e8-8300-c342baaf54eb.gif) +![run_cell](https://user-images.githubusercontent.com/7217258/60766570-c2100080-a0ee-11e9-9675-e2aeba87e4a7.png) ### 2.1 Retrieval of the satellite images @@ -104,24 +101,22 @@ The screenshot below shows an example of inputs that will retrieve all the image To map the shorelines, the following user-defined settings are needed: - `cloud_thresh`: threshold on maximum cloud cover that is acceptable on the images (value between 0 and 1 - this may require some initial experimentation). - `output_epsg`: epsg code defining the spatial reference system of the shoreline coordinates. It has to be a cartesian coordinate system (i.e. projected) and not a geographical coordinate system (in latitude and longitude angles).
See http://spatialreference.org/ to find the EPSG number corresponding to your local coordinate system. -- `check_detection`: if set to `True` the user can quality control each shoreline detection interactively. +- `check_detection`: if set to `True`, the user can quality-control each shoreline detection interactively (recommended when mapping shorelines for the first time). - `save_figure`: if set to `True` a figure of each mapped shoreline is saved (under *filepath/sitename/jpg_files/detection*). Note that this may slow down the process. - The setting `check_detection` is recommended when using the tool for the first time as it will show the user how CoastSat is mapping the shorelines. - -There are additional parameters (`min_beach_size`, `buffer_size`, `min_length_sl`, `cloud_mask_issue` and `dark sand`) that can be tuned to optimise the shoreline detection (for Advanced users only). For the moment leave these parameters set to their default values, we will see later how they can be modified. +There are additional parameters (`min_beach_size`, `buffer_size`, `min_length_sl`, `cloud_mask_issue` and `sand_color`) that can be tuned to optimise the shoreline detection (for advanced users only). For the moment, leave these parameters set to their default values; we will see later how they can be modified. An example of settings is provided here: -![doc2](https://user-images.githubusercontent.com/7217258/56278918-7a5e8600-614a-11e9-9184-77b69427b834.PNG) +![settings](https://user-images.githubusercontent.com/7217258/65950715-f68f2080-e481-11e9-80b6-19e13f2ec179.PNG) Once all the settings have been defined, the batch shoreline detection can be launched by calling: ``` output = SDS_shoreline.extract_shorelines(metadata, settings) ``` -When `check_detection` is set to `True`, a figure like the one below appears and asks the user to manually accept/reject each detection by pressing the `right arrow` (⇨) to `keep` the shoreline or `left arrow` (⇦) to `skip` the mapped shoreline. The user can break the loop at any time by pressing `escape` (nothing will be saved though). +When `check_detection` is set to `True`, a figure like the one below appears and asks the user to manually accept/reject each detection by pressing, **on the keyboard**, the `right arrow` (⇨) to `keep` the shoreline or the `left arrow` (⇦) to `skip` it. The user can break the loop at any time by pressing `escape` (nothing will be saved, though). -![Alt text](https://github.com/kvos/CoastSat/blob/development/examples/doc/batch_detection.gif) +![map_shorelines](https://user-images.githubusercontent.com/7217258/60766769-fafda480-a0f1-11e9-8f91-419d848ff98d.gif) Once all the shorelines have been mapped, the output is available in two different formats (saved under *.\data\sitename*): - `sitename_output.pkl`: contains a list with the shoreline coordinates, the exact timestamp at which the image was captured (UTC time), the geometric accuracy and the cloud cover of each individual image. This list can be manipulated with Python, a snippet of code to plot the results is provided in the example script. @@ -133,7 +128,7 @@ The figure below shows how the satellite-derived shorelines can be opened in a G #### Reference shoreline -There is also an option to manually digitize a reference shoreline before running the batch shoreline detection on all the images. This reference shoreline helps to reject outliers and false detections when mapping shorelines as it only considers as valid shorelines the points that are within a distance from this reference shoreline.
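As a companion to the two output formats described above, the snippet below sketches how `sitename_output.pkl` might be loaded and plotted. The dictionary key `'shorelines'` and the array layout are assumptions based on the description above; refer to the example script for the exact structure.

```
# Hedged sketch: load and plot the mapped shorelines (key name is an assumption)
import os
import pickle
import matplotlib.pyplot as plt

sitename = 'EXAMPLE'
with open(os.path.join('data', sitename, sitename + '_output.pkl'), 'rb') as f:
    output = pickle.load(f)

fig, ax = plt.subplots(figsize=(8, 8))
for sl in output['shorelines']:
    ax.plot(sl[:, 0], sl[:, 1], '.', markersize=3)  # one series per mapped image
ax.axis('equal')                                    # keep metres square on screen
ax.set(xlabel='Eastings [m]', ylabel='Northings [m]', title=sitename)
plt.show()
```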
+Before running the batch shoreline detection, there is the option to manually digitize a reference shoreline on one cloud-free image. This reference shoreline helps to reject outliers and false detections when mapping shorelines, as it only considers as valid shorelines the points that are within a defined distance from this reference shoreline. The user can manually digitize a reference shoreline on one of the images by calling: ``` @@ -142,9 +137,9 @@ settings['max_dist_ref'] = 100 # max distance (in meters) allowed from the refer ``` This function allows the user to click points along the shoreline on one of the satellite images, as shown in the animation below. -![ref_shoreline](https://user-images.githubusercontent.com/7217258/49710753-94b1c000-fc8f-11e8-9b6c-b5e96aadc5c9.gif) +![reference_shoreline](https://user-images.githubusercontent.com/7217258/60766913-6c8a2280-a0f3-11e9-89e5-865e11aa26cd.gif) -The maximum distance (in metres) allowed from the reference shoreline is defined by the parameter `max_dist_ref`. This parameter is set to a default value of 100 m. If you think that 100m buffer from the reference shoreline will not capture the shoreline variability at your site, increase the value of this parameter. This may be the case for large nourishments or eroding/accreting coastlines. +The maximum distance (in metres) allowed from the reference shoreline is defined by the parameter `max_dist_ref`. This parameter is set to a default value of 100 m. If you think that a 100 m buffer from the reference shoreline will not capture the shoreline variability at your site, increase the value of this parameter. This may be the case for large nourishments or eroding/accreting coastlines. #### Advanced shoreline detection parameters @@ -153,7 +148,7 @@ As mentioned above, there are some additional parameters that can be modified to - `buffer_size`: radius (in metres) that defines the buffer around sandy pixels that is considered to calculate the sand/water threshold. The default value of `buffer_size` is 150 m. This parameter should be increased if you have a very wide (>150 m) surf zone or inter-tidal zone. - `min_length_sl`: minimum length (in metres) of shoreline perimeter to be valid. This can be used to discard small features that are detected but do not correspond to the actual shoreline. The default value is 200 m. If the shoreline that you are trying to map is shorter than 200 m, decrease the value of this parameter. - `cloud_mask_issue`: the cloud mask algorithm applied to Landsat images by USGS, namely CFMASK, does have difficulties sometimes with very bright features such as beaches or white-water in the ocean. This may result in pixels corresponding to a beach being identified as clouds and appear as masked pixels on your images. If this issue seems to be present in a large proportion of images from your local beach, you can switch this parameter to `True` and CoastSat will remove from the cloud mask the pixels that form very thin linear features, as often these are beaches and not clouds. Only activate this parameter if you observe this very specific cloud mask issue, otherwise leave to the default value of `False`. -- `dark_sand`: if your beach has dark sand (grey/black sand beaches), you can set this parameter to `True` and the classifier will be able to pick up the dark sand.
At this stage this option is only available for Landsat images (soon for Sentinel-2 as well). +- `sand_color`: this parameter can take three values: `default`, `dark` or `bright`. Only change this parameter if you are seeing that, with `default`, the sand pixels are not being classified as sand (in orange). If your beach has dark sand (grey/black sand beaches), you can set this parameter to `dark` and the classifier will be able to pick up the dark sand. On the other hand, if your beach has white sand and the `default` classifier is not picking it up, switch this parameter to `bright`. At this stage this option is only available for Landsat images (soon for Sentinel-2 as well). ### 2.3 Shoreline change analysis @@ -191,12 +186,17 @@ An example is shown in the animation below: Having a problem? Post an issue in the [Issues page](https://github.com/kvos/coastsat/issues) (please do not email). ## Contributing +If you are willing to contribute, check out our to-do list in the [Projects page](https://github.com/kvos/CoastSat/projects/1). 1. Fork the repository (https://github.com/kvos/coastsat/fork). A fork is a copy on which you can make your changes. 2. Create a new branch on your fork 3. Commit your changes and push them to your branch -4. When the branch is ready to be merged, create a Pull Request +4. When the branch is ready to be merged, create a Pull Request (how to make a clean pull request is explained [here](https://gist.github.com/MarcDiethelm/7303312)) + +## References + +1. Vos K., Harley M.D., Splinter K.D., Simmons J.A., Turner I.L. (2019). Sub-annual to multi-decadal shoreline variability from publicly available satellite imagery. *Coastal Engineering*. 150, 160–174. https://doi.org/10.1016/j.coastaleng.2019.04.004 -Check the following link for more information on how to make a clean pull request: https://gist.github.com/MarcDiethelm/7303312). +2. Vos K., Splinter K.D., Harley M.D., Simmons J.A., Turner I.L. (2019). CoastSat: a Google Earth Engine-enabled Python toolkit to extract shorelines from publicly available satellite imagery. *Environmental Modelling and Software*. 122, 104528. https://doi.org/10.1016/j.envsoft.2019.104528 -If you like the repo put a star on it! +3.
Training dataset used for pixel classification in CoastSat: https://doi.org/10.5281/zenodo.3334147 diff --git a/coastsat/SDS_download.py b/coastsat/SDS_download.py index a581efe..224aea8 100644 --- a/coastsat/SDS_download.py +++ b/coastsat/SDS_download.py @@ -29,9 +29,6 @@ from coastsat import SDS_preprocess, SDS_tools np.seterr(all='ignore') # raise/ignore divisions by 0 and nans -# initialise connection with GEE server -ee.Initialize() - def download_tif(image, polygon, bandsId, filepath): """ Downloads a .TIF image from the ee server and stores it in a temp file @@ -99,6 +96,9 @@ def retrieve_images(inputs): """ + # initialise connection with GEE server + ee.Initialize() + # read inputs dictionnary sitename = inputs['sitename'] polygon = inputs['polygon'] @@ -131,14 +131,21 @@ def retrieve_images(inputs): if not os.path.exists(filepath): os.makedirs(filepath) if not os.path.exists(filepath_meta): - os.makedirs(filepath_meta) + os.makedirs(filepath_meta) # Landsat 5 collection - input_col = ee.ImageCollection('LANDSAT/LT05/C01/T1_TOA') - # filter by location and dates - flt_col = input_col.filterBounds(ee.Geometry.Polygon(polygon)).filterDate(dates[0],dates[1]) - # get all images in the filtered collection - im_all = flt_col.getInfo().get('features') + count_loop = 0 + while count_loop < 1: + try: + input_col = ee.ImageCollection('LANDSAT/LT05/C01/T1_TOA') + # filter by location and dates + flt_col = input_col.filterBounds(ee.Geometry.Polygon(polygon)).filterDate(dates[0],dates[1]) + # get all images in the filtered collection + im_all = flt_col.getInfo().get('features') + count_loop = 1 + except: + count_loop = 0 + # remove very cloudy images (>95% cloud) cloud_cover = [_['properties']['CLOUD_COVER'] for _ in im_all] if np.any([_ > 95 for _ in cloud_cover]): @@ -157,9 +164,14 @@ def retrieve_images(inputs): all_names = [] im_epsg = [] for i in range(n_img): - - # find each image in ee database - im = ee.Image(im_col[i]['id']) + count_loop = 0 + while count_loop < 1: + try: + # find each image in ee database + im = ee.Image(im_col[i]['id']) + count_loop = 1 + except: + count_loop = 0 # read metadata im_dic = im_col[i] # get bands @@ -189,7 +201,13 @@ def retrieve_images(inputs): all_names.append(filename) filenames.append(filename) # download .TIF image - local_data = download_tif(im, polygon, ms_bands, filepath) + count_loop = 0 + while count_loop < 1: + try: + local_data = download_tif(im, polygon, ms_bands, filepath) + count_loop = 1 + except: + count_loop = 0 # update filename try: os.rename(local_data, os.path.join(filepath, filename)) @@ -237,11 +255,18 @@ def retrieve_images(inputs): os.makedirs(filepath_meta) # landsat 7 collection - input_col = ee.ImageCollection('LANDSAT/LE07/C01/T1_RT_TOA') - # filter by location and dates - flt_col = input_col.filterBounds(ee.Geometry.Polygon(polygon)).filterDate(dates[0],dates[1]) - # get all images in the filtered collection - im_all = flt_col.getInfo().get('features') + count_loop = 0 + while count_loop < 1: + try: + input_col = ee.ImageCollection('LANDSAT/LE07/C01/T1_RT_TOA') + # filter by location and dates + flt_col = input_col.filterBounds(ee.Geometry.Polygon(polygon)).filterDate(dates[0],dates[1]) + # get all images in the filtered collection + im_all = flt_col.getInfo().get('features') + count_loop = 1 + except: + count_loop = 0 + # remove very cloudy images (>95% cloud) cloud_cover = [_['properties']['CLOUD_COVER'] for _ in im_all] if np.any([_ > 95 for _ in cloud_cover]): @@ -261,8 +286,14 @@ def retrieve_images(inputs): im_epsg = 
[] for i in range(n_img): - # find each image in ee database - im = ee.Image(im_col[i]['id']) + count_loop = 0 + while count_loop < 1: + try: + # find each image in ee database + im = ee.Image(im_col[i]['id']) + count_loop = 1 + except: + count_loop = 0 # read metadata im_dic = im_col[i] # get bands @@ -295,8 +326,14 @@ def retrieve_images(inputs): all_names.append(filename_pan) filenames.append(filename_pan) # download .TIF image - local_data_pan = download_tif(im, polygon, pan_band, filepath_pan) - local_data_ms = download_tif(im, polygon, ms_bands, filepath_ms) + count_loop = 0 + while count_loop < 1: + try: + local_data_pan = download_tif(im, polygon, pan_band, filepath_pan) + local_data_ms = download_tif(im, polygon, ms_bands, filepath_ms) + count_loop = 1 + except: + count_loop = 0 # update filename try: os.rename(local_data_pan, os.path.join(filepath_pan, filename_pan)) @@ -349,11 +386,18 @@ def retrieve_images(inputs): os.makedirs(filepath_meta) # landsat 8 collection - input_col = ee.ImageCollection('LANDSAT/LC08/C01/T1_RT_TOA') - # filter by location and dates - flt_col = input_col.filterBounds(ee.Geometry.Polygon(polygon)).filterDate(dates[0],dates[1]) - # get all images in the filtered collection - im_all = flt_col.getInfo().get('features') + count_loop = 0 + while count_loop < 1: + try: + input_col = ee.ImageCollection('LANDSAT/LC08/C01/T1_RT_TOA') + # filter by location and dates + flt_col = input_col.filterBounds(ee.Geometry.Polygon(polygon)).filterDate(dates[0],dates[1]) + # get all images in the filtered collection + im_all = flt_col.getInfo().get('features') + count_loop = 1 + except: + count_loop = 0 + # remove very cloudy images (>95% cloud) cloud_cover = [_['properties']['CLOUD_COVER'] for _ in im_all] if np.any([_ > 95 for _ in cloud_cover]): @@ -373,8 +417,14 @@ def retrieve_images(inputs): im_epsg = [] for i in range(n_img): - # find each image in ee database - im = ee.Image(im_col[i]['id']) + count_loop = 0 + while count_loop < 1: + try: + # find each image in ee database + im = ee.Image(im_col[i]['id']) + count_loop = 1 + except: + count_loop = 0 # read metadata im_dic = im_col[i] # get bands @@ -407,8 +457,15 @@ def retrieve_images(inputs): all_names.append(filename_pan) filenames.append(filename_pan) # download .TIF image - local_data_pan = download_tif(im, polygon, pan_band, filepath_pan) - local_data_ms = download_tif(im, polygon, ms_bands, filepath_ms) + count_loop = 0 + while count_loop < 1: + try: + local_data_pan = download_tif(im, polygon, pan_band, filepath_pan) + local_data_ms = download_tif(im, polygon, ms_bands, filepath_ms) + count_loop = 1 + except: + count_loop = 0 + # update filename try: os.rename(local_data_pan, os.path.join(filepath_pan, filename_pan)) @@ -461,11 +518,18 @@ def retrieve_images(inputs): os.makedirs(filepath_meta) # Sentinel2 collection - input_col = ee.ImageCollection('COPERNICUS/S2') - # filter by location and dates - flt_col = input_col.filterBounds(ee.Geometry.Polygon(polygon)).filterDate(dates[0],dates[1]) - # get all images in the filtered collection - im_all = flt_col.getInfo().get('features') + count_loop = 0 + while count_loop < 1: + try: + input_col = ee.ImageCollection('COPERNICUS/S2') + # filter by location and dates + flt_col = input_col.filterBounds(ee.Geometry.Polygon(polygon)).filterDate(dates[0],dates[1]) + # get all images in the filtered collection + im_all = flt_col.getInfo().get('features') + count_loop = 1 + except: + count_loop = 0 + # remove duplicates in the collection (there are many in S2 collection) 
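A note on the retry pattern introduced throughout this file: `while count_loop < 1` with a bare `except` retries a failing GEE request forever and silently swallows every exception, including `KeyboardInterrupt`. The sketch below shows a bounded alternative with backoff; the `retry` helper is a hypothetical suggestion, not part of the module.

```
# Hedged sketch: bounded retries with linear backoff for flaky GEE calls
import time

def retry(func, *args, max_tries=5, delay=1, **kwargs):
    """Call func(*args, **kwargs); retry on failure, re-raising after max_tries."""
    for attempt in range(1, max_tries + 1):
        try:
            return func(*args, **kwargs)
        except Exception:        # narrower than a bare except: lets Ctrl-C through
            if attempt == max_tries:
                raise
            time.sleep(delay * attempt)

# usage, e.g. for the GEE requests in retrieve_images():
# im_all = retry(lambda: flt_col.getInfo().get('features'))
# local_data = retry(download_tif, im, polygon, ms_bands, filepath)
```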
timestamps = [datetime.fromtimestamp(_['properties']['system:time_start']/1000, tz=pytz.utc) for _ in im_all] @@ -516,9 +580,14 @@ def retrieve_images(inputs): all_names = [] im_epsg = [] for i in range(n_img): - - # find each image in ee database - im = ee.Image(im_col[i]['id']) + count_loop = 0 + while count_loop < 1: + try: + # find each image in ee database + im = ee.Image(im_col[i]['id']) + count_loop = 1 + except: + count_loop = 0 # read metadata im_dic = im_col[i] # get bands @@ -547,21 +616,42 @@ def retrieve_images(inputs): filenames.append(filename10) # download .TIF image and update filename - local_data = download_tif(im, polygon, bands10, os.path.join(filepath, '10m')) + count_loop = 0 + while count_loop < 1: + try: + local_data = download_tif(im, polygon, bands10, os.path.join(filepath, '10m')) + count_loop = 1 + except: + count_loop = 0 + try: os.rename(local_data, os.path.join(filepath, '10m', filename10)) except: os.remove(os.path.join(filepath, '10m', filename10)) os.rename(local_data, os.path.join(filepath, '10m', filename10)) - local_data = download_tif(im, polygon, bands20, os.path.join(filepath, '20m')) + count_loop = 0 + while count_loop < 1: + try: + local_data = download_tif(im, polygon, bands20, os.path.join(filepath, '20m')) + count_loop = 1 + except: + count_loop = 0 + try: os.rename(local_data, os.path.join(filepath, '20m', filename20)) except: os.remove(os.path.join(filepath, '20m', filename20)) os.rename(local_data, os.path.join(filepath, '20m', filename20)) - local_data = download_tif(im, polygon, bands60, os.path.join(filepath, '60m')) + count_loop = 0 + while count_loop < 1: + try: + local_data = download_tif(im, polygon, bands60, os.path.join(filepath, '60m')) + count_loop = 1 + except: + count_loop = 0 + try: os.rename(local_data, os.path.join(filepath, '60m', filename60)) except: diff --git a/coastsat/SDS_preprocess.py b/coastsat/SDS_preprocess.py index 7b63ebf..4fe1d32 100644 --- a/coastsat/SDS_preprocess.py +++ b/coastsat/SDS_preprocess.py @@ -1,7 +1,7 @@ """This module contains all the functions needed to preprocess the satellite images before the -shoreline can be extracted. This includes creating a cloud mask and -pansharpening/downsampling the multispectral bands. - +shoreline can be extracted. This includes creating a cloud mask and +pansharpening/downsampling the multispectral bands. + Author: Kilian Vos, Water Research Laboratory, University of New South Wales """ @@ -32,9 +32,9 @@ np.seterr(all='ignore') # raise/ignore divisions by 0 and nans def create_cloud_mask(im_QA, satname, cloud_mask_issue): """ Creates a cloud mask using the information contained in the QA band. 
- + KV WRL 2018 - + Arguments: ----------- im_QA: np.array @@ -43,13 +43,13 @@ def create_cloud_mask(im_QA, satname, cloud_mask_issue): short name for the satellite (L5, L7, L8 or S2) cloud_mask_issue: boolean True if there is an issue with the cloud mask and sand pixels are being masked on the images - + Returns: ----------- cloud_mask : np.array A boolean array with True if a pixel is cloudy and False otherwise """ - + # convert QA bits (the bits allocated to cloud cover vary depending on the satellite mission) if satname == 'L8': cloud_values = [2800, 2804, 2808, 2812, 6896, 6900, 6904, 6908] @@ -57,26 +57,26 @@ def create_cloud_mask(im_QA, satname, cloud_mask_issue): cloud_values = [752, 756, 760, 764] elif satname == 'S2': cloud_values = [1024, 2048] # 1024 = dense cloud, 2048 = cirrus clouds - + # find which pixels have bits corresponding to cloud values cloud_mask = np.isin(im_QA, cloud_values) - - # remove cloud pixels that form very thin features. These are beach or swash pixels that are + + # remove cloud pixels that form very thin features. These are beach or swash pixels that are # erroneously identified as clouds by the CFMASK algorithm applied to the images by the USGS. if sum(sum(cloud_mask)) > 0 and sum(sum(~cloud_mask)) > 0: morphology.remove_small_objects(cloud_mask, min_size=10, connectivity=1, in_place=True) - + if cloud_mask_issue: - elem = morphology.square(3) # use a square of width 3 pixels + elem = morphology.square(3) # use a square of width 3 pixels cloud_mask = morphology.binary_opening(cloud_mask,elem) # perform image opening # remove objects with less than 25 connected pixels morphology.remove_small_objects(cloud_mask, min_size=25, connectivity=1, in_place=True) - + return cloud_mask def hist_match(source, template): """ - Adjust the pixel values of a grayscale image such that its histogram matches that of a + Adjust the pixel values of a grayscale image such that its histogram matches that of a target image. Arguments: @@ -118,11 +118,11 @@ def hist_match(source, template): def pansharpen(im_ms, im_pan, cloud_mask): """ - Pansharpens a multispectral image, using the panchromatic band and a cloud mask. + Pansharpens a multispectral image, using the panchromatic band and a cloud mask. A PCA is applied to the image, then the 1st PC is replaced with the panchromatic band. Note that it is essential to match the histrograms of the 1st PC and the panchromatic band before replacing and inverting the PCA. 
- + KV WRL 2018 Arguments: @@ -133,13 +133,13 @@ def pansharpen(im_ms, im_pan, cloud_mask): Panchromatic band (2D) cloud_mask: np.array 2D cloud mask with True where cloud pixels are - + Returns: ----------- im_ms_ps: np.ndarray Pansharpened multispectral image (3D) """ - + # reshape image into vector and apply cloud mask vec = im_ms.reshape(im_ms.shape[0] * im_ms.shape[1], im_ms.shape[2]) vec_mask = cloud_mask.reshape(im_ms.shape[0] * im_ms.shape[1]) @@ -147,13 +147,13 @@ def pansharpen(im_ms, im_pan, cloud_mask): # apply PCA to multispectral bands pca = decomposition.PCA() vec_pcs = pca.fit_transform(vec) - + # replace 1st PC with pan band (after matching histograms) vec_pan = im_pan.reshape(im_pan.shape[0] * im_pan.shape[1]) vec_pan = vec_pan[~vec_mask] vec_pcs[:,0] = hist_match(vec_pan, vec_pcs[:,0]) vec_ms_ps = pca.inverse_transform(vec_pcs) - + # reshape vector into image vec_ms_ps_full = np.ones((len(vec_mask), im_ms.shape[2])) * np.nan vec_ms_ps_full[~vec_mask,:] = vec_ms_ps @@ -167,7 +167,7 @@ def rescale_image_intensity(im, cloud_mask, prob_high): Rescales the intensity of an image (multispectral or single band) by applying a cloud mask and clipping the prob_high upper percentile. This functions allows to stretch the contrast of an image, only for visualisation purposes. - + KV WRL 2018 Arguments: @@ -178,23 +178,23 @@ def rescale_image_intensity(im, cloud_mask, prob_high): 2D cloud mask with True where cloud pixels are prob_high: float probability of exceedence used to calculate the upper percentile - + Returns: ----------- im_adj: np.array The rescaled image """ - + # lower percentile is set to 0 - prc_low = 0 - + prc_low = 0 + # reshape the 2D cloud mask into a 1D vector vec_mask = cloud_mask.reshape(im.shape[0] * im.shape[1]) - + # if image contains several bands, stretch the contrast for each band if len(im.shape) > 2: # reshape into a vector - vec = im.reshape(im.shape[0] * im.shape[1], im.shape[2]) + vec = im.reshape(im.shape[0] * im.shape[1], im.shape[2]) # initiliase with NaN values vec_adj = np.ones((len(vec_mask), im.shape[2])) * np.nan # loop through the bands @@ -207,7 +207,7 @@ def rescale_image_intensity(im, cloud_mask, prob_high): vec_adj[~vec_mask,i] = vec_rescaled # reshape into image im_adj = vec_adj.reshape(im.shape[0], im.shape[1], im.shape[2]) - + # if image only has 1 bands (grayscale image) else: vec = im.reshape(im.shape[0] * im.shape[1]) @@ -221,11 +221,11 @@ def rescale_image_intensity(im, cloud_mask, prob_high): def preprocess_single(fn, satname, cloud_mask_issue): """ - Reads the image and outputs the pansharpened/down-sampled multispectral bands, the - georeferencing vector of the image (coordinates of the upper left pixel), the cloud mask and - the QA band. For Landsat 7-8 it also outputs the panchromatic band and for Sentinel-2 it also + Reads the image and outputs the pansharpened/down-sampled multispectral bands, the + georeferencing vector of the image (coordinates of the upper left pixel), the cloud mask and + the QA band. For Landsat 7-8 it also outputs the panchromatic band and for Sentinel-2 it also outputs the 20m SWIR band. 
- + KV WRL 2018 Arguments: @@ -238,41 +238,41 @@ def preprocess_single(fn, satname, cloud_mask_issue): name of the satellite mission (e.g., 'L5') cloud_mask_issue: boolean True if there is an issue with the cloud mask and sand pixels are being masked on the images - + Returns: ----------- im_ms: np.array 3D array containing the pansharpened/down-sampled bands (B,G,R,NIR,SWIR1) georef: np.array - vector of 6 elements [Xtr, Xscale, Xshear, Ytr, Yshear, Yscale] defining the + vector of 6 elements [Xtr, Xscale, Xshear, Ytr, Yshear, Yscale] defining the coordinates of the top-left pixel of the image cloud_mask: np.array 2D cloud mask with True where cloud pixels are im_extra : np.array - 2D array containing the 20m resolution SWIR band for Sentinel-2 and the 15m resolution + 2D array containing the 20m resolution SWIR band for Sentinel-2 and the 15m resolution panchromatic band for Landsat 7 and Landsat 8. This field is empty for Landsat 5. im_QA: np.array 2D array containing the QA band, from which the cloud_mask can be computed. im_nodata: np.array 2D array with True where no data values (-inf) are located - + """ - + #=============================================================================================# # L5 images #=============================================================================================# if satname == 'L5': - + # read all bands data = gdal.Open(fn, gdal.GA_ReadOnly) georef = np.array(data.GetGeoTransform()) bands = [data.GetRasterBand(k + 1).ReadAsArray() for k in range(data.RasterCount)] im_ms = np.stack(bands, 2) - + # down-sample to 15 m (half of the original pixel size) nrows = im_ms.shape[0]*2 ncols = im_ms.shape[1]*2 - + # create cloud mask im_QA = im_ms[:,:,5] im_ms = im_ms[:,:,:-1] @@ -284,7 +284,7 @@ def preprocess_single(fn, satname, cloud_mask_issue): # resize the image using nearest neighbour interpolation (order 0) cloud_mask = transform.resize(cloud_mask, (nrows, ncols), order=0, preserve_range=True, mode='constant').astype('bool_') - + # adjust georeferencing vector to the new image size # scale becomes 15m and the origin is adjusted to the center of new top left pixel georef[1] = 15 @@ -294,58 +294,70 @@ def preprocess_single(fn, satname, cloud_mask_issue): # check if -inf or nan values on any band and add to cloud mask im_nodata = np.zeros(cloud_mask.shape).astype(bool) - for k in range(im_ms.shape[2]): + for k in range(im_ms.shape[2]): im_inf = np.isin(im_ms[:,:,k], -np.inf) im_nan = np.isnan(im_ms[:,:,k]) cloud_mask = np.logical_or(np.logical_or(cloud_mask, im_inf), im_nan) - im_nodata = np.logical_or(im_nodata, im_inf) + im_nodata = np.logical_or(np.logical_or(im_nodata, im_inf), im_nan) + # check if there are pixels with 0 intensity in the Green, NIR and SWIR bands and add those + # to the cloud mask as otherwise they will cause errors when calculating the NDWI and MNDWI + im_zeros = np.ones(cloud_mask.shape).astype(bool) + for k in [1,3,4]: # loop through the Green, NIR and SWIR bands + im_zeros = np.logical_and(np.isin(im_ms[:,:,k],0), im_zeros) + # update cloud mask and nodata + cloud_mask = np.logical_or(im_zeros, cloud_mask) + im_nodata = np.logical_or(im_zeros, im_nodata) # no extra image for Landsat 5 (they are all 30 m bands) im_extra = [] - + #=============================================================================================# # L7 images - #=============================================================================================# + 
#=============================================================================================# elif satname == 'L7': - + # read pan image fn_pan = fn[0] data = gdal.Open(fn_pan, gdal.GA_ReadOnly) georef = np.array(data.GetGeoTransform()) bands = [data.GetRasterBand(k + 1).ReadAsArray() for k in range(data.RasterCount)] im_pan = np.stack(bands, 2)[:,:,0] - + # size of pan image nrows = im_pan.shape[0] ncols = im_pan.shape[1] - + # read ms image fn_ms = fn[1] data = gdal.Open(fn_ms, gdal.GA_ReadOnly) bands = [data.GetRasterBand(k + 1).ReadAsArray() for k in range(data.RasterCount)] im_ms = np.stack(bands, 2) - + # create cloud mask im_QA = im_ms[:,:,5] cloud_mask = create_cloud_mask(im_QA, satname, cloud_mask_issue) - + # resize the image using bilinear interpolation (order 1) im_ms = im_ms[:,:,:5] im_ms = transform.resize(im_ms,(nrows, ncols), order=1, preserve_range=True, mode='constant') # resize the image using nearest neighbour interpolation (order 0) cloud_mask = transform.resize(cloud_mask, (nrows, ncols), order=0, preserve_range=True, - mode='constant').astype('bool_') + mode='constant').astype('bool_') # check if -inf or nan values on any band and eventually add those pixels to cloud mask im_nodata = np.zeros(cloud_mask.shape).astype(bool) - for k in range(im_ms.shape[2]+1): - if k == 5: - im_inf = np.isin(im_pan, -np.inf) - im_nan = np.isnan(im_pan) - else: - im_inf = np.isin(im_ms[:,:,k], -np.inf) - im_nan = np.isnan(im_ms[:,:,k]) + for k in range(im_ms.shape[2]): + im_inf = np.isin(im_ms[:,:,k], -np.inf) + im_nan = np.isnan(im_ms[:,:,k]) cloud_mask = np.logical_or(np.logical_or(cloud_mask, im_inf), im_nan) - im_nodata = np.logical_or(im_nodata, im_inf) + im_nodata = np.logical_or(np.logical_or(im_nodata, im_inf), im_nan) + # check if there are pixels with 0 intensity in the Green, NIR and SWIR bands and add those + # to the cloud mask as otherwise they will cause errors when calculating the NDWI and MNDWI + im_zeros = np.ones(cloud_mask.shape).astype(bool) + for k in [1,3,4]: # loop through the Green, NIR and SWIR bands + im_zeros = np.logical_and(np.isin(im_ms[:,:,k],0), im_zeros) + # update cloud mask and nodata + cloud_mask = np.logical_or(im_zeros, cloud_mask) + im_nodata = np.logical_or(im_zeros, im_nodata) # pansharpen Green, Red, NIR (where there is overlapping with pan band in L7) try: @@ -355,55 +367,59 @@ def preprocess_single(fn, satname, cloud_mask_issue): # add downsampled Blue and SWIR1 bands im_ms_ps = np.append(im_ms[:,:,[0]], im_ms_ps, axis=2) im_ms_ps = np.append(im_ms_ps, im_ms[:,:,[4]], axis=2) - + im_ms = im_ms_ps.copy() # the extra image is the 15m panchromatic band im_extra = im_pan - + #=============================================================================================# # L8 images - #=============================================================================================# + #=============================================================================================# elif satname == 'L8': - + # read pan image fn_pan = fn[0] data = gdal.Open(fn_pan, gdal.GA_ReadOnly) georef = np.array(data.GetGeoTransform()) bands = [data.GetRasterBand(k + 1).ReadAsArray() for k in range(data.RasterCount)] im_pan = np.stack(bands, 2)[:,:,0] - + # size of pan image nrows = im_pan.shape[0] ncols = im_pan.shape[1] - + # read ms image fn_ms = fn[1] data = gdal.Open(fn_ms, gdal.GA_ReadOnly) bands = [data.GetRasterBand(k + 1).ReadAsArray() for k in range(data.RasterCount)] im_ms = np.stack(bands, 2) - + # create cloud mask im_QA = im_ms[:,:,5] cloud_mask = 
create_cloud_mask(im_QA, satname, cloud_mask_issue) - + # resize the image using bilinear interpolation (order 1) im_ms = im_ms[:,:,:5] im_ms = transform.resize(im_ms,(nrows, ncols), order=1, preserve_range=True, mode='constant') # resize the image using nearest neighbour interpolation (order 0) cloud_mask = transform.resize(cloud_mask, (nrows, ncols), order=0, preserve_range=True, - mode='constant').astype('bool_') + mode='constant').astype('bool_') # check if -inf or nan values on any band and eventually add those pixels to cloud mask im_nodata = np.zeros(cloud_mask.shape).astype(bool) - for k in range(im_ms.shape[2]+1): - if k == 5: - im_inf = np.isin(im_pan, -np.inf) - im_nan = np.isnan(im_pan) - else: - im_inf = np.isin(im_ms[:,:,k], -np.inf) - im_nan = np.isnan(im_ms[:,:,k]) + for k in range(im_ms.shape[2]): + im_inf = np.isin(im_ms[:,:,k], -np.inf) + im_nan = np.isnan(im_ms[:,:,k]) cloud_mask = np.logical_or(np.logical_or(cloud_mask, im_inf), im_nan) - im_nodata = np.logical_or(im_nodata, im_inf) + im_nodata = np.logical_or(np.logical_or(im_nodata, im_inf), im_nan) + # check if there are pixels with 0 intensity in the Green, NIR and SWIR bands and add those + # to the cloud mask as otherwise they will cause errors when calculating the NDWI and MNDWI + im_zeros = np.ones(cloud_mask.shape).astype(bool) + for k in [1,3,4]: # loop through the Green, NIR and SWIR bands + im_zeros = np.logical_and(np.isin(im_ms[:,:,k],0), im_zeros) + # update cloud mask and nodata + cloud_mask = np.logical_or(im_zeros, cloud_mask) + im_nodata = np.logical_or(im_zeros, im_nodata) # pansharpen Blue, Green, Red (where there is overlapping with pan band in L8) try: @@ -412,16 +428,16 @@ def preprocess_single(fn, satname, cloud_mask_issue): im_ms_ps = im_ms[:,:,[0,1,2]] # add downsampled NIR and SWIR1 bands im_ms_ps = np.append(im_ms_ps, im_ms[:,:,[3,4]], axis=2) - + im_ms = im_ms_ps.copy() # the extra image is the 15m panchromatic band im_extra = im_pan - + #=============================================================================================# # S2 images - #=============================================================================================# + #=============================================================================================# if satname == 'S2': - + # read 10m bands (R,G,B,NIR) fn10 = fn[0] data = gdal.Open(fn10, gdal.GA_ReadOnly) @@ -429,7 +445,7 @@ def preprocess_single(fn, satname, cloud_mask_issue): bands = [data.GetRasterBand(k + 1).ReadAsArray() for k in range(data.RasterCount)] im10 = np.stack(bands, 2) im10 = im10/10000 # TOA scaled to 10000 - + # if image contains only zeros (can happen with S2), skip the image if sum(sum(sum(im10))) < 1: im_ms = [] @@ -437,11 +453,11 @@ def preprocess_single(fn, satname, cloud_mask_issue): # skip the image by giving it a full cloud_mask cloud_mask = np.ones((im10.shape[0],im10.shape[1])).astype('bool') return im_ms, georef, cloud_mask, [], [], [] - + # size of 10m bands nrows = im10.shape[0] ncols = im10.shape[1] - + # read 20m band (SWIR1) fn20 = fn[1] data = gdal.Open(fn20, gdal.GA_ReadOnly) @@ -449,15 +465,15 @@ def preprocess_single(fn, satname, cloud_mask_issue): im20 = np.stack(bands, 2) im20 = im20[:,:,0] im20 = im20/10000 # TOA scaled to 10000 - + # resize the image using bilinear interpolation (order 1) im_swir = transform.resize(im20, (nrows, ncols), order=1, preserve_range=True, mode='constant') im_swir = np.expand_dims(im_swir, axis=2) - + # append down-sampled SWIR1 band to the other 10m bands im_ms = np.append(im10, 
im_swir, axis=2) - + # create cloud mask using 60m QA band (not as good as Landsat cloud cover) fn60 = fn[2] data = gdal.Open(fn60, gdal.GA_ReadOnly) @@ -470,23 +486,32 @@ def preprocess_single(fn, satname, cloud_mask_issue): mode='constant') # check if -inf or nan values on any band and add to cloud mask im_nodata = np.zeros(cloud_mask.shape).astype(bool) - for k in range(im_ms.shape[2]): + for k in range(im_ms.shape[2]): im_inf = np.isin(im_ms[:,:,k], -np.inf) im_nan = np.isnan(im_ms[:,:,k]) cloud_mask = np.logical_or(np.logical_or(cloud_mask, im_inf), im_nan) - im_nodata = np.logical_or(im_nodata, im_inf) - + im_nodata = np.logical_or(np.logical_or(im_nodata, im_inf), im_nan) + + # check if there are pixels with 0 intensity in the Green, NIR and SWIR bands and add those + # to the cloud mask as otherwise they will cause errors when calculating the NDWI and MNDWI + im_zeros = np.ones(cloud_mask.shape).astype(bool) + for k in [1,3,4]: # loop through the Green, NIR and SWIR bands + im_zeros = np.logical_and(np.isin(im_ms[:,:,k],0), im_zeros) + # update cloud mask and nodata + cloud_mask = np.logical_or(im_zeros, cloud_mask) + im_nodata = np.logical_or(im_zeros, im_nodata) + # the extra image is the 20m SWIR band im_extra = im20 - + return im_ms, georef, cloud_mask, im_extra, im_QA, im_nodata - + def create_jpg(im_ms, cloud_mask, date, satname, filepath): """ Saves a .jpg file with the RGB image as well as the NIR and SWIR1 grayscale images. This functions can be modified to obtain different visualisations of the multispectral images. - + KV WRL 2018 Arguments: @@ -499,18 +524,18 @@ def create_jpg(im_ms, cloud_mask, date, satname, filepath): String containing the date at which the image was acquired satname: str name of the satellite mission (e.g., 'L5') - + Returns: ----------- Saves a .jpg image corresponding to the preprocessed satellite image - + """ # rescale image intensity for display purposes im_RGB = rescale_image_intensity(im_ms[:,:,[2,1,0]], cloud_mask, 99.9) # im_NIR = rescale_image_intensity(im_ms[:,:,3], cloud_mask, 99.9) # im_SWIR = rescale_image_intensity(im_ms[:,:,4], cloud_mask, 99.9) - + # make figure (just RGB) fig = plt.figure() fig.set_size_inches([18,9]) @@ -519,7 +544,7 @@ def create_jpg(im_ms, cloud_mask, date, satname, filepath): ax1.axis('off') ax1.imshow(im_RGB) ax1.set_title(date + ' ' + satname, fontsize=16) - + # if im_RGB.shape[1] > 2*im_RGB.shape[0]: # ax1 = fig.add_subplot(311) # ax2 = fig.add_subplot(312) @@ -527,7 +552,7 @@ def create_jpg(im_ms, cloud_mask, date, satname, filepath): # else: # ax1 = fig.add_subplot(131) # ax2 = fig.add_subplot(132) -# ax3 = fig.add_subplot(133) +# ax3 = fig.add_subplot(133) # # RGB # ax1.axis('off') # ax1.imshow(im_RGB) @@ -540,18 +565,18 @@ def create_jpg(im_ms, cloud_mask, date, satname, filepath): # ax3.axis('off') # ax3.imshow(im_SWIR, cmap='seismic') # ax3.set_title('Short-wave Infrared', fontsize=16) - + # save figure plt.rcParams['savefig.jpeg_quality'] = 100 fig.savefig(os.path.join(filepath, - date + '_' + satname + '.jpg'), dpi=150) + date + '_' + satname + '.jpg'), dpi=150) plt.close() - - + + def save_jpg(metadata, settings): """ Saves a .jpg image for all the images contained in metadata. 
- + KV WRL 2018 Arguments: @@ -566,27 +591,27 @@ def save_jpg(metadata, settings): name of the site (also name of the folder where the images are stored) cloud_mask_issue: boolean True if there is an issue with the cloud mask and sand pixels are being masked on the images - + Returns: ----------- - + """ - + sitename = settings['inputs']['sitename'] cloud_thresh = settings['cloud_thresh'] filepath_data = settings['inputs']['filepath'] - + # create subfolder to store the jpg files filepath_jpg = os.path.join(filepath_data, sitename, 'jpg_files', 'preprocessed') - if not os.path.exists(filepath_jpg): + if not os.path.exists(filepath_jpg): os.makedirs(filepath_jpg) - + # loop through satellite list for satname in metadata.keys(): - + filepath = SDS_tools.get_filepath(settings['inputs'],satname) filenames = metadata[satname]['filenames'] - + # loop through images for i in range(len(filenames)): # image filename @@ -602,17 +627,17 @@ def save_jpg(metadata, settings): # save .jpg with date and satellite in the title date = filenames[i][:19] create_jpg(im_ms, cloud_mask, date, satname, filepath_jpg) - + # print the location where the images have been saved print('Satellite images saved as .jpg in ' + os.path.join(filepath_data, sitename, 'jpg_files', 'preprocessed')) - + def get_reference_sl(metadata, settings): """ - Allows the user to manually digitize a reference shoreline that is used seed the shoreline + Allows the user to manually digitize a reference shoreline that is used seed the shoreline detection algorithm. The reference shoreline helps to detect the outliers, making the shoreline detection more robust. - + KV WRL 2018 Arguments: @@ -627,17 +652,17 @@ def get_reference_sl(metadata, settings): name of the site (also name of the folder where the images are stored) 'output_epsg': int epsg code of the desired spatial reference system - + Returns: ----------- reference_shoreline: np.array coordinates of the reference shoreline that was manually digitized - + """ - + sitename = settings['inputs']['sitename'] filepath_data = settings['inputs']['filepath'] - + # check if reference shoreline already exists in the corresponding folder filepath = os.path.join(filepath_data, sitename) filename = sitename + '_reference_shoreline.pkl' @@ -646,7 +671,7 @@ def get_reference_sl(metadata, settings): with open(os.path.join(filepath, sitename + '_reference_shoreline.pkl'), 'rb') as f: refsl = pickle.load(f) return refsl - + else: # first try to use S2 images (10m res for manually digitizing the reference shoreline) if 'S2' in metadata.keys(): @@ -657,62 +682,62 @@ def get_reference_sl(metadata, settings): elif not 'S2' in metadata.keys() and 'L8' in metadata.keys(): satname = 'L8' filepath = SDS_tools.get_filepath(settings['inputs'],satname) - filenames = metadata[satname]['filenames'] - # if no S2 images and no L8, use L5 images (L7 images have black diagonal bands making it + filenames = metadata[satname]['filenames'] + # if no S2 images and no L8, use L5 images (L7 images have black diagonal bands making it # hard to manually digitize a shoreline) elif not 'S2' in metadata.keys() and not 'L8' in metadata.keys() and 'L5' in metadata.keys(): satname = 'L5' filepath = SDS_tools.get_filepath(settings['inputs'],satname) - filenames = metadata[satname]['filenames'] + filenames = metadata[satname]['filenames'] else: - raise Exception('You cannot digitize the shoreline on L7 images, add another L8, S2 or L5 to your dataset.') - + raise Exception('You cannot digitize the shoreline on L7 images, add another L8, 
S2 or L5 to your dataset.') + # loop trhough the images for i in range(len(filenames)): - + # read image fn = SDS_tools.get_filenames(filenames[i],filepath, satname) im_ms, georef, cloud_mask, im_extra, im_QA, im_nodata = preprocess_single(fn, satname, settings['cloud_mask_issue']) - + # calculate cloud cover cloud_cover = np.divide(sum(sum(cloud_mask.astype(int))), (cloud_mask.shape[0]*cloud_mask.shape[1])) - + # skip image if cloud cover is above threshold if cloud_cover > settings['cloud_thresh']: continue - + # rescale image intensity for display purposes im_RGB = rescale_image_intensity(im_ms[:,:,[2,1,0]], cloud_mask, 99.9) - + # plot the image RGB on a figure fig = plt.figure() fig.set_size_inches([18,9]) fig.set_tight_layout(True) plt.axis('off') plt.imshow(im_RGB) - + # decide if the image if good enough for digitizing the shoreline plt.title('click if image is clear enough to digitize the shoreline.\n' + 'If not (too cloudy) click on to get another image', fontsize=14) keep_button = plt.text(0, 0.9, 'keep', size=16, ha="left", va="top", transform=plt.gca().transAxes, - bbox=dict(boxstyle="square", ec='k',fc='w')) + bbox=dict(boxstyle="square", ec='k',fc='w')) skip_button = plt.text(1, 0.9, 'skip', size=16, ha="right", va="top", transform=plt.gca().transAxes, bbox=dict(boxstyle="square", ec='k',fc='w')) - mng = plt.get_current_fig_manager() + mng = plt.get_current_fig_manager() mng.window.showMaximized() - + # let user click on the image once pt_input = ginput(n=1, timeout=1e9, show_clicks=False) pt_input = np.array(pt_input) - + # if clicks next to , show another image if pt_input[0][0] > im_ms.shape[1]/2: plt.close() continue - + else: # remove keep and skip buttons keep_button.set_visible(False) @@ -720,29 +745,29 @@ def get_reference_sl(metadata, settings): # create two new buttons add_button = plt.text(0, 0.9, 'add', size=16, ha="left", va="top", transform=plt.gca().transAxes, - bbox=dict(boxstyle="square", ec='k',fc='w')) + bbox=dict(boxstyle="square", ec='k',fc='w')) end_button = plt.text(1, 0.9, 'end', size=16, ha="right", va="top", transform=plt.gca().transAxes, - bbox=dict(boxstyle="square", ec='k',fc='w')) - + bbox=dict(boxstyle="square", ec='k',fc='w')) + # add multiple reference shorelines (until user clicks on button) pts_sl = np.expand_dims(np.array([np.nan, np.nan]),axis=0) geoms = [] while 1: add_button.set_visible(False) - end_button.set_visible(False) + end_button.set_visible(False) # update title (instructions) plt.title('Click points along the shoreline (enough points to capture the beach curvature).\n' + 'Start at one end of the beach.\n' + 'When finished digitizing, click ', - fontsize=14) + fontsize=14) plt.draw() - + # let user click on the shoreline pts = ginput(n=50000, timeout=1e9, show_clicks=True) - pts_pix = np.array(pts) + pts_pix = np.array(pts) # convert pixel coordinates to world coordinates - pts_world = SDS_tools.convert_pix2world(pts_pix[:,[1,0]], georef) - + pts_world = SDS_tools.convert_pix2world(pts_pix[:,[1,0]], georef) + # interpolate between points clicked by the user (1m resolution) pts_world_interp = np.expand_dims(np.array([np.nan, np.nan]),axis=0) for k in range(len(pts_world)-1): @@ -757,50 +782,50 @@ def get_reference_sl(metadata, settings): deltay = pts_world[k+1,1] - pts_world[k,1] phi = np.pi/2 - np.math.atan2(deltax, deltay) tf = transform.EuclideanTransform(rotation=phi, translation=pts_world[k,:]) - pts_world_interp = np.append(pts_world_interp,tf(pt_coords), axis=0) + pts_world_interp = 
np.append(pts_world_interp,tf(pt_coords), axis=0) pts_world_interp = np.delete(pts_world_interp,0,axis=0) - + # save as geometry (to create .geojson file later) geoms.append(geometry.LineString(pts_world_interp)) - + # convert to pixel coordinates and plot pts_pix_interp = SDS_tools.convert_world2pix(pts_world_interp, georef) pts_sl = np.append(pts_sl, pts_world_interp, axis=0) plt.plot(pts_pix_interp[:,0], pts_pix_interp[:,1], 'r--') plt.plot(pts_pix_interp[0,0], pts_pix_interp[0,1],'ko') plt.plot(pts_pix_interp[-1,0], pts_pix_interp[-1,1],'ko') - + # update title and buttons add_button.set_visible(True) - end_button.set_visible(True) + end_button.set_visible(True) plt.title('click to digitize another shoreline or to finish and save the shoreline(s)', - fontsize=14) - plt.draw() - + fontsize=14) + plt.draw() + # let the user click again ( another shoreline or ) pt_input = ginput(n=1, timeout=1e9, show_clicks=False) - pt_input = np.array(pt_input) - + pt_input = np.array(pt_input) + # if user clicks on , save the points and break the loop - if pt_input[0][0] > im_ms.shape[1]/2: + if pt_input[0][0] > im_ms.shape[1]/2: add_button.set_visible(False) - end_button.set_visible(False) + end_button.set_visible(False) plt.title('Reference shoreline saved as ' + sitename + '_reference_shoreline.pkl and ' + sitename + '_reference_shoreline.geojson') plt.draw() ginput(n=1, timeout=3, show_clicks=False) - plt.close() + plt.close() break - - pts_sl = np.delete(pts_sl,0,axis=0) + + pts_sl = np.delete(pts_sl,0,axis=0) # convert world image coordinates to user-defined coordinate system image_epsg = metadata[satname]['epsg'][i] pts_coords = SDS_tools.convert_epsg(pts_sl, image_epsg, settings['output_epsg']) - + # save the reference shoreline as .pkl filepath = os.path.join(filepath_data, sitename) with open(os.path.join(filepath, sitename + '_reference_shoreline.pkl'), 'wb') as f: pickle.dump(pts_coords, f) - + # also store as .geojson in case user wants to drag-and-drop on GIS for verification for k,line in enumerate(geoms): gdf = gpd.GeoDataFrame(geometry=gpd.GeoSeries(line)) @@ -814,11 +839,11 @@ def get_reference_sl(metadata, settings): gdf_all.crs = {'init':'epsg:'+str(image_epsg)} # convert from image_epsg to user-defined coordinate system gdf_all = gdf_all.to_crs({'init': 'epsg:'+str(settings['output_epsg'])}) - # save as geojson + # save as geojson gdf_all.to_file(os.path.join(filepath, sitename + '_reference_shoreline.geojson'), driver='GeoJSON', encoding='utf-8') - + print('Reference shoreline has been saved in ' + filepath) break - - return pts_coords \ No newline at end of file + + return pts_coords diff --git a/coastsat/SDS_shoreline.py b/coastsat/SDS_shoreline.py index d70b6c1..92383b1 100644 --- a/coastsat/SDS_shoreline.py +++ b/coastsat/SDS_shoreline.py @@ -160,7 +160,7 @@ def classify_image_NN(im_ms, im_extra, cloud_mask, min_beach_area, clf): ################################################################################################### # CONTOUR MAPPING FUNCTIONS ################################################################################################### - + def find_wl_contours1(im_ndwi, cloud_mask, im_ref_buffer): """ Traditional method for shorelien detection. 
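The docstring above introduces `find_wl_contours1`. As background, the sketch below illustrates the general technique such a function implements (a global Otsu threshold on the water index followed by marching-squares contouring), assuming masked pixels are set to NaN. The toolkit's actual implementation additionally restricts the search to the reference-shoreline buffer and is more refined.

```
# Hedged sketch of threshold-based shoreline contouring (not the toolkit's code)
import numpy as np
from skimage import filters, measure

def wl_contours_sketch(im_ndwi, cloud_mask):
    vec = im_ndwi.reshape(-1)
    vec = vec[~cloud_mask.reshape(-1)]       # exclude cloudy pixels
    t_otsu = filters.threshold_otsu(vec)     # sand/water separation value
    im = im_ndwi.copy()
    im[cloud_mask] = np.nan                  # masked areas should yield no contours
    contours = measure.find_contours(im, t_otsu)
    # drop any contour that still touches masked (NaN) pixels
    return [c for c in contours if not np.any(np.isnan(c))]
```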
@@ -315,7 +315,7 @@ def find_wl_contours2(im_ms, im_labels, cloud_mask, buffer_size, im_ref_buffer): ################################################################################################### # SHORELINE PROCESSING FUNCTIONS ################################################################################################### - + def create_shoreline_buffer(im_shape, georef, image_epsg, pixel_size, settings): """ Creates a buffer around the reference shoreline. The size of the buffer is given by @@ -339,36 +339,36 @@ def create_shoreline_buffer(im_shape, georef, image_epsg, pixel_size, settings): output spatial reference system reference_shoreline: np.array coordinates of the reference shoreline - max_dist_ref: int + max_dist_ref: int maximum distance from the reference shoreline in metres Returns: ----------- im_buffer: np.array binary image, True where the buffer is, False otherwise - """ + """ # initialise the image buffer im_buffer = np.ones(im_shape).astype(bool) if 'reference_shoreline' in settings.keys(): - + # convert reference shoreline to pixel coordinates ref_sl = settings['reference_shoreline'] ref_sl_conv = SDS_tools.convert_epsg(ref_sl, settings['output_epsg'],image_epsg)[:,:-1] ref_sl_pix = SDS_tools.convert_world2pix(ref_sl_conv, georef) ref_sl_pix_rounded = np.round(ref_sl_pix).astype(int) - + # create binary image of the reference shoreline (1 where the shoreline is 0 otherwise) im_binary = np.zeros(im_shape) for j in range(len(ref_sl_pix_rounded)): im_binary[ref_sl_pix_rounded[j,1], ref_sl_pix_rounded[j,0]] = 1 im_binary = im_binary.astype(bool) - + # dilate the binary image to create a buffer around the reference shoreline max_dist_ref_pixels = np.ceil(settings['max_dist_ref']/pixel_size) se = morphology.disk(max_dist_ref_pixels) im_buffer = morphology.binary_dilation(im_binary, se) - + return im_buffer def process_shoreline(contours, cloud_mask, georef, image_epsg, settings): @@ -396,7 +396,7 @@ def process_shoreline(contours, cloud_mask, georef, image_epsg, settings): min_length_sl: float minimum length of shoreline perimeter to be kept (in meters) - Returns: + Returns: ----------- shoreline: np.array array of points with the X and Y coordinates of the shoreline @@ -422,9 +422,9 @@ def process_shoreline(contours, cloud_mask, georef, image_epsg, settings): x_points = np.append(x_points,contours_long[k][:,0]) y_points = np.append(y_points,contours_long[k][:,1]) contours_array = np.transpose(np.array([x_points,y_points])) - + shoreline = contours_array - + # now remove any shoreline points that are attached to cloud pixels if sum(sum(cloud_mask)) > 0: # get the coordinates of the cloud pixels @@ -437,8 +437,8 @@ def process_shoreline(contours, cloud_mask, georef, image_epsg, settings): idx_keep = np.ones(len(shoreline)).astype(bool) for k in range(len(shoreline)): if np.any(np.linalg.norm(shoreline[k,:] - coords_cloud, axis=1) < 30): - idx_keep[k] = False - shoreline = shoreline[idx_keep] + idx_keep[k] = False + shoreline = shoreline[idx_keep] return shoreline @@ -471,7 +471,7 @@ def show_detection(im_ms, cloud_mask, im_labels, shoreline,image_epsg, georef, satname: string indicates the satname (L5,L7,L8 or S2) - Returns: + Returns: ----------- skip_image: boolean True if the user wants to skip the image, False otherwise. 
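`create_shoreline_buffer` above rasterises the reference shoreline and dilates it with a disk-shaped structuring element. The self-contained sketch below reproduces that dilation step; the bounds clipping is an added safeguard, not present in the function shown.

```
# Hedged sketch of the buffer-by-dilation technique used in create_shoreline_buffer
import numpy as np
from skimage import morphology

def buffer_sketch(im_shape, ref_sl_pix, max_dist_ref, pixel_size):
    im_binary = np.zeros(im_shape, dtype=bool)
    # rasterise the reference shoreline (clip to stay inside the image)
    rows = np.clip(np.round(ref_sl_pix[:, 1]).astype(int), 0, im_shape[0] - 1)
    cols = np.clip(np.round(ref_sl_pix[:, 0]).astype(int), 0, im_shape[1] - 1)
    im_binary[rows, cols] = True
    # dilate with a disk whose radius is max_dist_ref expressed in pixels
    radius = int(np.ceil(max_dist_ref / pixel_size))
    return morphology.binary_dilation(im_binary, morphology.disk(radius))
```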
@@ -524,7 +524,7 @@ def show_detection(im_ms, cloud_mask, im_labels, shoreline,image_epsg, georef, mng = plt.get_current_fig_manager() mng.window.showMaximized() - # according to the image shape, decide whether it is better to have the images + # according to the image shape, decide whether it is better to have the images # in vertical subplots or horizontal subplots if im_RGB.shape[1] > 2*im_RGB.shape[0]: # vertical subplots @@ -581,7 +581,7 @@ def show_detection(im_ms, cloud_mask, im_labels, shoreline,image_epsg, georef, # if check_detection is True, let user manually accept/reject the images skip_image = False if settings['check_detection']: - + # set a key event to accept/reject the detections (see https://stackoverflow.com/a/15033071) # this variable needs to be mutable so we can access it after the keypress event key_event = {} @@ -607,7 +607,7 @@ def show_detection(im_ms, cloud_mask, im_labels, shoreline,image_epsg, georef, btn_skip.remove() btn_keep.remove() btn_esc.remove() - + # keep/skip image according to the pressed key, 'escape' to break the loop if key_event.get('pressed') == 'right': skip_image = False @@ -623,7 +623,7 @@ def show_detection(im_ms, cloud_mask, im_labels, shoreline,image_epsg, georef, # if save_figure is True, save a .jpg under /jpg_files/detection if settings['save_figure'] and not skip_image: - fig.savefig(os.path.join(filepath, date + '_' + satname + '.jpg'), dpi=200) + fig.savefig(os.path.join(filepath, date + '_' + satname + '.jpg'), dpi=150) # Don't close the figure window, but remove all axes and settings, ready for next plot for ax in fig.axes: @@ -671,7 +671,7 @@ def extract_shorelines(metadata, settings): sitename = settings['inputs']['sitename'] filepath_data = settings['inputs']['filepath'] # initialise output structure - output = dict([]) + output = dict([]) # create a subfolder to store the .jpg images showing the detection filepath_jpg = os.path.join(filepath_data, sitename, 'jpg_files', 'detection') if not os.path.exists(filepath_jpg): @@ -698,20 +698,22 @@ def extract_shorelines(metadata, settings): # load classifiers and set the pixel size if satname in ['L5','L7','L8']: - pixel_size = 15 - if settings['dark_sand']: + pixel_size = 15 + if settings['sand_color'] == 'dark': clf = joblib.load(os.path.join(os.getcwd(), 'classifiers', 'NN_4classes_Landsat_dark.pkl')) + elif settings['sand_color'] == 'bright': + clf = joblib.load(os.path.join(os.getcwd(), 'classifiers', 'NN_4classes_Landsat_bright.pkl')) else: clf = joblib.load(os.path.join(os.getcwd(), 'classifiers', 'NN_4classes_Landsat.pkl')) - + elif satname == 'S2': pixel_size = 10 clf = joblib.load(os.path.join(os.getcwd(), 'classifiers', 'NN_4classes_S2.pkl')) - - # convert settings['min_beach_area'] and settings['buffer_size'] from metres to pixels + + # convert settings['min_beach_area'] and settings['buffer_size'] from metres to pixels buffer_size_pixels = np.ceil(settings['buffer_size']/pixel_size) min_beach_area_pixels = np.ceil(settings['min_beach_area']/pixel_size**2) - + # loop through the images for i in range(len(filenames)): @@ -723,21 +725,33 @@ def extract_shorelines(metadata, settings): im_ms, georef, cloud_mask, im_extra, im_QA, im_nodata = SDS_preprocess.preprocess_single(fn, satname, settings['cloud_mask_issue']) # get image spatial reference system (epsg code) from metadata dict image_epsg = metadata[satname]['epsg'][i] + # define an advanced cloud mask (for L7 it takes into account the fact that diagonal + # bands of no data are not clouds) + if not satname == 'L7' or sum(sum(im_nodata))
== 0 or sum(sum(im_nodata)) > 0.5*im_nodata.size: + cloud_mask_adv = cloud_mask + else: + cloud_mask_adv = np.logical_xor(cloud_mask, im_nodata) + # calculate cloud cover - cloud_cover = np.divide(sum(sum(cloud_mask.astype(int))), + cloud_cover = np.divide(sum(sum(cloud_mask_adv.astype(int))), (cloud_mask.shape[0]*cloud_mask.shape[1])) # skip image if cloud cover is above threshold if cloud_cover > settings['cloud_thresh']: continue - # classify image in 4 classes (sand, whitewater, water, other) with NN classifier - im_classif, im_labels = classify_image_NN(im_ms, im_extra, cloud_mask, - min_beach_area_pixels, clf) - # calculate a buffer around the reference shoreline (if any has been digitised) im_ref_buffer = create_shoreline_buffer(cloud_mask.shape, georef, image_epsg, pixel_size, settings) + # when running the automated mode, skip image if cloudy pixels are found in the shoreline buffer + if not settings['check_detection'] and 'reference_shoreline' in settings.keys(): + if sum(sum(np.logical_and(im_ref_buffer, cloud_mask_adv))) > 0: + continue + + # classify image in 4 classes (sand, whitewater, water, other) with NN classifier + im_classif, im_labels = classify_image_NN(im_ms, im_extra, cloud_mask, + min_beach_area_pixels, clf) + # there are two options to map the contours: # if there are pixels in the 'sand' class --> use find_wl_contours2 (enhanced) # otherwise use find_wl_contours1 (traditional) @@ -756,7 +770,7 @@ def extract_shorelines(metadata, settings): continue # process the water contours into a shoreline - shoreline = process_shoreline(contours_mwi, cloud_mask, georef, image_epsg, settings) + shoreline = process_shoreline(contours_mwi, cloud_mask, georef, image_epsg, settings) # visualise the mapped shorelines, there are two options: # if settings['check_detection'] = True, shows the detection to the user for accept/reject @@ -804,7 +818,7 @@ def extract_shorelines(metadata, settings): gdf = SDS_tools.output_to_gdf(output) # set projection gdf.crs = {'init':'epsg:'+str(settings['output_epsg'])} - # save as geojson + # save as geojson gdf.to_file(os.path.join(filepath, sitename + '_output.geojson'), driver='GeoJSON', encoding='utf-8') return output diff --git a/coastsat/SDS_tools.py b/coastsat/SDS_tools.py index b2095da..7e86598 100644 --- a/coastsat/SDS_tools.py +++ b/coastsat/SDS_tools.py @@ -464,23 +464,26 @@ def output_to_gdf(output): """ # loop through the mapped shorelines + counter = 0 for i in range(len(output['shorelines'])): # skip if the shoreline is empty if len(output['shorelines'][i]) == 0: continue - # save the geometry + attributes - geom = geometry.LineString(output['shorelines'][i]) - gdf = gpd.GeoDataFrame(geometry=gpd.GeoSeries(geom)) - gdf.index = [i] - gdf.loc[i,'date'] = output['dates'][i].strftime('%Y-%m-%d %H:%M:%S') - gdf.loc[i,'satname'] = output['satname'][i] - gdf.loc[i,'geoaccuracy'] = output['geoaccuracy'][i] - gdf.loc[i,'cloud_cover'] = output['cloud_cover'][i] - # store into geodataframe - if i == 0: - gdf_all = gdf else: - gdf_all = gdf_all.append(gdf) + else: + # save the geometry + attributes + geom = geometry.LineString(output['shorelines'][i]) + gdf = gpd.GeoDataFrame(geometry=gpd.GeoSeries(geom)) + gdf.index = [i] + gdf.loc[i,'date'] = output['dates'][i].strftime('%Y-%m-%d %H:%M:%S') + gdf.loc[i,'satname'] = output['satname'][i] + gdf.loc[i,'geoaccuracy'] = output['geoaccuracy'][i] + gdf.loc[i,'cloud_cover'] = output['cloud_cover'][i] + # store into geodataframe + if counter == 0: + gdf_all = gdf + else: + gdf_all = gdf_all.append(gdf) +
counter = counter + 1 return gdf_all diff --git a/example.py b/example.py index 6b0c393..e632050 100644 --- a/example.py +++ b/example.py @@ -71,7 +71,7 @@ settings = { 'buffer_size': 150, # radius (in metres) of the buffer around sandy pixels considered in the shoreline detection 'min_length_sl': 200, # minimum length (in metres) of shoreline perimeter to be valid 'cloud_mask_issue': False, # switch this parameter to True if sand pixels are masked (in black) on many images - 'dark_sand': False, # only switch to True if your site has dark sand (e.g. black sand beach) + 'sand_color': 'default', # 'default', 'dark' (for grey/black sand beaches) or 'bright' (for white sand beaches) } # [OPTIONAL] preprocess images (cloud masking, pansharpening/down-sampling) diff --git a/example_jupyter.ipynb b/example_jupyter.ipynb index d2abc21..07a292c 100644 --- a/example_jupyter.ipynb +++ b/example_jupyter.ipynb @@ -6,14 +6,16 @@ "source": [ "# *CoastSat*: example at Narrabeen-Collaroy, Australia\n", "\n", - "This software is described in *Vos K., Splinter K.D., Harley M.D., Simmons J.A., Turner I.L. (submitted). CoastSat: a Google Earth Engine-enabled software to extract shorelines from publicly available satellite imagery, Environmental Modelling and Software*. It enables the users to extract time-series of shoreline change over the last 30+ years at their site of interest.\n", + "This software is described in detail in:\n", + "* Vos K., Splinter K.D., Harley M.D., Simmons J.A., Turner I.L. (2019). CoastSat: a Google Earth Engine-enabled Python toolkit to extract shorelines from publicly available satellite imagery. Environmental Modelling and Software. 122, 104528. https://doi.org/10.1016/j.envsoft.2019.104528\n", "\n", + "It enables users to extract time-series of shoreline change over the last 30+ years at their site of interest.\n", "There are three main steps:\n", - "- retrieval of the satellite images of the region of interest from Google Earth Engine\n", - "- extraction of the shorelines from the images using a sub-pixel resolution technique\n", - "- intersection of the 2D shorelines with shore-normal transects\n", + "1. Retrieval of the satellite images of the region of interest from Google Earth Engine\n", + "2. Shoreline extraction at sub-pixel resolution\n", + "3. Intersection of the shorelines with cross-shore transects\n", "\n", - "## 1. Initial settings\n", + "## Initial settings\n", "\n", "Refer to the **Installation** section of the README for instructions on how to install the Python packages necessary to run the software, including Google Earth Engine Python API. If that step has been completed correctly, the following packages should be imported without any problem." ] @@ -37,7 +39,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## 2. Retrieval of the images from GEE\n", + "## 1. Retrieval of the images from GEE\n", "\n", "Define the region of interest (`polygon`), the date range (`dates`) and the satellite missions (`sat_list`) from which you wish to retrieve the satellite images. The images will be cropped on the Google Earth Engine server and only the region of interest will be downloaded as a .tif file. The files will be stored in the directory defined in `filepath`.\n", "\n", @@ -104,7 +106,7 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## 3. Shoreline extraction\n", + "## 2. Shoreline extraction\n", "\n", "This section maps the position of the shoreline on the satellite images.
The user can define the cloud threshold (`cloud_thresh`) and select the spatial reference system in which to output the coordinates of the mapped shorelines (`output_epsg`). See http://spatialreference.org/ to find the EPSG number corresponding to your local coordinate system. Make sure that you are using cartesian coordinates and not spherical coordinates (lat,lon) like WGS84. \n", "\n", @@ -133,7 +135,7 @@ " 'buffer_size': 150, # radius (in metres) of the buffer around sandy pixels considered in the shoreline detection\n", " 'min_length_sl': 200, # minimum length (in metres) of shoreline perimeter to be valid\n", " 'cloud_mask_issue': False, # switch this parameter to True if sand pixels are masked (in black) on many images \n", - " 'dark_sand': False, # only switch to True if your site has dark sand (e.g. black sand beach)\n", + " 'sand_color': 'default', # 'default', 'dark' (for grey/black sand beaches) or 'bright' (for white sand beaches)\n", "}" ] }, @@ -178,7 +180,9 @@ "metadata": {}, "source": [ "### Batch shoreline detection\n", - "Extracts the 2D shorelines from the images in the spatial reference system specified by the user in `'output_epsg'`. The mapped shorelines are saved into `output.pkl` (under *./data/sitename*) and `output.geojson` to use in GIS softwares." + "Extracts the 2D shorelines from the images in the spatial reference system specified by the user in `'output_epsg'`. The mapped shorelines are saved into `output.pkl` (under *./data/sitename*) and `output.geojson` (to be used in GIS software).\n", + "\n", + "If you see that the sand pixels on the images are not being identified, change the parameter `sand_color` from `default` to `dark` or `bright` depending on the color of your beach. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## 4. Shoreline analysis\n", + "## 3. Shoreline analysis\n", "\n", "In this section we show how to compute time-series of cross-shore distance along user-defined shore-normal transects." ] @@ -371,7 +375,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.7.3" + "version": "3.6.7" }, "varInspector": { "cols": { diff --git a/examples/doc/batch_detection.gif b/examples/doc/batch_detection.gif deleted file mode 100644 index 10a4534..0000000 Binary files a/examples/doc/batch_detection.gif and /dev/null differ
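Finally, since the boolean `dark_sand` flag is replaced by the three-valued `sand_color` setting across `SDS_shoreline.py`, `example.py` and the notebook, the Landsat classifier lookup added in `extract_shorelines` can be condensed as below. This is a sketch of the branch shown in the `SDS_shoreline.py` hunk above, assuming the repository's `classifiers` folder sits in the current working directory:

```
import os
import joblib  # used in SDS_shoreline.py to load the pre-trained classifiers

# one classifier file per sand colour, mirroring the if/elif/else added above
landsat_classifiers = {
    'default': 'NN_4classes_Landsat.pkl',
    'dark':    'NN_4classes_Landsat_dark.pkl',    # grey/black sand beaches
    'bright':  'NN_4classes_Landsat_bright.pkl',  # white sand beaches
}

sand_color = 'default'  # value normally read from settings['sand_color']
clf = joblib.load(os.path.join(os.getcwd(), 'classifiers',
                               landsat_classifiers[sand_color]))
```

A dict lookup and the explicit if/elif/else are interchangeable here; the diff keeps the conditional form so the Sentinel-2 branch can also set `pixel_size`.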