update from Github

master · Kilian Vos · 5 years ago · parent 0939021f3c · commit b32d7d1239

@ -1,26 +1,27 @@
# CoastSat
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.2779294.svg)](https://doi.org/10.5281/zenodo.2779294)
CoastSat is an open-source software toolkit written in Python that enables users to obtain time-series of shoreline position at any coastline worldwide from 30+ years (and growing) of publicly available satellite imagery.
![Alt text](https://github.com/kvos/CoastSat/blob/development/examples/doc/example.gif)
The underlying approach and application of the CoastSat toolkit are described in detail in:
* Vos K., Splinter K.D., Harley M.D., Simmons J.A., Turner I.L. (2019). CoastSat: a Google Earth Engine-enabled Python toolkit to extract shorelines from publicly available satellite imagery. *Environmental Modelling and Software*. 122, 104528. https://doi.org/10.1016/j.envsoft.2019.104528
Example applications and accuracy of the resulting satellite-derived shorelines are discussed in:
* Vos K., Harley M.D., Splinter K.D., Simmons J.A., Turner I.L. (2019). Sub-annual to multi-decadal shoreline variability from publicly available satellite imagery. *Coastal Engineering*. 150, 160–174. https://doi.org/10.1016/j.coastaleng.2019.04.004
The toolbox has three main functionalities:
- assisted retrieval from Google Earth Engine of all available satellite images spanning the user-defined region of interest and time period
- automated extraction of shorelines from all the selected images using a sub-pixel resolution technique
- intersection of the 2D shorelines with user-defined shore-normal transects
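In code, the three functionalities map onto the toolkit's modules roughly as follows (a minimal sketch: the `inputs` and `settings` dictionaries are described in section 2 below, `retrieve_images` is assumed to return the `metadata` dictionary consumed by `extract_shorelines` as in the example script, and the transect step is left as a comment because its function is not shown in this commit):
```
from coastsat import SDS_download, SDS_shoreline

# 1. assisted retrieval of the satellite images from Google Earth Engine
metadata = SDS_download.retrieve_images(inputs)
# 2. automated extraction of the shorelines with the sub-pixel resolution technique
output = SDS_shoreline.extract_shorelines(metadata, settings)
# 3. intersection of the 2D shorelines with user-defined shore-normal transects (section 2.3)
```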
### Description
Satellite remote sensing can provide low-cost long-term shoreline data capable of resolving the temporal scales of interest to coastal scientists and engineers at sites where no in-situ field measurements are available. Satellite imagery spanning the last 30 years with constant revisit periods is publicly available and suitable to extract repeated measurements of the shoreline position.
CoastSat enables the non-expert user to extract shorelines from Landsat 5, Landsat 7, Landsat 8 and Sentinel-2 images.
The shoreline detection algorithm implemented in CoastSat is optimised for sandy beach coastlines. It combines a sub-pixel border segmentation and an image classification component, which refines the segmentation into four distinct categories such that the shoreline detection is specific to the sand/water interface.
**If you like the repo put a star on it!**
## 1. Installation
@ -44,21 +45,17 @@ conda activate coastsat
To confirm that you have successfully activated CoastSat, your terminal command line prompt should now start with `(coastsat)`.
**In case errors are raised:** open the **Anaconda Navigator**; in the *Environments* tab, click on *Import* and select the *environment.yml* file. For more details, the following [link](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) shows how to create and manage an environment with Anaconda.
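Equivalently, the environment can be created and activated from the Anaconda Prompt (a sketch, assuming the prompt is opened in the CoastSat folder that contains *environment.yml*):
```
conda env create -f environment.yml -n coastsat
conda activate coastsat
```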
### 1.2 Activate Google Earth Engine Python API
Go to https://earthengine.google.com and sign up to Google Earth Engine (GEE).
![gee_capture](https://user-images.githubusercontent.com/7217258/49348457-a9271300-f6f9-11e8-8c0b-407383940e94.jpg)
With the `coastsat` environment activated, run the following command on the Anaconda Prompt to link your environment to the GEE server:
```
earthengine authenticate
```
A web browser will open; log in with a Gmail account and accept the terms and conditions. Then copy the authorization code into the Anaconda terminal.
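To verify that the credentials were stored correctly, you can initialise the API from Python (a quick sanity check using the same `ee.Initialize()` call that `retrieve_images` runs internally, as shown later in this commit):
```
import ee
ee.Initialize()  # raises an exception if the stored GEE credentials are invalid
```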
Now you are ready to start using the CoastSat toolbox!
@ -78,9 +75,9 @@ The following sections guide the reader through the different functionalities of
If using `example.py` on **Spyder**, make sure that the Graphics Backend is set to **Automatic** and not **Inline** (as this mode doesn't allow you to interact with the figures). To change this setting go under Preferences>IPython console>Graphics.
A Jupyter Notebook combines formatted text and code. To run the code, place your cursor inside one of the code sections and click on the `run cell` button (shown below) to progress forward.
![run_cell](https://user-images.githubusercontent.com/7217258/60766570-c2100080-a0ee-11e9-9675-e2aeba87e4a7.png)
### 2.1 Retrieval of the satellite images
@ -104,24 +101,22 @@ The screenshot below shows an example of inputs that will retrieve all the image
To map the shorelines, the following user-defined settings are needed:
- `cloud_thresh`: threshold on maximum cloud cover that is acceptable on the images (value between 0 and 1 - this may require some initial experimentation).
- `output_epsg`: epsg code defining the spatial reference system of the shoreline coordinates. It has to be a cartesian coordinate system (i.e. projected) and not a geographical coordinate system (in latitude and longitude angles). See http://spatialreference.org/ to find the EPSG number corresponding to your local coordinate system.
- `check_detection`: if set to `True` the user can quality control each shoreline detection interactively (recommended when mapping shorelines for the first time, as it shows how CoastSat maps the shorelines).
- `save_figure`: if set to `True` a figure of each mapped shoreline is saved (under *filepath/sitename/jpg_files/detection*). Note that this may slow down the process.
There are additional parameters (`min_beach_size`, `buffer_size`, `min_length_sl`, `cloud_mask_issue` and `sand_color`) that can be tuned to optimise the shoreline detection (for advanced users only). For the moment, leave these parameters set to their default values; we will see later how they can be modified.
An example of settings is provided here:
![settings](https://user-images.githubusercontent.com/7217258/65950715-f68f2080-e481-11e9-80b6-19e13f2ec179.PNG)
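In code, the settings from the screenshot take the form of a dictionary (a sketch; the EPSG code and cloud threshold are placeholder values to adapt to your own site, and `inputs` is the dictionary defined in section 2.1):
```
settings = {
    # general parameters
    'cloud_thresh': 0.5,      # maximum fraction of cloud cover accepted
    'output_epsg': 28356,     # EPSG code of a local projected coordinate system
    # quality control
    'check_detection': True,  # interactively accept/reject each mapped shoreline
    'save_figure': True,      # save a .jpg of each mapped shoreline
    # inputs defined in section 2.1 (sitename, polygon, dates, sat_list, filepath)
    'inputs': inputs,
}
```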
Once all the settings have been defined, the batch shoreline detection can be launched by calling:
```
output = SDS_shoreline.extract_shorelines(metadata, settings)
```
When `check_detection` is set to `True`, a figure like the one below appears and asks the user to manually accept/reject each detection by pressing **on the keyboard** the `right arrow` (⇨) to `keep` the shoreline or `left arrow` (⇦) to `skip` the mapped shoreline. The user can break the loop at any time by pressing `escape` (nothing will be saved though).
![map_shorelines](https://user-images.githubusercontent.com/7217258/60766769-fafda480-a0f1-11e9-8f91-419d848ff98d.gif)
Once all the shorelines have been mapped, the output is available in two different formats (saved under *.\data\sitename*):
- `sitename_output.pkl`: contains a list with the shoreline coordinates, the exact timestamp at which the image was captured (UTC time), the geometric accuracy and the cloud cover of each individual image. This list can be manipulated with Python; a snippet of code to plot the results is provided in the example script and sketched below.
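As a sketch of how this `.pkl` output can be inspected with Python (the dictionary keys `'shorelines'` and `'dates'` are assumptions based on the description above; the example script contains the authoritative snippet):
```
import os
import pickle
import matplotlib.pyplot as plt

sitename = 'EXAMPLE'  # placeholder site name
with open(os.path.join('data', sitename, sitename + '_output.pkl'), 'rb') as f:
    output = pickle.load(f)

# plot every mapped shoreline, labelled with the timestamp (UTC) of its image
for sl, date in zip(output['shorelines'], output['dates']):
    plt.plot(sl[:, 0], sl[:, 1], label=str(date))
plt.axis('equal')
plt.legend(fontsize=6)
plt.show()
```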
@ -133,7 +128,7 @@ The figure below shows how the satellite-derived shorelines can be opened in a G
#### Reference shoreline
Before running the batch shoreline detection, there is the option to manually digitize a reference shoreline on one cloud-free image. This reference shoreline helps to reject outliers and false detections when mapping shorelines as it only considers as valid shorelines the points that are within a defined distance from this reference shoreline.
The user can manually digitize a reference shoreline on one of the images by calling:
```
settings['reference_shoreline'] = SDS_preprocess.get_reference_sl(metadata, settings)
settings['max_dist_ref'] = 100 # max distance (in meters) allowed from the reference shoreline
```
This function allows the user to click points along the shoreline on one of the satellite images, as shown in the animation below.
![reference_shoreline](https://user-images.githubusercontent.com/7217258/60766913-6c8a2280-a0f3-11e9-89e5-865e11aa26cd.gif)
The maximum distance (in metres) allowed from the reference shoreline is defined by the parameter `max_dist_ref`. This parameter is set to a default value of 100 m. If you think that a 100 m buffer from the reference shoreline will not capture the shoreline variability at your site, increase the value of this parameter. This may be the case for large nourishments or eroding/accreting coastlines.
#### Advanced shoreline detection parameters
@ -153,7 +148,7 @@ As mentioned above, there are some additional parameters that can be modified to
- `buffer_size`: radius (in metres) that defines the buffer around sandy pixels that is considered to calculate the sand/water threshold. The default value of `buffer_size` is 150 m. This parameter should be increased if you have a very wide (>150 m) surf zone or inter-tidal zone.
- `min_length_sl`: minimum length (in metres) of shoreline perimeter to be valid. This can be used to discard small features that are detected but do not correspond to the actual shoreline. The default value is 200 m. If the shoreline that you are trying to map is shorter than 200 m, decrease the value of this parameter.
- `cloud_mask_issue`: the cloud mask algorithm applied to Landsat images by USGS, namely CFMASK, sometimes has difficulties with very bright features such as beaches or white-water in the ocean. This may result in pixels corresponding to a beach being identified as clouds and appearing as masked pixels on your images. If this issue seems to be present in a large proportion of images from your local beach, you can switch this parameter to `True` and CoastSat will remove from the cloud mask the pixels that form very thin linear features, as these are often beaches and not clouds. Only activate this parameter if you observe this very specific cloud mask issue; otherwise leave it set to the default value of `False`.
- `sand_color`: this parameter can take 3 values: `default`, `dark` or `bright`. Only change this parameter if you are seeing that with `default` the sand pixels are not being classified as sand (in orange). If your beach has dark sand (grey/black sand beaches), set this parameter to `dark` and the classifier will be able to pick up the dark sand. On the other hand, if your beach has white sand and the `default` classifier is not picking it up, switch this parameter to `bright`. At this stage this option is only available for Landsat images (soon for Sentinel-2 as well).
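Overriding these advanced parameters is just a matter of updating the settings dictionary before calling `extract_shorelines` (a sketch; the values shown are the documented defaults, except `sand_color`, which is switched here as if for a grey/black sand beach):
```
settings['buffer_size'] = 150         # metres around sandy pixels for the sand/water threshold
settings['min_length_sl'] = 200       # minimum shoreline perimeter length (metres)
settings['cloud_mask_issue'] = False  # set True only if beach pixels are being masked as cloud
settings['sand_color'] = 'dark'       # 'default', 'dark' or 'bright'
```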
### 2.3 Shoreline change analysis
@ -191,12 +186,17 @@ An example is shown in the animation below:
Having a problem? Post an issue in the [Issues page](https://github.com/kvos/coastsat/issues) (please do not email).
## Contributing
If you are willing to contribute, check out our todo list in the [Projects page](https://github.com/kvos/CoastSat/projects/1).
1. Fork the repository (https://github.com/kvos/coastsat/fork).
A fork is a copy on which you can make your changes.
2. Create a new branch on your fork
3. Commit your changes and push them to your branch
4. When the branch is ready to be merged, create a Pull Request (how to make a clean pull request is explained [here](https://gist.github.com/MarcDiethelm/7303312))
## References
1. Vos K., Harley M.D., Splinter K.D., Simmons J.A., Turner I.L. (2019). Sub-annual to multi-decadal shoreline variability from publicly available satellite imagery. *Coastal Engineering*. 150, 160–174. https://doi.org/10.1016/j.coastaleng.2019.04.004
2. Vos K., Splinter K.D., Harley M.D., Simmons J.A., Turner I.L. (2019). CoastSat: a Google Earth Engine-enabled Python toolkit to extract shorelines from publicly available satellite imagery. *Environmental Modelling and Software*. 122, 104528. https://doi.org/10.1016/j.envsoft.2019.104528
3. Training dataset used for pixel classification in CoastSat: https://doi.org/10.5281/zenodo.3334147

@ -29,9 +29,6 @@ from coastsat import SDS_preprocess, SDS_tools
np.seterr(all='ignore') # raise/ignore divisions by 0 and nans
def download_tif(image, polygon, bandsId, filepath):
"""
Downloads a .TIF image from the ee server and stores it in a temp file
@ -99,6 +96,9 @@ def retrieve_images(inputs):
"""
# initialise connection with GEE server
ee.Initialize()
# read inputs dictionary
sitename = inputs['sitename']
polygon = inputs['polygon']
@ -131,14 +131,21 @@ def retrieve_images(inputs):
if not os.path.exists(filepath):
    os.makedirs(filepath)
if not os.path.exists(filepath_meta):
    os.makedirs(filepath_meta)
# Landsat 5 collection
count_loop = 0
while count_loop < 1:
    try:
        input_col = ee.ImageCollection('LANDSAT/LT05/C01/T1_TOA')
        # filter by location and dates
        flt_col = input_col.filterBounds(ee.Geometry.Polygon(polygon)).filterDate(dates[0],dates[1])
        # get all images in the filtered collection
        im_all = flt_col.getInfo().get('features')
        count_loop = 1
    except:
        # the GEE request occasionally fails, so retry until it succeeds
        count_loop = 0
# remove very cloudy images (>95% cloud)
cloud_cover = [_['properties']['CLOUD_COVER'] for _ in im_all]
if np.any([_ > 95 for _ in cloud_cover]):
@ -157,9 +164,14 @@ def retrieve_images(inputs):
all_names = []
im_epsg = []
for i in range(n_img):
count_loop = 0
while count_loop < 1:
    try:
        # find each image in ee database
        im = ee.Image(im_col[i]['id'])
        count_loop = 1
    except:
        count_loop = 0
# read metadata
im_dic = im_col[i]
# get bands
@ -189,7 +201,13 @@ def retrieve_images(inputs):
all_names.append(filename)
filenames.append(filename)
# download .TIF image
count_loop = 0
while count_loop < 1:
    try:
        local_data = download_tif(im, polygon, ms_bands, filepath)
        count_loop = 1
    except:
        count_loop = 0
# update filename
try:
os.rename(local_data, os.path.join(filepath, filename))
@ -237,11 +255,18 @@ def retrieve_images(inputs):
os.makedirs(filepath_meta)
# landsat 7 collection
count_loop = 0
while count_loop < 1:
    try:
        input_col = ee.ImageCollection('LANDSAT/LE07/C01/T1_RT_TOA')
        # filter by location and dates
        flt_col = input_col.filterBounds(ee.Geometry.Polygon(polygon)).filterDate(dates[0],dates[1])
        # get all images in the filtered collection
        im_all = flt_col.getInfo().get('features')
        count_loop = 1
    except:
        count_loop = 0
# remove very cloudy images (>95% cloud)
cloud_cover = [_['properties']['CLOUD_COVER'] for _ in im_all]
if np.any([_ > 95 for _ in cloud_cover]):
@ -261,8 +286,14 @@ def retrieve_images(inputs):
im_epsg = []
for i in range(n_img):
count_loop = 0
while count_loop < 1:
    try:
        # find each image in ee database
        im = ee.Image(im_col[i]['id'])
        count_loop = 1
    except:
        count_loop = 0
# read metadata
im_dic = im_col[i]
# get bands
@ -295,8 +326,14 @@ def retrieve_images(inputs):
all_names.append(filename_pan)
filenames.append(filename_pan)
# download .TIF image
count_loop = 0
while count_loop < 1:
    try:
        local_data_pan = download_tif(im, polygon, pan_band, filepath_pan)
        local_data_ms = download_tif(im, polygon, ms_bands, filepath_ms)
        count_loop = 1
    except:
        count_loop = 0
# update filename
try:
os.rename(local_data_pan, os.path.join(filepath_pan, filename_pan))
@ -349,11 +386,18 @@ def retrieve_images(inputs):
os.makedirs(filepath_meta)
# landsat 8 collection
count_loop = 0
while count_loop < 1:
    try:
        input_col = ee.ImageCollection('LANDSAT/LC08/C01/T1_RT_TOA')
        # filter by location and dates
        flt_col = input_col.filterBounds(ee.Geometry.Polygon(polygon)).filterDate(dates[0],dates[1])
        # get all images in the filtered collection
        im_all = flt_col.getInfo().get('features')
        count_loop = 1
    except:
        count_loop = 0
# remove very cloudy images (>95% cloud)
cloud_cover = [_['properties']['CLOUD_COVER'] for _ in im_all]
if np.any([_ > 95 for _ in cloud_cover]):
@ -373,8 +417,14 @@ def retrieve_images(inputs):
im_epsg = []
for i in range(n_img):
count_loop = 0
while count_loop < 1:
    try:
        # find each image in ee database
        im = ee.Image(im_col[i]['id'])
        count_loop = 1
    except:
        count_loop = 0
# read metadata
im_dic = im_col[i]
# get bands
@ -407,8 +457,15 @@ def retrieve_images(inputs):
all_names.append(filename_pan)
filenames.append(filename_pan)
# download .TIF image
count_loop = 0
while count_loop < 1:
    try:
        local_data_pan = download_tif(im, polygon, pan_band, filepath_pan)
        local_data_ms = download_tif(im, polygon, ms_bands, filepath_ms)
        count_loop = 1
    except:
        count_loop = 0
# update filename
try:
os.rename(local_data_pan, os.path.join(filepath_pan, filename_pan))
@ -461,11 +518,18 @@ def retrieve_images(inputs):
os.makedirs(filepath_meta)
# Sentinel2 collection
count_loop = 0
while count_loop < 1:
    try:
        input_col = ee.ImageCollection('COPERNICUS/S2')
        # filter by location and dates
        flt_col = input_col.filterBounds(ee.Geometry.Polygon(polygon)).filterDate(dates[0],dates[1])
        # get all images in the filtered collection
        im_all = flt_col.getInfo().get('features')
        count_loop = 1
    except:
        count_loop = 0
# remove duplicates in the collection (there are many in S2 collection)
timestamps = [datetime.fromtimestamp(_['properties']['system:time_start']/1000,
tz=pytz.utc) for _ in im_all]
@ -516,9 +580,14 @@ def retrieve_images(inputs):
all_names = []
im_epsg = []
for i in range(n_img):
count_loop = 0
while count_loop < 1:
    try:
        # find each image in ee database
        im = ee.Image(im_col[i]['id'])
        count_loop = 1
    except:
        count_loop = 0
# read metadata
im_dic = im_col[i]
# get bands
@ -547,21 +616,42 @@ def retrieve_images(inputs):
filenames.append(filename10)
# download .TIF image and update filename
count_loop = 0
while count_loop < 1:
    try:
        local_data = download_tif(im, polygon, bands10, os.path.join(filepath, '10m'))
        count_loop = 1
    except:
        count_loop = 0
try:
    os.rename(local_data, os.path.join(filepath, '10m', filename10))
except: # overwrite the file if it already exists
    os.remove(os.path.join(filepath, '10m', filename10))
    os.rename(local_data, os.path.join(filepath, '10m', filename10))
count_loop = 0
while count_loop < 1:
    try:
        local_data = download_tif(im, polygon, bands20, os.path.join(filepath, '20m'))
        count_loop = 1
    except:
        count_loop = 0
try:
    os.rename(local_data, os.path.join(filepath, '20m', filename20))
except: # overwrite the file if it already exists
    os.remove(os.path.join(filepath, '20m', filename20))
    os.rename(local_data, os.path.join(filepath, '20m', filename20))
count_loop = 0
while count_loop < 1:
    try:
        local_data = download_tif(im, polygon, bands60, os.path.join(filepath, '60m'))
        count_loop = 1
    except:
        count_loop = 0
try:
    os.rename(local_data, os.path.join(filepath, '60m', filename60))
except:

@ -1,7 +1,7 @@
"""This module contains all the functions needed to preprocess the satellite images before the
shoreline can be extracted. This includes creating a cloud mask and
pansharpening/downsampling the multispectral bands.
shoreline can be extracted. This includes creating a cloud mask and
pansharpening/downsampling the multispectral bands.
Author: Kilian Vos, Water Research Laboratory, University of New South Wales
"""
@ -32,9 +32,9 @@ np.seterr(all='ignore') # raise/ignore divisions by 0 and nans
def create_cloud_mask(im_QA, satname, cloud_mask_issue):
"""
Creates a cloud mask using the information contained in the QA band.
KV WRL 2018
Arguments:
-----------
im_QA: np.array
@ -43,13 +43,13 @@ def create_cloud_mask(im_QA, satname, cloud_mask_issue):
short name for the satellite (L5, L7, L8 or S2)
cloud_mask_issue: boolean
True if there is an issue with the cloud mask and sand pixels are being masked on the images
Returns:
-----------
cloud_mask : np.array
A boolean array with True if a pixel is cloudy and False otherwise
"""
# convert QA bits (the bits allocated to cloud cover vary depending on the satellite mission)
if satname == 'L8':
cloud_values = [2800, 2804, 2808, 2812, 6896, 6900, 6904, 6908]
@ -57,26 +57,26 @@ def create_cloud_mask(im_QA, satname, cloud_mask_issue):
cloud_values = [752, 756, 760, 764]
elif satname == 'S2':
cloud_values = [1024, 2048] # 1024 = dense cloud, 2048 = cirrus clouds
# find which pixels have bits corresponding to cloud values
cloud_mask = np.isin(im_QA, cloud_values)
# remove cloud pixels that form very thin features. These are beach or swash pixels that are
# erroneously identified as clouds by the CFMASK algorithm applied to the images by the USGS.
if sum(sum(cloud_mask)) > 0 and sum(sum(~cloud_mask)) > 0:
    morphology.remove_small_objects(cloud_mask, min_size=10, connectivity=1, in_place=True)
if cloud_mask_issue:
    elem = morphology.square(3) # use a square of width 3 pixels
    cloud_mask = morphology.binary_opening(cloud_mask, elem) # perform image opening
    # remove objects with less than 25 connected pixels
    morphology.remove_small_objects(cloud_mask, min_size=25, connectivity=1, in_place=True)
return cloud_mask
def hist_match(source, template):
"""
Adjust the pixel values of a grayscale image such that its histogram matches that of a
target image.
Arguments:
@ -118,11 +118,11 @@ def hist_match(source, template):
def pansharpen(im_ms, im_pan, cloud_mask):
"""
Pansharpens a multispectral image, using the panchromatic band and a cloud mask.
A PCA is applied to the image, then the 1st PC is replaced with the panchromatic band.
Note that it is essential to match the histograms of the 1st PC and the panchromatic band
before replacing and inverting the PCA.
KV WRL 2018
Arguments:
@ -133,13 +133,13 @@ def pansharpen(im_ms, im_pan, cloud_mask):
Panchromatic band (2D)
cloud_mask: np.array
2D cloud mask with True where cloud pixels are
Returns:
-----------
im_ms_ps: np.ndarray
Pansharpened multispectral image (3D)
"""
# reshape image into vector and apply cloud mask
vec = im_ms.reshape(im_ms.shape[0] * im_ms.shape[1], im_ms.shape[2])
vec_mask = cloud_mask.reshape(im_ms.shape[0] * im_ms.shape[1])
@ -147,13 +147,13 @@ def pansharpen(im_ms, im_pan, cloud_mask):
# apply PCA to multispectral bands
pca = decomposition.PCA()
vec_pcs = pca.fit_transform(vec)
# replace 1st PC with pan band (after matching histograms)
vec_pan = im_pan.reshape(im_pan.shape[0] * im_pan.shape[1])
vec_pan = vec_pan[~vec_mask]
vec_pcs[:,0] = hist_match(vec_pan, vec_pcs[:,0])
vec_ms_ps = pca.inverse_transform(vec_pcs)
# reshape vector into image
vec_ms_ps_full = np.ones((len(vec_mask), im_ms.shape[2])) * np.nan
vec_ms_ps_full[~vec_mask,:] = vec_ms_ps
@ -167,7 +167,7 @@ def rescale_image_intensity(im, cloud_mask, prob_high):
Rescales the intensity of an image (multispectral or single band) by applying
a cloud mask and clipping the prob_high upper percentile. This function allows you
to stretch the contrast of an image, only for visualisation purposes.
KV WRL 2018
Arguments:
@ -178,23 +178,23 @@ def rescale_image_intensity(im, cloud_mask, prob_high):
2D cloud mask with True where cloud pixels are
prob_high: float
probability of exceedance used to calculate the upper percentile
Returns:
-----------
im_adj: np.array
The rescaled image
"""
# lower percentile is set to 0
prc_low = 0
# reshape the 2D cloud mask into a 1D vector
vec_mask = cloud_mask.reshape(im.shape[0] * im.shape[1])
# if image contains several bands, stretch the contrast for each band
if len(im.shape) > 2:
# reshape into a vector
vec = im.reshape(im.shape[0] * im.shape[1], im.shape[2])
# initialise with NaN values
vec_adj = np.ones((len(vec_mask), im.shape[2])) * np.nan
# loop through the bands
@ -207,7 +207,7 @@ def rescale_image_intensity(im, cloud_mask, prob_high):
vec_adj[~vec_mask,i] = vec_rescaled
# reshape into image
im_adj = vec_adj.reshape(im.shape[0], im.shape[1], im.shape[2])
# if image only has 1 band (grayscale image)
else:
vec = im.reshape(im.shape[0] * im.shape[1])
@ -221,11 +221,11 @@ def rescale_image_intensity(im, cloud_mask, prob_high):
def preprocess_single(fn, satname, cloud_mask_issue):
"""
Reads the image and outputs the pansharpened/down-sampled multispectral bands, the
georeferencing vector of the image (coordinates of the upper left pixel), the cloud mask and
the QA band. For Landsat 7-8 it also outputs the panchromatic band and for Sentinel-2 it also
outputs the 20m SWIR band.
KV WRL 2018
Arguments:
@ -238,41 +238,41 @@ def preprocess_single(fn, satname, cloud_mask_issue):
name of the satellite mission (e.g., 'L5')
cloud_mask_issue: boolean
True if there is an issue with the cloud mask and sand pixels are being masked on the images
Returns:
-----------
im_ms: np.array
3D array containing the pansharpened/down-sampled bands (B,G,R,NIR,SWIR1)
georef: np.array
vector of 6 elements [Xtr, Xscale, Xshear, Ytr, Yshear, Yscale] defining the
coordinates of the top-left pixel of the image
cloud_mask: np.array
2D cloud mask with True where cloud pixels are
im_extra : np.array
2D array containing the 20m resolution SWIR band for Sentinel-2 and the 15m resolution
panchromatic band for Landsat 7 and Landsat 8. This field is empty for Landsat 5.
im_QA: np.array
2D array containing the QA band, from which the cloud_mask can be computed.
im_nodata: np.array
2D array with True where no data values (-inf) are located
"""
#=============================================================================================#
# L5 images
#=============================================================================================#
if satname == 'L5':
# read all bands
data = gdal.Open(fn, gdal.GA_ReadOnly)
georef = np.array(data.GetGeoTransform())
bands = [data.GetRasterBand(k + 1).ReadAsArray() for k in range(data.RasterCount)]
im_ms = np.stack(bands, 2)
# down-sample to 15 m (half of the original pixel size)
nrows = im_ms.shape[0]*2
ncols = im_ms.shape[1]*2
# create cloud mask
im_QA = im_ms[:,:,5]
im_ms = im_ms[:,:,:-1]
@ -284,7 +284,7 @@ def preprocess_single(fn, satname, cloud_mask_issue):
# resize the image using nearest neighbour interpolation (order 0)
cloud_mask = transform.resize(cloud_mask, (nrows, ncols), order=0, preserve_range=True,
mode='constant').astype('bool_')
# adjust georeferencing vector to the new image size
# scale becomes 15m and the origin is adjusted to the center of new top left pixel
georef[1] = 15
@ -294,58 +294,70 @@ def preprocess_single(fn, satname, cloud_mask_issue):
# check if -inf or nan values on any band and add to cloud mask
im_nodata = np.zeros(cloud_mask.shape).astype(bool)
for k in range(im_ms.shape[2]):
    im_inf = np.isin(im_ms[:,:,k], -np.inf)
    im_nan = np.isnan(im_ms[:,:,k])
    cloud_mask = np.logical_or(np.logical_or(cloud_mask, im_inf), im_nan)
    im_nodata = np.logical_or(np.logical_or(im_nodata, im_inf), im_nan)
# check if there are pixels with 0 intensity in the Green, NIR and SWIR bands and add those
# to the cloud mask as otherwise they will cause errors when calculating the NDWI and MNDWI
im_zeros = np.ones(cloud_mask.shape).astype(bool)
for k in [1,3,4]: # loop through the Green, NIR and SWIR bands
im_zeros = np.logical_and(np.isin(im_ms[:,:,k],0), im_zeros)
# update cloud mask and nodata
cloud_mask = np.logical_or(im_zeros, cloud_mask)
im_nodata = np.logical_or(im_zeros, im_nodata)
# no extra image for Landsat 5 (they are all 30 m bands)
im_extra = []
#=============================================================================================#
# L7 images
#=============================================================================================#
elif satname == 'L7':
# read pan image
fn_pan = fn[0]
data = gdal.Open(fn_pan, gdal.GA_ReadOnly)
georef = np.array(data.GetGeoTransform())
bands = [data.GetRasterBand(k + 1).ReadAsArray() for k in range(data.RasterCount)]
im_pan = np.stack(bands, 2)[:,:,0]
# size of pan image
nrows = im_pan.shape[0]
ncols = im_pan.shape[1]
# read ms image
fn_ms = fn[1]
data = gdal.Open(fn_ms, gdal.GA_ReadOnly)
bands = [data.GetRasterBand(k + 1).ReadAsArray() for k in range(data.RasterCount)]
im_ms = np.stack(bands, 2)
# create cloud mask
im_QA = im_ms[:,:,5]
cloud_mask = create_cloud_mask(im_QA, satname, cloud_mask_issue)
# resize the image using bilinear interpolation (order 1)
im_ms = im_ms[:,:,:5]
im_ms = transform.resize(im_ms,(nrows, ncols), order=1, preserve_range=True,
mode='constant')
# resize the image using nearest neighbour interpolation (order 0)
cloud_mask = transform.resize(cloud_mask, (nrows, ncols), order=0, preserve_range=True,
                              mode='constant').astype('bool_')
# check if -inf or nan values on any band and eventually add those pixels to cloud mask
im_nodata = np.zeros(cloud_mask.shape).astype(bool)
for k in range(im_ms.shape[2]):
    im_inf = np.isin(im_ms[:,:,k], -np.inf)
    im_nan = np.isnan(im_ms[:,:,k])
    cloud_mask = np.logical_or(np.logical_or(cloud_mask, im_inf), im_nan)
    im_nodata = np.logical_or(np.logical_or(im_nodata, im_inf), im_nan)
# check if there are pixels with 0 intensity in the Green, NIR and SWIR bands and add those
# to the cloud mask as otherwise they will cause errors when calculating the NDWI and MNDWI
im_zeros = np.ones(cloud_mask.shape).astype(bool)
for k in [1,3,4]: # loop through the Green, NIR and SWIR bands
im_zeros = np.logical_and(np.isin(im_ms[:,:,k],0), im_zeros)
# update cloud mask and nodata
cloud_mask = np.logical_or(im_zeros, cloud_mask)
im_nodata = np.logical_or(im_zeros, im_nodata)
# pansharpen Green, Red, NIR (where there is overlapping with pan band in L7)
try:
@ -355,55 +367,59 @@ def preprocess_single(fn, satname, cloud_mask_issue):
# add downsampled Blue and SWIR1 bands
im_ms_ps = np.append(im_ms[:,:,[0]], im_ms_ps, axis=2)
im_ms_ps = np.append(im_ms_ps, im_ms[:,:,[4]], axis=2)
im_ms = im_ms_ps.copy()
# the extra image is the 15m panchromatic band
im_extra = im_pan
#=============================================================================================#
# L8 images
#=============================================================================================#
elif satname == 'L8':
# read pan image
fn_pan = fn[0]
data = gdal.Open(fn_pan, gdal.GA_ReadOnly)
georef = np.array(data.GetGeoTransform())
bands = [data.GetRasterBand(k + 1).ReadAsArray() for k in range(data.RasterCount)]
im_pan = np.stack(bands, 2)[:,:,0]
# size of pan image
nrows = im_pan.shape[0]
ncols = im_pan.shape[1]
# read ms image
fn_ms = fn[1]
data = gdal.Open(fn_ms, gdal.GA_ReadOnly)
bands = [data.GetRasterBand(k + 1).ReadAsArray() for k in range(data.RasterCount)]
im_ms = np.stack(bands, 2)
# create cloud mask
im_QA = im_ms[:,:,5]
cloud_mask = create_cloud_mask(im_QA, satname, cloud_mask_issue)
# resize the image using bilinear interpolation (order 1)
im_ms = im_ms[:,:,:5]
im_ms = transform.resize(im_ms,(nrows, ncols), order=1, preserve_range=True,
mode='constant')
# resize the image using nearest neighbour interpolation (order 0)
cloud_mask = transform.resize(cloud_mask, (nrows, ncols), order=0, preserve_range=True,
                              mode='constant').astype('bool_')
# check if -inf or nan values on any band and eventually add those pixels to cloud mask
im_nodata = np.zeros(cloud_mask.shape).astype(bool)
for k in range(im_ms.shape[2]):
    im_inf = np.isin(im_ms[:,:,k], -np.inf)
    im_nan = np.isnan(im_ms[:,:,k])
    cloud_mask = np.logical_or(np.logical_or(cloud_mask, im_inf), im_nan)
    im_nodata = np.logical_or(np.logical_or(im_nodata, im_inf), im_nan)
# check if there are pixels with 0 intensity in the Green, NIR and SWIR bands and add those
# to the cloud mask as otherwise they will cause errors when calculating the NDWI and MNDWI
im_zeros = np.ones(cloud_mask.shape).astype(bool)
for k in [1,3,4]: # loop through the Green, NIR and SWIR bands
im_zeros = np.logical_and(np.isin(im_ms[:,:,k],0), im_zeros)
# update cloud mask and nodata
cloud_mask = np.logical_or(im_zeros, cloud_mask)
im_nodata = np.logical_or(im_zeros, im_nodata)
# pansharpen Blue, Green, Red (where there is overlapping with pan band in L8)
try:
@ -412,16 +428,16 @@ def preprocess_single(fn, satname, cloud_mask_issue):
im_ms_ps = im_ms[:,:,[0,1,2]]
# add downsampled NIR and SWIR1 bands
im_ms_ps = np.append(im_ms_ps, im_ms[:,:,[3,4]], axis=2)
im_ms = im_ms_ps.copy()
# the extra image is the 15m panchromatic band
im_extra = im_pan
#=============================================================================================#
# S2 images
#=============================================================================================#
if satname == 'S2':
# read 10m bands (R,G,B,NIR)
fn10 = fn[0]
data = gdal.Open(fn10, gdal.GA_ReadOnly)
@ -429,7 +445,7 @@ def preprocess_single(fn, satname, cloud_mask_issue):
bands = [data.GetRasterBand(k + 1).ReadAsArray() for k in range(data.RasterCount)]
im10 = np.stack(bands, 2)
im10 = im10/10000 # TOA scaled to 10000
# if image contains only zeros (can happen with S2), skip the image
if sum(sum(sum(im10))) < 1:
im_ms = []
@ -437,11 +453,11 @@ def preprocess_single(fn, satname, cloud_mask_issue):
# skip the image by giving it a full cloud_mask
cloud_mask = np.ones((im10.shape[0],im10.shape[1])).astype('bool')
return im_ms, georef, cloud_mask, [], [], []
# size of 10m bands
nrows = im10.shape[0]
ncols = im10.shape[1]
# read 20m band (SWIR1)
fn20 = fn[1]
data = gdal.Open(fn20, gdal.GA_ReadOnly)
@ -449,15 +465,15 @@ def preprocess_single(fn, satname, cloud_mask_issue):
im20 = np.stack(bands, 2)
im20 = im20[:,:,0]
im20 = im20/10000 # TOA scaled to 10000
# resize the image using bilinear interpolation (order 1)
im_swir = transform.resize(im20, (nrows, ncols), order=1, preserve_range=True,
mode='constant')
im_swir = np.expand_dims(im_swir, axis=2)
# append down-sampled SWIR1 band to the other 10m bands
im_ms = np.append(im10, im_swir, axis=2)
# create cloud mask using 60m QA band (not as good as Landsat cloud cover)
fn60 = fn[2]
data = gdal.Open(fn60, gdal.GA_ReadOnly)
@ -470,23 +486,32 @@ def preprocess_single(fn, satname, cloud_mask_issue):
mode='constant')
# check if -inf or nan values on any band and add to cloud mask
im_nodata = np.zeros(cloud_mask.shape).astype(bool)
for k in range(im_ms.shape[2]):
    im_inf = np.isin(im_ms[:,:,k], -np.inf)
    im_nan = np.isnan(im_ms[:,:,k])
    cloud_mask = np.logical_or(np.logical_or(cloud_mask, im_inf), im_nan)
    im_nodata = np.logical_or(np.logical_or(im_nodata, im_inf), im_nan)
# check if there are pixels with 0 intensity in the Green, NIR and SWIR bands and add those
# to the cloud mask as otherwise they will cause errors when calculating the NDWI and MNDWI
im_zeros = np.ones(cloud_mask.shape).astype(bool)
for k in [1,3,4]: # loop through the Green, NIR and SWIR bands
im_zeros = np.logical_and(np.isin(im_ms[:,:,k],0), im_zeros)
# update cloud mask and nodata
cloud_mask = np.logical_or(im_zeros, cloud_mask)
im_nodata = np.logical_or(im_zeros, im_nodata)
# the extra image is the 20m SWIR band
im_extra = im20
return im_ms, georef, cloud_mask, im_extra, im_QA, im_nodata
def create_jpg(im_ms, cloud_mask, date, satname, filepath):
"""
Saves a .jpg file with the RGB image as well as the NIR and SWIR1 grayscale images.
This function can be modified to obtain different visualisations of the multispectral images.
KV WRL 2018
Arguments:
@ -499,18 +524,18 @@ def create_jpg(im_ms, cloud_mask, date, satname, filepath):
String containing the date at which the image was acquired
satname: str
name of the satellite mission (e.g., 'L5')
Returns:
-----------
Saves a .jpg image corresponding to the preprocessed satellite image
"""
# rescale image intensity for display purposes
im_RGB = rescale_image_intensity(im_ms[:,:,[2,1,0]], cloud_mask, 99.9)
# im_NIR = rescale_image_intensity(im_ms[:,:,3], cloud_mask, 99.9)
# im_SWIR = rescale_image_intensity(im_ms[:,:,4], cloud_mask, 99.9)
# make figure (just RGB)
fig = plt.figure()
fig.set_size_inches([18,9])
@ -519,7 +544,7 @@ def create_jpg(im_ms, cloud_mask, date, satname, filepath):
ax1.axis('off')
ax1.imshow(im_RGB)
ax1.set_title(date + ' ' + satname, fontsize=16)
# if im_RGB.shape[1] > 2*im_RGB.shape[0]:
# ax1 = fig.add_subplot(311)
# ax2 = fig.add_subplot(312)
@ -527,7 +552,7 @@ def create_jpg(im_ms, cloud_mask, date, satname, filepath):
# else:
# ax1 = fig.add_subplot(131)
# ax2 = fig.add_subplot(132)
# ax3 = fig.add_subplot(133)
# # RGB
# ax1.axis('off')
# ax1.imshow(im_RGB)
@ -540,18 +565,18 @@ def create_jpg(im_ms, cloud_mask, date, satname, filepath):
# ax3.axis('off')
# ax3.imshow(im_SWIR, cmap='seismic')
# ax3.set_title('Short-wave Infrared', fontsize=16)
# save figure
plt.rcParams['savefig.jpeg_quality'] = 100
fig.savefig(os.path.join(filepath,
            date + '_' + satname + '.jpg'), dpi=150)
plt.close()
def save_jpg(metadata, settings):
"""
Saves a .jpg image for all the images contained in metadata.
KV WRL 2018
Arguments:
@ -566,27 +591,27 @@ def save_jpg(metadata, settings):
name of the site (also name of the folder where the images are stored)
cloud_mask_issue: boolean
True if there is an issue with the cloud mask and sand pixels are being masked on the images
Returns:
-----------
"""
sitename = settings['inputs']['sitename']
cloud_thresh = settings['cloud_thresh']
filepath_data = settings['inputs']['filepath']
# create subfolder to store the jpg files
filepath_jpg = os.path.join(filepath_data, sitename, 'jpg_files', 'preprocessed')
if not os.path.exists(filepath_jpg):
    os.makedirs(filepath_jpg)
# loop through satellite list
for satname in metadata.keys():
filepath = SDS_tools.get_filepath(settings['inputs'],satname)
filenames = metadata[satname]['filenames']
# loop through images
for i in range(len(filenames)):
# image filename
@ -602,17 +627,17 @@ def save_jpg(metadata, settings):
# save .jpg with date and satellite in the title
date = filenames[i][:19]
create_jpg(im_ms, cloud_mask, date, satname, filepath_jpg)
# print the location where the images have been saved
print('Satellite images saved as .jpg in ' + os.path.join(filepath_data, sitename,
'jpg_files', 'preprocessed'))
def get_reference_sl(metadata, settings):
"""
Allows the user to manually digitize a reference shoreline that is used to seed the shoreline
detection algorithm. The reference shoreline helps to detect the outliers, making the shoreline
detection more robust.
KV WRL 2018
Arguments:
@ -627,17 +652,17 @@ def get_reference_sl(metadata, settings):
name of the site (also name of the folder where the images are stored)
'output_epsg': int
epsg code of the desired spatial reference system
Returns:
-----------
reference_shoreline: np.array
coordinates of the reference shoreline that was manually digitized
"""
sitename = settings['inputs']['sitename']
filepath_data = settings['inputs']['filepath']
# check if reference shoreline already exists in the corresponding folder
filepath = os.path.join(filepath_data, sitename)
filename = sitename + '_reference_shoreline.pkl'
@ -646,7 +671,7 @@ def get_reference_sl(metadata, settings):
with open(os.path.join(filepath, sitename + '_reference_shoreline.pkl'), 'rb') as f:
refsl = pickle.load(f)
return refsl
else:
# first try to use S2 images (10m res for manually digitizing the reference shoreline)
if 'S2' in metadata.keys():
@ -657,62 +682,62 @@ def get_reference_sl(metadata, settings):
elif not 'S2' in metadata.keys() and 'L8' in metadata.keys():
satname = 'L8'
filepath = SDS_tools.get_filepath(settings['inputs'],satname)
filenames = metadata[satname]['filenames']
# if no S2 images and no L8, use L5 images (L7 images have black diagonal bands making it
# hard to manually digitize a shoreline)
elif not 'S2' in metadata.keys() and not 'L8' in metadata.keys() and 'L5' in metadata.keys():
satname = 'L5'
filepath = SDS_tools.get_filepath(settings['inputs'],satname)
filenames = metadata[satname]['filenames']
else:
raise Exception('You cannot digitize the shoreline on L7 images, add another L8, S2 or L5 to your dataset.')
# loop through the images
for i in range(len(filenames)):
# read image
fn = SDS_tools.get_filenames(filenames[i],filepath, satname)
im_ms, georef, cloud_mask, im_extra, im_QA, im_nodata = preprocess_single(fn, satname, settings['cloud_mask_issue'])
# calculate cloud cover
cloud_cover = np.divide(sum(sum(cloud_mask.astype(int))),
(cloud_mask.shape[0]*cloud_mask.shape[1]))
# skip image if cloud cover is above threshold
if cloud_cover > settings['cloud_thresh']:
continue
# rescale image intensity for display purposes
im_RGB = rescale_image_intensity(im_ms[:,:,[2,1,0]], cloud_mask, 99.9)
# plot the image RGB on a figure
fig = plt.figure()
fig.set_size_inches([18,9])
fig.set_tight_layout(True)
plt.axis('off')
plt.imshow(im_RGB)
# decide if the image is good enough for digitizing the shoreline
plt.title('click <keep> if image is clear enough to digitize the shoreline.\n' +
'If not (too cloudy) click on <skip> to get another image', fontsize=14)
keep_button = plt.text(0, 0.9, 'keep', size=16, ha="left", va="top",
                       transform=plt.gca().transAxes,
                       bbox=dict(boxstyle="square", ec='k', fc='w'))
skip_button = plt.text(1, 0.9, 'skip', size=16, ha="right", va="top",
                       transform=plt.gca().transAxes,
                       bbox=dict(boxstyle="square", ec='k', fc='w'))
mng = plt.get_current_fig_manager()
mng.window.showMaximized()
# let user click on the image once
pt_input = ginput(n=1, timeout=1e9, show_clicks=False)
pt_input = np.array(pt_input)
# if clicks next to <skip>, show another image
if pt_input[0][0] > im_ms.shape[1]/2:
plt.close()
continue
else:
# remove keep and skip buttons
keep_button.set_visible(False)
@ -720,29 +745,29 @@ def get_reference_sl(metadata, settings):
# create two new buttons
add_button = plt.text(0, 0.9, 'add', size=16, ha="left", va="top",
                      transform=plt.gca().transAxes,
                      bbox=dict(boxstyle="square", ec='k', fc='w'))
end_button = plt.text(1, 0.9, 'end', size=16, ha="right", va="top",
                      transform=plt.gca().transAxes,
                      bbox=dict(boxstyle="square", ec='k', fc='w'))
# add multiple reference shorelines (until user clicks on <end> button)
pts_sl = np.expand_dims(np.array([np.nan, np.nan]),axis=0)
geoms = []
while 1:
add_button.set_visible(False)
end_button.set_visible(False)
# update title (instructions)
plt.title('Click points along the shoreline (enough points to capture the beach curvature).\n' +
          'Start at one end of the beach.\n' + 'When finished digitizing, click <ENTER>',
          fontsize=14)
plt.draw()
# let user click on the shoreline
pts = ginput(n=50000, timeout=1e9, show_clicks=True)
pts_pix = np.array(pts)
# convert pixel coordinates to world coordinates
pts_world = SDS_tools.convert_pix2world(pts_pix[:,[1,0]], georef)
# interpolate between points clicked by the user (1m resolution)
pts_world_interp = np.expand_dims(np.array([np.nan, np.nan]),axis=0)
for k in range(len(pts_world)-1):
@ -757,50 +782,50 @@ def get_reference_sl(metadata, settings):
deltay = pts_world[k+1,1] - pts_world[k,1]
phi = np.pi/2 - np.math.atan2(deltax, deltay)
tf = transform.EuclideanTransform(rotation=phi, translation=pts_world[k,:])
pts_world_interp = np.append(pts_world_interp,tf(pt_coords), axis=0)
pts_world_interp = np.delete(pts_world_interp,0,axis=0)
# save as geometry (to create .geojson file later)
geoms.append(geometry.LineString(pts_world_interp))
# convert to pixel coordinates and plot
pts_pix_interp = SDS_tools.convert_world2pix(pts_world_interp, georef)
pts_sl = np.append(pts_sl, pts_world_interp, axis=0)
plt.plot(pts_pix_interp[:,0], pts_pix_interp[:,1], 'r--')
plt.plot(pts_pix_interp[0,0], pts_pix_interp[0,1],'ko')
plt.plot(pts_pix_interp[-1,0], pts_pix_interp[-1,1],'ko')
# update title and buttons
add_button.set_visible(True)
end_button.set_visible(True)
plt.title('click <add> to digitize another shoreline or <end> to finish and save the shoreline(s)',
          fontsize=14)
plt.draw()
# let the user click again (<add> another shoreline or <end>)
pt_input = ginput(n=1, timeout=1e9, show_clicks=False)
pt_input = np.array(pt_input)
# if user clicks on <end>, save the points and break the loop
if pt_input[0][0] > im_ms.shape[1]/2:
    add_button.set_visible(False)
    end_button.set_visible(False)
    plt.title('Reference shoreline saved as ' + sitename + '_reference_shoreline.pkl and ' + sitename + '_reference_shoreline.geojson')
    plt.draw()
    ginput(n=1, timeout=3, show_clicks=False)
    plt.close()
    break
pts_sl = np.delete(pts_sl,0,axis=0)
# convert world image coordinates to user-defined coordinate system
image_epsg = metadata[satname]['epsg'][i]
pts_coords = SDS_tools.convert_epsg(pts_sl, image_epsg, settings['output_epsg'])
# save the reference shoreline as .pkl
filepath = os.path.join(filepath_data, sitename)
with open(os.path.join(filepath, sitename + '_reference_shoreline.pkl'), 'wb') as f:
pickle.dump(pts_coords, f)
# also store as .geojson in case user wants to drag-and-drop on GIS for verification
for k,line in enumerate(geoms):
gdf = gpd.GeoDataFrame(geometry=gpd.GeoSeries(line))
@ -814,11 +839,11 @@ def get_reference_sl(metadata, settings):
gdf_all.crs = {'init':'epsg:'+str(image_epsg)}
# convert from image_epsg to user-defined coordinate system
gdf_all = gdf_all.to_crs({'init': 'epsg:'+str(settings['output_epsg'])})
# save as geojson
gdf_all.to_file(os.path.join(filepath, sitename + '_reference_shoreline.geojson'),
driver='GeoJSON', encoding='utf-8')
print('Reference shoreline has been saved in ' + filepath)
break
return pts_coords

@ -160,7 +160,7 @@ def classify_image_NN(im_ms, im_extra, cloud_mask, min_beach_area, clf):
###################################################################################################
# CONTOUR MAPPING FUNCTIONS
###################################################################################################
def find_wl_contours1(im_ndwi, cloud_mask, im_ref_buffer):
"""
Traditional method for shoreline detection.
@ -315,7 +315,7 @@ def find_wl_contours2(im_ms, im_labels, cloud_mask, buffer_size, im_ref_buffer):
###################################################################################################
# SHORELINE PROCESSING FUNCTIONS
###################################################################################################
def create_shoreline_buffer(im_shape, georef, image_epsg, pixel_size, settings):
"""
Creates a buffer around the reference shoreline. The size of the buffer is given by
@ -339,36 +339,36 @@ def create_shoreline_buffer(im_shape, georef, image_epsg, pixel_size, settings):
output spatial reference system
reference_shoreline: np.array
coordinates of the reference shoreline
max_dist_ref: int
maximum distance from the reference shoreline in metres
Returns:
-----------
im_buffer: np.array
binary image, True where the buffer is, False otherwise
"""
"""
# initialise the image buffer
im_buffer = np.ones(im_shape).astype(bool)
if 'reference_shoreline' in settings.keys():
    # convert reference shoreline to pixel coordinates
    ref_sl = settings['reference_shoreline']
    ref_sl_conv = SDS_tools.convert_epsg(ref_sl, settings['output_epsg'], image_epsg)[:,:-1]
    ref_sl_pix = SDS_tools.convert_world2pix(ref_sl_conv, georef)
    ref_sl_pix_rounded = np.round(ref_sl_pix).astype(int)
    # create binary image of the reference shoreline (1 where the shoreline is, 0 otherwise)
    im_binary = np.zeros(im_shape)
    for j in range(len(ref_sl_pix_rounded)):
        im_binary[ref_sl_pix_rounded[j,1], ref_sl_pix_rounded[j,0]] = 1
    im_binary = im_binary.astype(bool)
    # dilate the binary image to create a buffer around the reference shoreline
    max_dist_ref_pixels = np.ceil(settings['max_dist_ref']/pixel_size)
    se = morphology.disk(max_dist_ref_pixels)
    im_buffer = morphology.binary_dilation(im_binary, se)
return im_buffer
def process_shoreline(contours, cloud_mask, georef, image_epsg, settings):
@ -396,7 +396,7 @@ def process_shoreline(contours, cloud_mask, georef, image_epsg, settings):
min_length_sl: float
minimum length of shoreline perimeter to be kept (in meters)
Returns:
-----------
shoreline: np.array
array of points with the X and Y coordinates of the shoreline
@ -422,9 +422,9 @@ def process_shoreline(contours, cloud_mask, georef, image_epsg, settings):
x_points = np.append(x_points,contours_long[k][:,0])
y_points = np.append(y_points,contours_long[k][:,1])
contours_array = np.transpose(np.array([x_points,y_points]))
shoreline = contours_array
# now remove any shoreline points that are attached to cloud pixels
if sum(sum(cloud_mask)) > 0:
# get the coordinates of the cloud pixels
@ -437,8 +437,8 @@ def process_shoreline(contours, cloud_mask, georef, image_epsg, settings):
idx_keep = np.ones(len(shoreline)).astype(bool)
for k in range(len(shoreline)):
    if np.any(np.linalg.norm(shoreline[k,:] - coords_cloud, axis=1) < 30):
        idx_keep[k] = False
shoreline = shoreline[idx_keep]
return shoreline
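The point-by-point distance test above is O(n·m) in shoreline and cloud pixels; for long shorelines a KD-tree performs the same 30 m nearest-cloud check much faster. A sketch with made-up coordinates, assuming both arrays are in the same projected CRS (metres):

```python
import numpy as np
from scipy.spatial import cKDTree

shoreline = np.array([[0.0, 0.0], [50.0, 0.0], [100.0, 0.0]])  # toy points (m)
coords_cloud = np.array([[45.0, 10.0]])                        # toy cloud pixel (m)

tree = cKDTree(coords_cloud)
dist, _ = tree.query(shoreline)    # distance from each point to the nearest cloud
shoreline = shoreline[dist >= 30]  # drop points within 30 m of a cloudy pixel
```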
@ -471,7 +471,7 @@ def show_detection(im_ms, cloud_mask, im_labels, shoreline,image_epsg, georef,
satname: string
indicates the satname (L5,L7,L8 or S2)
Returns:
-----------
skip_image: boolean
True if the user wants to skip the image, False otherwise.
@ -524,7 +524,7 @@ def show_detection(im_ms, cloud_mask, im_labels, shoreline,image_epsg, georef,
mng = plt.get_current_fig_manager()
mng.window.showMaximized()
# according to the image shape, decide whether it is better to have the images
# in vertical subplots or horizontal subplots
if im_RGB.shape[1] > 2*im_RGB.shape[0]:
# vertical subplots
@ -581,7 +581,7 @@ def show_detection(im_ms, cloud_mask, im_labels, shoreline,image_epsg, georef,
# if check_detection is True, let user manually accept/reject the images
skip_image = False
if settings['check_detection']:
# set a key event to accept/reject the detections (see https://stackoverflow.com/a/15033071)
# this variable needs to be mutable (a dict) so the keypress callback can write to it and it can be read after the event
key_event = {}
@ -607,7 +607,7 @@ def show_detection(im_ms, cloud_mask, im_labels, shoreline,image_epsg, georef,
btn_skip.remove()
btn_keep.remove()
btn_esc.remove()
# keep/skip image according to the pressed key, 'escape' to break the loop
if key_event.get('pressed') == 'right':
skip_image = False
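A stripped-down sketch of the keypress pattern used here (not the toolbox's actual figure handling): the dict is shared between the callback and the surrounding code, which reads it once `plt.waitforbuttonpress()` returns.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.set_title("press 'right' to keep, 'left' to skip")
key_event = {}   # mutable container shared with the callback

def on_press(event):
    key_event['pressed'] = event.key   # the callback writes into the dict

fig.canvas.mpl_connect('key_press_event', on_press)
plt.waitforbuttonpress()   # blocks until a key (or mouse button) is pressed
skip_image = key_event.get('pressed') == 'left'
```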
@ -623,7 +623,7 @@ def show_detection(im_ms, cloud_mask, im_labels, shoreline,image_epsg, georef,
# if save_figure is True, save a .jpg under /jpg_files/detection
if settings['save_figure'] and not skip_image:
fig.savefig(os.path.join(filepath, date + '_' + satname + '.jpg'), dpi=200)
fig.savefig(os.path.join(filepath, date + '_' + satname + '.jpg'), dpi=150)
# Don't close the figure window, but remove all axes and settings, ready for next plot
for ax in fig.axes:
@ -671,7 +671,7 @@ def extract_shorelines(metadata, settings):
sitename = settings['inputs']['sitename']
filepath_data = settings['inputs']['filepath']
# initialise output structure
output = dict([])
# create a subfolder to store the .jpg images showing the detection
filepath_jpg = os.path.join(filepath_data, sitename, 'jpg_files', 'detection')
if not os.path.exists(filepath_jpg):
@ -698,20 +698,22 @@ def extract_shorelines(metadata, settings):
# load classifiers and set the pixel size
if satname in ['L5','L7','L8']:
pixel_size = 15
if settings['dark_sand']:
if settings['sand_color'] == 'dark':
clf = joblib.load(os.path.join(os.getcwd(), 'classifiers', 'NN_4classes_Landsat_dark.pkl'))
elif settings['sand_color'] == 'bright':
clf = joblib.load(os.path.join(os.getcwd(), 'classifiers', 'NN_4classes_Landsat_bright.pkl'))
else:
clf = joblib.load(os.path.join(os.getcwd(), 'classifiers', 'NN_4classes_Landsat.pkl'))
elif satname == 'S2':
pixel_size = 10
clf = joblib.load(os.path.join(os.getcwd(), 'classifiers', 'NN_4classes_S2.pkl'))
# convert settings['min_beach_area'] and settings['buffer_size'] from metres to pixels
buffer_size_pixels = np.ceil(settings['buffer_size']/pixel_size)
min_beach_area_pixels = np.ceil(settings['min_beach_area']/pixel_size**2)
# loop through the images
for i in range(len(filenames)):
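The two conversions differ because `buffer_size` is a length while `min_beach_area` is an area. Illustrative values (the 4500 m² figure is an assumption for the example, not taken from this diff):

```python
import numpy as np

pixel_size = 15          # metres per pixel (Landsat bands, pan-sharpened)
buffer_size = 150        # metres: a length, so divide by pixel_size
min_beach_area = 4500    # square metres: an area, so divide by pixel_size**2

buffer_size_pixels = np.ceil(buffer_size / pixel_size)            # -> 10.0
min_beach_area_pixels = np.ceil(min_beach_area / pixel_size**2)   # -> 20.0
```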
@ -723,21 +725,33 @@ def extract_shorelines(metadata, settings):
im_ms, georef, cloud_mask, im_extra, im_QA, im_nodata = SDS_preprocess.preprocess_single(fn, satname, settings['cloud_mask_issue'])
# get image spatial reference system (epsg code) from metadata dict
image_epsg = metadata[satname]['epsg'][i]
# define an advanced cloud mask (for L7 it takes into account the fact that diagonal
# bands of no data are not clouds)
if satname != 'L7' or sum(sum(im_nodata)) == 0 or sum(sum(im_nodata)) > 0.5*im_nodata.size:
cloud_mask_adv = cloud_mask
else:
cloud_mask_adv = np.logical_xor(cloud_mask, im_nodata)
# calculate cloud cover
cloud_cover = np.divide(sum(sum(cloud_mask.astype(int))),
cloud_cover = np.divide(sum(sum(cloud_mask_adv.astype(int))),
(cloud_mask.shape[0]*cloud_mask.shape[1]))
# skip image if cloud cover is above threshold
if cloud_cover > settings['cloud_thresh']:
continue
# classify image in 4 classes (sand, whitewater, water, other) with NN classifier
im_classif, im_labels = classify_image_NN(im_ms, im_extra, cloud_mask,
min_beach_area_pixels, clf)
# calculate a buffer around the reference shoreline (if any has been digitised)
im_ref_buffer = create_shoreline_buffer(cloud_mask.shape, georef, image_epsg,
pixel_size, settings)
# when running the automated mode, skip image if cloudy pixels are found in the shoreline buffer
if not settings['check_detection'] and 'reference_shoreline' in settings.keys():
if sum(sum(np.logical_and(im_ref_buffer, cloud_mask_adv))) > 0:
continue
# classify image in 4 classes (sand, whitewater, water, other) with NN classifier
im_classif, im_labels = classify_image_NN(im_ms, im_extra, cloud_mask,
min_beach_area_pixels, clf)
# there are two options to map the contours:
# if there are pixels in the 'sand' class --> use find_wl_contours2 (enhanced)
# otherwise use find_wl_contours1 (traditional)
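A toy illustration of the advanced cloud-mask step above: for Landsat 7, no-data pixels (the SLC-off diagonal stripes) are stripped out of the cloud mask with a logical XOR before the cloud-cover fraction is computed. This assumes, as in the code, that no-data pixels are also flagged in `cloud_mask`:

```python
import numpy as np

# toy 2x2 masks; the no-data pixel is assumed to also be flagged as cloud
cloud_mask = np.array([[True, True], [False, False]])
im_nodata  = np.array([[True, False], [False, False]])

cloud_mask_adv = np.logical_xor(cloud_mask, im_nodata)  # strips no-data pixels
cloud_cover = cloud_mask_adv.sum() / cloud_mask.size    # 0.25, not 0.5
```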
@ -756,7 +770,7 @@ def extract_shorelines(metadata, settings):
continue
# process the water contours into a shoreline
shoreline = process_shoreline(contours_mwi, cloud_mask, georef, image_epsg, settings)
# visualise the mapped shorelines, there are two options:
# if settings['check_detection'] = True, shows the detection to the user for accept/reject
@ -804,7 +818,7 @@ def extract_shorelines(metadata, settings):
gdf = SDS_tools.output_to_gdf(output)
# set projection
gdf.crs = {'init':'epsg:'+str(settings['output_epsg'])}
# save as geojson
gdf.to_file(os.path.join(filepath, sitename + '_output.geojson'), driver='GeoJSON', encoding='utf-8')
return output
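Note that `{'init': 'epsg:...'}` is the legacy pyproj syntax; it works here but is deprecated in recent geopandas/pyproj versions. A hedged sketch of the modern equivalent (the EPSG code is chosen arbitrarily for the example):

```python
import geopandas as gpd
from shapely.geometry import LineString

gdf = gpd.GeoDataFrame(geometry=[LineString([(0, 0), (1, 1)])])
gdf = gdf.set_crs(epsg=28356)   # e.g. GDA94 / MGA zone 56 (assumed output_epsg)
gdf.to_file('output.geojson', driver='GeoJSON')
```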

@ -464,23 +464,26 @@ def output_to_gdf(output):
"""
# loop through the mapped shorelines
counter = 0
for i in range(len(output['shorelines'])):
# skip if the shoreline is empty
if len(output['shorelines'][i]) == 0:
continue
# save the geometry + attributes
geom = geometry.LineString(output['shorelines'][i])
gdf = gpd.GeoDataFrame(geometry=gpd.GeoSeries(geom))
gdf.index = [i]
gdf.loc[i,'date'] = output['dates'][i].strftime('%Y-%m-%d %H:%M:%S')
gdf.loc[i,'satname'] = output['satname'][i]
gdf.loc[i,'geoaccuracy'] = output['geoaccuracy'][i]
gdf.loc[i,'cloud_cover'] = output['cloud_cover'][i]
# store into geodataframe
if i == 0:
gdf_all = gdf
else:
gdf_all = gdf_all.append(gdf)
# save the geometry + attributes
geom = geometry.LineString(output['shorelines'][i])
gdf = gpd.GeoDataFrame(geometry=gpd.GeoSeries(geom))
gdf.index = [i]
gdf.loc[i,'date'] = output['dates'][i].strftime('%Y-%m-%d %H:%M:%S')
gdf.loc[i,'satname'] = output['satname'][i]
gdf.loc[i,'geoaccuracy'] = output['geoaccuracy'][i]
gdf.loc[i,'cloud_cover'] = output['cloud_cover'][i]
# store into geodataframe
if counter == 0:
gdf_all = gdf
else:
gdf_all = gdf_all.append(gdf)
counter = counter + 1
return gdf_all
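The switch from `i` to `counter` fixes a latent bug: if the first shoreline in `output` is empty, the `i == 0` branch never runs and `gdf_all` is never initialised, so the first `append` raises a `NameError`. A minimal sketch of the corrected pattern with toy data:

```python
import geopandas as gpd
from shapely import geometry

shorelines = [[], [(0, 0), (1, 1)], [(2, 2), (3, 3)]]   # first entry is empty
counter = 0
for i, sl in enumerate(shorelines):
    if len(sl) == 0:
        continue   # with `if i == 0` this skip would leave gdf_all undefined
    gdf = gpd.GeoDataFrame(geometry=gpd.GeoSeries(geometry.LineString(sl)))
    gdf.index = [i]
    if counter == 0:
        gdf_all = gdf
    else:
        gdf_all = gdf_all.append(gdf)  # .append matches the code above
                                       # (deprecated in recent geopandas)
    counter += 1
```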

@ -71,7 +71,7 @@ settings = {
'buffer_size': 150, # radius (in metres) of the buffer around sandy pixels considered in the shoreline detection
'min_length_sl': 200, # minimum length (in metres) of shoreline perimeter to be valid
'cloud_mask_issue': False, # switch this parameter to True if sand pixels are masked (in black) on many images
'dark_sand': False, # only switch to True if your site has dark sand (e.g. black sand beach)
'sand_color': 'default', # 'default', 'dark' (for grey/black sand beaches) or 'bright' (for white sand beaches)
}
# [OPTIONAL] preprocess images (cloud masking, pansharpening/down-sampling)

@ -6,14 +6,16 @@
"source": [
"# *CoastSat*: example at Narrabeen-Collaroy, Australia\n",
"\n",
"This software is described in *Vos K., Splinter K.D., Harley M.D., Simmons J.A., Turner I.L. (submitted). CoastSat: a Google Earth Engine-enabled software to extract shorelines from publicly available satellite imagery, Environmental Modelling and Software*. It enables the users to extract time-series of shoreline change over the last 30+ years at their site of interest.\n",
"This software is described in details in:\n",
"* Vos K., Splinter K.D., Harley M.D., Simmons J.A., Turner I.L. (2019). CoastSat: a Google Earth Engine-enabled Python toolkit to extract shorelines from publicly available satellite imagery. Environmental Modelling and Software. 122, 104528. https://doi.org/10.1016/j.envsoft.2019.104528\n",
"\n",
"It enables the users to extract time-series of shoreline change over the last 30+ years at their site of interest.\n",
"There are three main steps:\n",
"- retrieval of the satellite images of the region of interest from Google Earth Engine\n",
"- extraction of the shorelines from the images using a sub-pixel resolution technique\n",
"- intersection of the 2D shorelines with shore-normal transects\n",
"1. Retrieval of the satellite images of the region of interest from Google Earth Engine\n",
"2. Shoreline extraction at sub-pixel resolution\n",
"3. Intersection of the shorelines with cross-shore transects\n",
"\n",
"## 1. Initial settings\n",
"## Initial settings\n",
"\n",
"Refer to the **Installation** section of the README for instructions on how to install the Python packages necessary to run the software, including Google Earth Engine Python API. If that step has been completed correctly, the following packages should be imported without any problem."
]
@ -37,7 +39,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Retrieval of the images from GEE\n",
"## 1. Retrieval of the images from GEE\n",
"\n",
"Define the region of interest (`polygon`), the date range (`dates`) and the satellite missions (`sat_list`) from which you wish to retrieve the satellite images. The images will be cropped on the Google Earth Engine server and only the region of interest will be downloaded as a .tif file. The files will stored in the directory defined in `filepath`.\n",
"\n",
@ -104,7 +106,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3. Shoreline extraction\n",
"## 2. Shoreline extraction\n",
"\n",
"This section maps the position of the shoreline on the satellite images. The user can define the cloud threhold (`cloud_thresh`) and select the spatial reference system in which to output the coordinates of the mapped shorelines (`output_epsg`). See http://spatialreference.org/ to find the EPSG number corresponding to your local coordinate system. Make sure that your are using cartesian coordinates and not spherical coordinates (lat,lon) like WGS84. \n",
"\n",
@ -133,7 +135,7 @@
" 'buffer_size': 150, # radius (in metres) of the buffer around sandy pixels considered in the shoreline detection\n",
" 'min_length_sl': 200, # minimum length (in metres) of shoreline perimeter to be valid\n",
" 'cloud_mask_issue': False, # switch this parameter to True if sand pixels are masked (in black) on many images \n",
" 'dark_sand': False, # only switch to True if your site has dark sand (e.g. black sand beach)\n",
" 'sand_color': 'default', # 'default', 'dark' (for grey/black sand beaches) or 'bright' (for white sand beaches)\n",
"}"
]
},
@ -178,7 +180,9 @@
"metadata": {},
"source": [
"### Batch shoreline detection\n",
"Extracts the 2D shorelines from the images in the spatial reference system specified by the user in `'output_epsg'`. The mapped shorelines are saved into `output.pkl` (under *./data/sitename*) and `output.geojson` to use in GIS softwares."
"Extracts the 2D shorelines from the images in the spatial reference system specified by the user in `'output_epsg'`. The mapped shorelines are saved into `output.pkl` (under *./data/sitename*) and `output.geojson` (to be used in a GIS software).\n",
"\n",
"If you see that the sand pixels on the images are not being identified, change the parameter `sand_color` from `default` to `dark` or `bright` depending on the color of your beach. "
]
},
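To work with the saved shorelines afterwards, the pickle can simply be reloaded; the path below assumes the default *./data/sitename* layout mentioned above and a hypothetical site name:

```python
import os
import pickle

sitename = 'EXAMPLE_SITE'   # hypothetical
pkl_file = os.path.join('data', sitename, sitename + '_output.pkl')  # assumed layout
with open(pkl_file, 'rb') as f:
    output = pickle.load(f)
print(output.keys())   # e.g. dates, shorelines, satname, geoaccuracy, cloud_cover
```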
{
@ -225,7 +229,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Shoreline analysis\n",
"## 3. Shoreline analysis\n",
"\n",
"In this section we show how to compute time-series of cross-shore distance along user-defined shore-normal transects."
]
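The toolbox computes these intersections for the user; the sketch below shows one simplified way a cross-shore distance can be read off a single transect (the median chainage of nearby shoreline points), under assumed coordinates in a projected CRS. It is not the toolbox's exact method:

```python
import numpy as np

def cross_shore_distance(shoreline, transect, max_dist=25):
    """Median along-transect distance of shoreline points lying within
    max_dist metres of the transect line (simplified sketch)."""
    origin = transect[0]
    d = transect[1] - transect[0]
    d = d / np.linalg.norm(d)                           # unit vector along transect
    rel = shoreline - origin
    along = rel @ d                                     # chainage along the transect
    perp = np.abs(rel[:, 0] * d[1] - rel[:, 1] * d[0])  # offset from the line
    keep = perp <= max_dist
    return np.median(along[keep]) if keep.any() else np.nan

# toy usage, metres in a projected CRS (all coordinates made up)
shoreline = np.array([[10.0, 2.0], [12.0, -1.0], [300.0, 400.0]])
transect = np.array([[0.0, 0.0], [100.0, 0.0]])   # landward -> seaward
print(cross_shore_distance(shoreline, transect))  # -> 11.0
```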
@ -371,7 +375,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
"version": "3.6.7"
},
"varInspector": {
"cols": {

Binary file not shown.
