Merge branch 'develop'

master
Chris Leaman 6 years ago
commit 2a27f4e535

@ -1,30 +1,56 @@
# 2016 Narrabeen Storm EWS Performance
This repository investigates whether the storm impacts (i.e. Sallenger, 2000) of the June 2016 Narrabeen Storm could have been forecasted in advance.
## Repository and analysis format
This repository follows the [Cookiecutter Data Science](https://drivendata.github.io/cookiecutter-data-science/)
structure where possible. The analysis is done in Python (see the `/src/` folder), with some interactive, exploratory notebooks located at `/notebooks`.
Development is conducted using a [gitflow](https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow) approach: the `master` branch stores the official release history and the `develop` branch serves as an integration branch for features. Other `hotfix` and `feature` branches should be created and merged as necessary.
## Where to start?
1. Clone this repository.
2. Pull data from WRL coastal J drive with `make pull-data`
3. Check out the Jupyter notebook `./notebooks/01_exploration.ipynb`, which has an example of how to import the data and some interactive widgets.
## Requirements
The following requirements are needed to run various bits:
- [Python 3.6+](https://conda.io/docs/user-guide/install/windows.html): Used for processing and analysing data. Jupyter notebooks are used for exploratory analysis and communication.
- [QGIS](https://www.qgis.org/en/site/forusers/download): Used for looking at raw LIDAR pre/post storm surveys and extracting dune crests/toes.
- [rclone](https://rclone.org/downloads/): Data is not tracked by this repository, but is backed up to a remote Chris Leaman working directory located on the WRL coastal drive. rclone is used to sync local and remote copies. Ensure rclone.exe is located on your `PATH`.
- [gnuMake](http://gnuwin32.sourceforge.net/packages/make.htm): A list of commands for processing data is provided in the `./Makefile`. Use gnuMake to launch these commands. Ensure make.exe is located on your `PATH`.
## Available data
Raw, interim and processed data used in this analysis is kept in the `/data/` folder. Data is not tracked in the repository due to size constraints, but is stored locally. A mirror of the WRL coastal J drive folder is kept, which you can push to and pull from using rclone. To get the data, run `make pull-data`.
List of data:
- `/data/raw/processed_shorelines`: This data was received from Tom Beuzen in October 2018. It consists of pre/post storm profiles at 100 m sections along beaches ranging from Dee Why to Nambucca. Profiles are based on raw aerial LIDAR and were processed by Mitch Harley. Tides and waves (10 m contour and reverse shoaled deepwater) for each individual 100 m section are also provided.
- `/data/raw/raw_lidar`: This is the raw pre/post storm aerial LIDAR which was taken for the June 2016 storm. `.las` files are the raw files, which have been processed into `.tiff` files using `PDAL`. Note that these files have not been corrected for systematic errors, so actual elevations should be taken from the `processed_shorelines` folder. Obtained November 2018 from Mitch Harley, from the black external HDD labeled "UNSW LIDAR".
- `/data/raw/profile_features`: Dune toe and crest locations based on prestorm LIDAR. Refer to `/notebooks/qgis.qgz`, which shows how they were manually extracted. Note that the shapefiles only show the location (lat/lon) of the dune crest and toe. For actual elevations, these locations need to be related to the processed shorelines.
## Notebooks
- `/notebooks/01_exploration.ipynb`: Shows how to import processed shorelines, waves and tides. An interactive widget plots the location and cross sections.
- `/notebooks/qgis.qgz`: A QGIS file which is used to explore the aerial LIDAR data in `/data/raw/raw_lidar`. By examining the pre-storm LIDAR, dune crest and dune toe lines are manually extracted. These are stored in `/data/profile_features/`.
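A minimal sketch of loading the interim files with pandas (assuming `make pull-data` has been run and the `/src/` scripts below have generated the interim `.csv` files; the `site_id` shown is hypothetical):

```python
import pandas as pd

# Profiles are indexed by (site_id, profile_type, x); profile features by site_id.
df_profiles = pd.read_csv('./data/interim/profiles.csv', index_col=[0, 1, 2])
df_profile_features = pd.read_csv('./data/interim/profile_features.csv', index_col=[0])

# Example: pull out the prestorm cross section for a single site.
# 'NARRA0001' is a hypothetical site_id used purely for illustration.
site_profile = df_profiles.xs(('NARRA0001', 'prestorm'), level=['site_id', 'profile_type'])
print(site_profile.head())
```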

File diff suppressed because one or more lines are too long

@ -0,0 +1,32 @@
"""
Compares forecasted and observed impacts, putting them into one data frame and exporting the results.
"""
import logging.config
import os
import pandas as pd
logging.config.fileConfig('./src/logging.conf', disable_existing_loggers=False)
logger = logging.getLogger(__name__)
def compare_impacts(df_forecasted, df_observed):
"""
Merge forecasted and observed storm impacts
:param df_forecasted:
:param df_observed:
:return:
"""
df_compared = df_forecasted.merge(df_observed, left_index=True, right_index=True,
suffixes=['_forecasted', '_observed'])
return df_compared
if __name__ == '__main__':
logger.info('Importing existing data')
data_folder = './data/interim'
df_forecasted = pd.read_csv(os.path.join(data_folder, 'impacts_forecasted_mean_slope_sto06.csv'), index_col=[0])
df_observed = pd.read_csv(os.path.join(data_folder, 'impacts_observed.csv'), index_col=[0])
df_compared = compare_impacts(df_forecasted, df_observed)
df_compared.to_csv(os.path.join(data_folder, 'impacts_observed_vs_forecasted_mean_slope_sto06.csv'))
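For context, a minimal standalone sketch of the index-aligned merge used above; the site IDs and regimes below are invented purely for illustration:

```python
import pandas as pd

# Toy forecasted and observed impact tables, both indexed by site_id
df_forecasted = pd.DataFrame({'storm_regime': ['swash', 'collision']},
                             index=pd.Index(['SITE_A', 'SITE_B'], name='site_id'))
df_observed = pd.DataFrame({'storm_regime': ['swash', 'swash']},
                           index=pd.Index(['SITE_A', 'SITE_B'], name='site_id'))

# Merging on the index with suffixes keeps both columns side by side:
# storm_regime_forecasted and storm_regime_observed
df_compared = df_forecasted.merge(df_observed, left_index=True, right_index=True,
                                  suffixes=['_forecasted', '_observed'])
print(df_compared)
```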

@ -0,0 +1,73 @@
"""
Estimates the forecasted storm impacts based on the forecasted water level and dune crest/toe.
"""
import logging.config
import os
import pandas as pd
logging.config.fileConfig('./src/logging.conf', disable_existing_loggers=False)
logger = logging.getLogger(__name__)
def forecasted_impacts(df_profile_features, df_forecasted_twl):
"""
Combines our profile features (containing dune toes and crests) with water levels, to get the forecasted storm
impacts.
:param df_profile_features: dataframe of dune toe and crest elevations, indexed by site_id
:param df_forecasted_twl: dataframe of forecasted total water levels, indexed by (site_id, datetime)
:return: dataframe of forecasted impacts, including the storm regime for each site
"""
logger.info('Getting forecasted storm impacts')
df_forecasted_impacts = pd.DataFrame(index=df_profile_features.index)
# For each site, find the maximum R_high value and the corresponding R_low value.
idx = df_forecasted_twl.groupby(level=['site_id'])['R_high'].idxmax().dropna()
df_r_vals = df_forecasted_twl.loc[idx, ['R_high', 'R_low']].reset_index(['datetime'])
df_forecasted_impacts = df_forecasted_impacts.merge(df_r_vals, how='left', left_index=True, right_index=True)
# Join with df_profile features to find dune toe and crest elevations
df_forecasted_impacts = df_forecasted_impacts.merge(df_profile_features[['dune_toe_z', 'dune_crest_z']],
how='left',
left_index=True,
right_index=True)
# Compare R_high and R_low with dune crest and toe elevations
df_forecasted_impacts = storm_regime(df_forecasted_impacts)
return df_forecasted_impacts
def storm_regime(df_forecasted_impacts):
"""
Returns the dataframe with an additional column of storm impacts based on the Storm Impact Scale. Refer to
Sallenger (2000) for details.
:param df_forecasted_impacts: dataframe containing R_high, R_low, dune_toe_z and dune_crest_z for each site
:return: the same dataframe with a 'storm_regime' column added
"""
logger.info('Getting forecasted storm regimes')
df_forecasted_impacts.loc[
df_forecasted_impacts.R_high <= df_forecasted_impacts.dune_toe_z, 'storm_regime'] = 'swash'
df_forecasted_impacts.loc[
df_forecasted_impacts.dune_toe_z <= df_forecasted_impacts.R_high, 'storm_regime'] = 'collision'
df_forecasted_impacts.loc[(df_forecasted_impacts.dune_crest_z <= df_forecasted_impacts.R_high) &
(df_forecasted_impacts.R_low <= df_forecasted_impacts.dune_crest_z),
'storm_regime'] = 'overwash'
df_forecasted_impacts.loc[(df_forecasted_impacts.dune_crest_z <= df_forecasted_impacts.R_low) &
(df_forecasted_impacts.dune_crest_z <= df_forecasted_impacts.R_high),
'storm_regime'] = 'inundation'
return df_forecasted_impacts
if __name__ == '__main__':
logger.info('Importing existing data')
data_folder = './data/interim'
df_profiles = pd.read_csv(os.path.join(data_folder, 'profiles.csv'), index_col=[0, 1, 2])
df_profile_features = pd.read_csv(os.path.join(data_folder, 'profile_features.csv'), index_col=[0])
df_forecasted_twl = pd.read_csv(os.path.join(data_folder, 'twl_mean_slope_sto06.csv'), index_col=[0, 1])
df_forecasted_impacts = forecasted_impacts(df_profile_features, df_forecasted_twl)
df_forecasted_impacts.to_csv(os.path.join(data_folder, 'impacts_forecasted_mean_slope_sto06.csv'))
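As a hedged, self-contained sketch of how the regime classification above behaves, using invented dune elevations and runup levels (not real data):

```python
import pandas as pd

# Invented dune toe/crest elevations and runup levels for four hypothetical sites
df = pd.DataFrame({
    'R_high':       [2.0, 4.0, 6.5, 7.5],
    'R_low':        [1.0, 2.5, 5.5, 6.5],
    'dune_toe_z':   [3.0, 3.0, 3.0, 3.0],
    'dune_crest_z': [6.0, 6.0, 6.0, 6.0],
}, index=pd.Index(['SITE_A', 'SITE_B', 'SITE_C', 'SITE_D'], name='site_id'))

# Same ordering as storm_regime(): later assignments overwrite earlier ones,
# so the most severe regime that matches wins.
df.loc[df.R_high <= df.dune_toe_z, 'storm_regime'] = 'swash'
df.loc[df.dune_toe_z <= df.R_high, 'storm_regime'] = 'collision'
df.loc[(df.dune_crest_z <= df.R_high) & (df.R_low <= df.dune_crest_z), 'storm_regime'] = 'overwash'
df.loc[(df.dune_crest_z <= df.R_low) & (df.dune_crest_z <= df.R_high), 'storm_regime'] = 'inundation'

print(df.storm_regime)  # SITE_A: swash, SITE_B: collision, SITE_C: overwash, SITE_D: inundation
```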

@ -0,0 +1,137 @@
import logging.config
import os
import numpy as np
import pandas as pd
from scipy.integrate import simps
logging.config.fileConfig('./src/logging.conf', disable_existing_loggers=False)
logger = logging.getLogger(__name__)
def return_first_or_nan(l):
"""
Returns the first value of a list, or np.nan if the list is empty. Used for getting dune toe and crest values.
:param l: list of values
:return: first value of the list, or np.nan
"""
if len(l) == 0:
return np.nan
else:
return l[0]
def volume_change(df_profiles, df_profile_features, zone):
"""
Calculates the volume change between prestorm and poststorm profiles.
:param df_profiles: dataframe of prestorm and poststorm profiles, indexed by (site_id, profile_type, x)
:param df_profile_features: dataframe of dune toe and crest locations, indexed by site_id
:param zone: Either 'swash' or 'dune_face'
:return: dataframe of prestorm volume, poststorm volume and volume change for each site
"""
logger.info('Calculating change in beach volume in {} zone'.format(zone))
df_vol_changes = pd.DataFrame(index=df_profile_features.index)
df_profiles = df_profiles.sort_index()
sites = df_profiles.groupby(level=['site_id'])
for site_id, df_site in sites:
logger.debug('Calculating change in beach volume at {} in {} zone'.format(site_id, zone))
prestorm_dune_toe_x = df_profile_features.loc[df_profile_features.index == site_id].dune_toe_x.tolist()
prestorm_dune_crest_x = df_profile_features.loc[df_profile_features.index == site_id].dune_crest_x.tolist()
# We may not have a dune toe or crest defined, or there may be multiple defined.
prestorm_dune_crest_x = return_first_or_nan(prestorm_dune_crest_x)
prestorm_dune_toe_x = return_first_or_nan(prestorm_dune_toe_x)
# If no dune toe has been defined, Dlow = Dhigh. Refer to Sallenger (2000).
if np.isnan(prestorm_dune_toe_x):
prestorm_dune_toe_x = prestorm_dune_crest_x
# Find last x coordinate where we have both prestorm and poststorm measurements. If we don't do this,
# the prestorm and poststorm values are going to be calculated over different lengths.
df_zone = df_site.dropna(subset=['z'])
x_last_obs = min([max(df_zone.query("profile_type == '{}'".format(profile_type)).index.get_level_values('x'))
for profile_type in ['prestorm', 'poststorm']])
# Where we want to measure pre and post storm volume depends on the zone selected
if zone == 'swash':
x_min = prestorm_dune_toe_x
x_max = x_last_obs
elif zone == 'dune_face':
x_min = prestorm_dune_crest_x
x_max = prestorm_dune_toe_x
else:
logger.warning('Zone argument not properly specified. Please check')
x_min = None
x_max = None
# Now, compute the volume of sand between x_min and x_max for both the prestorm
# and poststorm profiles.
prestorm_vol = beach_volume(x=df_zone.query("profile_type=='prestorm'").index.get_level_values('x'),
z=df_zone.query("profile_type=='prestorm'").z,
x_min=x_min,
x_max=x_max)
poststorm_vol = beach_volume(x=df_zone.query("profile_type=='poststorm'").index.get_level_values('x'),
z=df_zone.query("profile_type=='poststorm'").z,
x_min=x_min,
x_max=x_max)
df_vol_changes.loc[site_id, 'prestorm_{}_vol'.format(zone)] = prestorm_vol
df_vol_changes.loc[site_id, 'poststorm_{}_vol'.format(zone)] = poststorm_vol
df_vol_changes.loc[site_id, '{}_vol_change'.format(zone)] = prestorm_vol - poststorm_vol
return df_vol_changes
def beach_volume(x, z, x_min=np.NINF, x_max=np.inf):
"""
Returns the beach volume of a profile, calculated with Simpson's rule.
:param x: x-coordinates of beach profile
:param z: z-coordinates of beach profile
:param x_min: Minimum x-coordinate to consider when calculating volume
:param x_max: Maximum x-coordinate to consider when calculating volume
:return: volume of the profile between x_min and x_max, or np.nan if no points lie in that range
"""
profile_mask = [x_min < x_coord < x_max for x_coord in x]
x_masked = np.array(x)[profile_mask]
z_masked = np.array(z)[profile_mask]
if len(x_masked) == 0 or len(z_masked) == 0:
return np.nan
else:
return simps(z_masked, x_masked)
def storm_regime(df_observed_impacts):
"""
Returns the dataframe with an additional column of storm impacts based on the Storm Impact Scale. Refer to
Sallenger (2000) for details.
:param df_observed_impacts: dataframe of observed volume changes, indexed by site_id
:return: the same dataframe with a 'storm_regime' column added
"""
logger.info('Getting observed storm regimes')
df_observed_impacts.loc[df_observed_impacts.swash_vol_change < 3, 'storm_regime'] = 'swash'
df_observed_impacts.loc[df_observed_impacts.dune_face_vol_change > 3, 'storm_regime'] = 'collision'
return df_observed_impacts
if __name__ == '__main__':
logger.info('Importing existing data')
data_folder = './data/interim'
df_profiles = pd.read_csv(os.path.join(data_folder, 'profiles.csv'), index_col=[0, 1, 2])
df_profile_features = pd.read_csv(os.path.join(data_folder, 'profile_features.csv'), index_col=[0])
logger.info('Creating new dataframe for observed impacts')
df_observed_impacts = pd.DataFrame(index=df_profile_features.index)
logger.info('Getting pre/post storm volumes')
df_swash_vol_changes = volume_change(df_profiles, df_profile_features, zone='swash')
df_dune_face_vol_changes = volume_change(df_profiles, df_profile_features, zone='dune_face')
df_observed_impacts = df_observed_impacts.join([df_swash_vol_changes, df_dune_face_vol_changes])
# Classify regime based on volume changes
df_observed_impacts = storm_regime(df_observed_impacts)
# Save dataframe to csv
df_observed_impacts.to_csv(os.path.join(data_folder, 'impacts_observed.csv'))
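As a quick check on the Simpson's rule volume calculation used by `beach_volume()`, here is a self-contained sketch with a made-up linear profile whose area can be verified by hand:

```python
import numpy as np
from scipy.integrate import simps

# Made-up beach profile: elevation drops linearly from 4 m to 0 m over 40 m of chainage
x = np.linspace(0, 40, 41)
z = 4.0 - 0.1 * x

# Restrict to a zone of interest, as beach_volume() does with x_min/x_max
mask = (10 < x) & (x < 30)
vol = simps(z[mask], x[mask])  # integral of z dx over the zone

print(vol)  # 36.0 m^3 per metre alongshore (area under z = 4 - 0.1x from x = 11 to 29)
```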

@ -7,6 +7,7 @@ from datetime import datetime, timedelta
import pandas as pd
from mat4py import loadmat
import numpy as np
logging.config.fileConfig('./src/logging.conf', disable_existing_loggers=False)
logger = logging.getLogger(__name__)
@ -152,6 +153,25 @@ def parse_profiles(profiles_mat):
df = pd.DataFrame(rows)
return df
def remove_zeros(df_profiles):
"""
When parsing the pre/post storm profiles, the end of some profiles has trailing constant values of zero. Change
these to NaNs for consistency. A blanket replacement of 0 with NaN is not used because 0 may still be a valid
elevation within a profile.
:param df_profiles: dataframe of profiles, indexed by (site_id, profile_type, x)
:return: the same dataframe with trailing zero elevations set to NaN
"""
df_profiles = df_profiles.sort_index()
groups = df_profiles.groupby(level=['site_id','profile_type'])
for key, _ in groups:
logger.debug('Removing zeros from {} profile at {}'.format(key[1], key[0]))
idx_site = (df_profiles.index.get_level_values('site_id') == key[0]) & \
(df_profiles.index.get_level_values('profile_type') == key[1])
df_profile = df_profiles[idx_site]
x_last_ele = df_profile[df_profile.z!=0].index.get_level_values('x')[-1]
df_profiles.loc[idx_site & (df_profiles.index.get_level_values('x')>x_last_ele), 'z'] = np.nan
return df_profiles
def matlab_datenum_to_datetime(matlab_datenum):
# https://stackoverflow.com/a/13965852
@ -228,6 +248,9 @@ def main():
df_tides.set_index(['site_id', 'datetime'], inplace=True)
df_sites.set_index(['site_id'], inplace=True)
logger.info('Nanning profile zero elevations')
df_profiles = remove_zeros(df_profiles)
logger.info('Outputting .csv files')
df_profiles.to_csv('./data/interim/profiles.csv')
df_tides.to_csv('./data/interim/tides.csv')
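A minimal illustration of what `remove_zeros()` does to a single profile, using a made-up site and elevations:

```python
import numpy as np
import pandas as pd

# Toy profile for one hypothetical site: the survey trails off with padded zeros.
idx = pd.MultiIndex.from_product([['SITE_A'], ['prestorm'], [0, 10, 20, 30, 40]],
                                 names=['site_id', 'profile_type', 'x'])
df = pd.DataFrame({'z': [4.0, 2.5, 1.0, 0.0, 0.0]}, index=idx)

# Same idea as remove_zeros(): find the last non-zero elevation and NaN everything after it.
x_last = df[df.z != 0].index.get_level_values('x')[-1]
df.loc[df.index.get_level_values('x') > x_last, 'z'] = np.nan

print(df)  # z at x=30 and x=40 is now NaN; the valid part of the profile is untouched
```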
