{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# LSTM networks module" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Long Short-Term Memory (LSTM) networks are deep learning models that are well suited to detecting patterns in time series, and especially to finding relationships between several time series. This makes them a good fit for hydrological modelling, as they can learn the relationships between meteorological variables (such as precipitation, temperature and wind speed time series) and streamflow. Furthermore, when catchment descriptors are added as inputs, LSTM models can learn how hydrograph dynamics vary with catchment attributes such as area, slope and land use.\n", "\n", "This page demonstrates how to use `xhydro` to perform simple LSTM modelling on local (single) and regional (multiple) catchments. However, LSTM models are highly flexible, and it would be a monumental task to expose all possible hyperparameters and modelling decisions through an interface. This package should therefore be seen as a starting point for developing custom models on custom data. Eventually, users will want to add their own code, models and methods to the package. Official models from research projects could be added, and any model used to generate operational datasets could be implemented as well. For the time being, we will use extremely simple models to show how the code works, and users can then explore and add functionality as needed." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "2024-07-03 14:26:44.776201: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. 
To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n", "2024-07-03 14:26:44.801639: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n", "To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n", "2024-07-03 14:26:45.225103: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT\n" ] } ], "source": [ "import os\n", "\n", "import pooch\n", "import tensorflow.keras.backend as K\n", "import xarray as xr\n", "\n", "from xhydro_lstm.lstm_controller import (\n", "    control_local_lstm_training,\n", "    control_regional_lstm_training,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Example for application on a single catchment\n", "\n", "In this example, we explore existing data for a single catchment and see how it can be used for LSTM modelling. We first fetch the data from the `xhydro-testdata` test data repository, then process it while explaining what happens in the backend. This package is optimized for NetCDF (`.nc`) data, and more precisely for data loaded as `xarray.Dataset` objects." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset> Size: 1MB\n",
"Dimensions: (time: 12419, watershed: 1)\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 99kB 1979-01-01 ... 2012-12-31\n",
" * watershed (watershed) <U7 28B '01BG005'\n",
"Data variables: (12/19)\n",
" t2m (time, watershed) float64 99kB ...\n",
" tp (time, watershed) float64 99kB ...\n",
" sd (time, watershed) float64 99kB ...\n",
" sp (time, watershed) float64 99kB ...\n",
" d2m (time, watershed) float64 99kB ...\n",
" tcc (time, watershed) float64 99kB ...\n",
" ... ...\n",
" latitude (watershed) float64 8B ...\n",
" forest (watershed) float64 8B ...\n",
" crops (watershed) float64 8B ...\n",
" shrubs (watershed) float64 8B ...\n",
" elevation (watershed) float64 8B ...\n",
" slope (watershed) float64 8B ...<xarray.Dataset> Size: 3MB\n",
"Dimensions: (time: 12419, watershed: 3)\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 99kB 1979-01-01 ... 2012-12-31\n",
" * watershed (watershed) <U7 84B '01BG005' '01BG009' '02QC009'\n",
"Data variables: (12/19)\n",
" t2m (time, watershed) float64 298kB ...\n",
" tp (time, watershed) float64 298kB ...\n",
" sd (time, watershed) float64 298kB ...\n",
" sp (time, watershed) float64 298kB ...\n",
" d2m (time, watershed) float64 298kB ...\n",
" tcc (time, watershed) float64 298kB ...\n",
" ... ...\n",
" latitude (watershed) float64 24B ...\n",
" forest (watershed) float64 24B ...\n",
" crops (watershed) float64 24B ...\n",
" shrubs (watershed) float64 24B ...\n",
" elevation (watershed) float64 24B ...\n",
" slope (watershed) float64 24B ...