Network Assessment Using Geostatistical Approach

Abstract

Rainfall data are crucial for hydrological analysis, planning, engineering and water management, especially in areas where water scarcity limits agricultural growth and supply falls short of daily human needs. Rain gauge networks are a source of information for estimating rainfall over an area. When the position and density of rain gauge stations are inappropriate, rainfall must be estimated at unrecorded points and point data must be generalized to regional data. The aim of this thesis is to evaluate the performance of an existing rain gauge network in the southern part of Sinai and to identify the changes needed to improve it, by quantifying the performance of the network and by adding or relocating rain gauges for optimization purposes. The research is conducted using geostatistical methods suited to the study area and scenario.

Introduction

Rainfall data are crucial for agricultural management, the engineering design of water control structures, flood control and water resource management. “Hydrologic analysis and designs rely significantly on measured rainfall data, including intense storms and flash floods identification based on high resolution estimates of spatial variability in precipitation map” (Putthividhya and Tanaka, 2012). The amount and intensity of rainfall vary with season and geographic location. To manage rainwater, rainfall depth is recorded at rain gauge stations; these records are used to determine the average rainfall over an area and help classify and predict storms based on duration, return period and intensity. Rain gauge networks are designed to estimate the spatio-temporal variation of rainfall over an area. The accuracy of rainfall estimation in an area depends on the density and locations of the gauge stations in the network. “The design of rain-gauge networks is motivated by the need to accurately capture the areal average rainfall in basins” (Shafiei et al., 2014a). When the position and density of rain gauge stations are inappropriate, rainfall must be estimated at unrecorded points and point data must be generalized to regional data. Geostatistics is increasingly preferred because it allows one to capitalize on the spatial correlation between neighboring observations to predict attribute values at unsampled locations (Goovaerts, 2000). “Geostatistics is a branch of statistics that specializes in the analysis and interpretation of any spatially (and temporally) referenced data, but with a focus on inherently continuous features (spatial fields)” (Hengl, 2009).

Weather radars provide better spatio-temporal resolution for rainfall data. However, radar estimates can be biased (because of a bright band, for example), and such biases can lead to possibly large errors in hydrological simulation values (Berne and Krajewski, 2013). Since weather radars still do not cover the whole world, rain gauge networks remain important and worth evaluating (Shafiei et al., 2014a). Rain gauge networks are the only source of rainfall data in some areas, such as southern Sinai. Because the Sinai Peninsula is in an arid region, it faces water scarcity severe enough to prevent the development of animal life and the growth of agriculture. At the same time, flash floods occur due to heavy rainfall caused by synoptic situations, leading to severe damage (Morsy et al., 2017). Water scarcity in southern Sinai motivates the assessment of the available rain gauge network in order to plan water resources and determine the changes required to optimize the network. The current distribution of rainfall stations is not adequate for hydrological calculations; therefore, new stations will need to be established using a probabilistic and statistical approach.

Study Area

The Sinai Peninsula is the only part of Egypt located in Asia. It lies as a land bridge connecting Asia and Africa. It has an area of 61,000 km² and is bordered by the Mediterranean Sea to the north and by the Gulf of Suez and the Gulf of Aqaba to the southwest and southeast. It is linked to the African continent by the Isthmus of Suez, a 125-kilometer-wide strip of land containing the Suez Canal. The eastern isthmus, linking it to the Asian mainland, is around 200 kilometers wide. It is considered one of the coldest provinces in Egypt because of its mountainous topography, which includes two famous mountains: Mount Sinai, which rises to 2285 meters, and Mount Catherine, whose summit reaches 2637 meters. The climate of Sinai is generally hot and dry, with an average rainfall of less than 100 mm per year. The Sinai Peninsula is classified as arid land (desert) since the surface water loss due to evaporation exceeds the surface water gain from precipitation. As a result, there is not enough fresh water to support agriculture and human needs (Morsy et al., 2017).

Basic Background for the Use of Geostatistics

Measurements in the earth and environmental sciences have a spatio-temporal reference, which is determined by geographic location, elevation above the ground surface, time and spatio-temporal support. The science that provides methodological solutions for spatio-temporally referenced data is known as spatio-temporal data analysis (STDA) (Hengl, 2009). STDA can roughly be considered a combination of geoinformation and spatio-temporal statistics. “Geostatistics is a subset of statistics specialized in analysis and interpretation of geographically referenced data” (Goovaerts, 1997). Geostatistics originated in the mining industry, when classical statistics were found unsuitable for estimating ore reserves. Since then, its use has spread rapidly beyond mining into many other areas of the geosciences. Environmental variables vary in space and time and can belong to different domains, e.g. biology, climatology and hydrology. Environmental variables are quantitative or descriptive measures of different environmental features (Hengl, 2009).

Because of past technical limitations and economic constraints, data for such variables are collected by sampling at certain points and then mapped over the entire area. Collected data often contain errors that introduce uncertainty into the outcome, which may result in wrong decisions. Meteorological conditions such as rainfall, temperature and evapotranspiration vary horizontally, vertically and temporally. Horizontal variability refers to changes from one point to another without taking elevation into account. Vertical variability refers to changes caused by height, and temporal variability refers to values changing over time.

Since the amount of precipitation is not the same at all geographic locations and varies with elevation and over time, it is a good example of these kinds of variability. Geostatistics provides a number of techniques for estimating and predicting such variables at unrecorded (unsampled) locations. These prediction models can be deterministic or probabilistic. Thiessen polygons, inverse distance interpolation and splines are deterministic models, while kriging (plain geostatistics), environmental correlation (e.g. regression-based) and Bayesian-based models (e.g. Bayesian Maximum Entropy) are stochastic or probabilistic models.
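To make the deterministic side of this distinction concrete, the following is a minimal sketch of inverse distance interpolation in Python. The gauge coordinates, rainfall values and the power parameter of 2 are hypothetical choices for illustration only; they are not taken from the Sinai network.

```python
import numpy as np

def idw_interpolate(station_xy, station_values, target_xy, power=2.0):
    """Inverse distance weighted estimate at each target point."""
    station_xy = np.asarray(station_xy, dtype=float)
    station_values = np.asarray(station_values, dtype=float)
    target_xy = np.asarray(target_xy, dtype=float)

    # Pairwise distances: one row per target, one column per station.
    diff = target_xy[:, None, :] - station_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    estimates = np.empty(len(target_xy))
    for i, d in enumerate(dist):
        hit = d == 0.0
        if hit.any():
            # Target coincides with a gauge: return the measured value.
            estimates[i] = station_values[hit][0]
        else:
            weights = 1.0 / d ** power
            estimates[i] = np.sum(weights * station_values) / np.sum(weights)
    return estimates

# Hypothetical gauges (coordinates in km, annual rainfall in mm).
gauges = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
rain_mm = [20.0, 40.0, 60.0]
print(idw_interpolate(gauges, rain_mm, [(5.0, 5.0), (0.0, 0.0)]))
```

Unlike kriging, this estimator returns no measure of its own uncertainty, which is one reason the stochastic methods are of more interest for network assessment.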

Literature review

Earlier studies on rain gauge network design and optimization show the use of different geostatistical methods for optimal network design. Some methods based on information entropy have also been presented. However, most of these studies proposed variance reduction methods, which take into account the number and location of gauge stations to achieve higher accuracy with lower cost and station density. Bastin et al. (1984) proposed kriging for accurate estimation of mean rainfall over a catchment. By implementing the method in two river basins in Belgium, they showed that it can be used to select optimal rain gauge locations. Awadallah (2012) applied a method for selecting optimum locations for rain gauge stations in Makkah.

In that study, kriging was used in ArcGIS to interpolate the rainfall data, and entropy principles were applied in R to determine possible optimum locations for rain gauge stations. Anila and Vargheese (2014) presented a method based on variance minimization to reduce the estimation error at unrecorded locations. Their method uses a probabilistic technique (simulated annealing) to select optimal locations. New monitoring stations are added by the simulated annealing optimization tool at sites with large estimation variance, and this step is repeated until the prediction error can no longer be improved. The sites that yield the most noticeable reduction in error are taken as the desired locations for new stations. Likewise, Aziz et al. (2015) presented a method using variance reduction combined with simulated annealing.
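The idea of variance reduction by simulated annealing can be sketched as follows. This is an illustrative simplification, not the procedure of the cited studies: it assumes an exponential semivariogram with a hypothetical sill and practical range, uses the mean ordinary kriging variance over a grid as the objective, and draws each proposal uniformly from a list of candidate sites. All coordinates and parameter values are invented for the example.

```python
import math
import random
import numpy as np

def mean_ok_variance(stations, targets, sill=1.0, practical_range=30.0):
    """Mean ordinary kriging variance over targets (exponential semivariogram)."""
    def gamma(h):
        return sill * (1.0 - np.exp(-3.0 * h / practical_range))

    n = len(stations)
    # Ordinary kriging system: semivariogram matrix bordered by ones.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(np.linalg.norm(stations[:, None] - stations[None, :], axis=2))
    A[n, n] = 0.0
    b = np.ones((n + 1, len(targets)))
    b[:n] = gamma(np.linalg.norm(stations[:, None] - targets[None, :], axis=2))
    solution = np.linalg.solve(A, b)  # weights and Lagrange multiplier per target
    return float(np.sum(solution * b, axis=0).mean())

def anneal_new_station(stations, candidates, targets,
                       steps=200, t0=0.05, cooling=0.98, seed=0):
    """Pick one candidate site that lowers the mean kriging variance."""
    rnd = random.Random(seed)

    def cost(idx):
        return mean_ok_variance(np.vstack([stations, candidates[idx]]), targets)

    current = rnd.randrange(len(candidates))
    current_cost = cost(current)
    best, best_cost = current, current_cost
    temp = t0
    for _ in range(steps):
        proposal = rnd.randrange(len(candidates))
        delta = cost(proposal) - current_cost
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature cools.
        if delta < 0 or rnd.random() < math.exp(-delta / temp):
            current, current_cost = proposal, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temp *= cooling
    return best, best_cost

# Hypothetical existing gauges in a 100 km x 100 km area.
stations = np.array([[10.0, 10.0], [20.0, 80.0], [70.0, 30.0], [90.0, 90.0]])
# Candidate sites must not coincide with existing gauges,
# otherwise the kriging system becomes singular.
axis = np.linspace(5.0, 95.0, 10)
gx, gy = np.meshgrid(axis, axis)
candidates = np.column_stack([gx.ravel(), gy.ravel()])
targets = candidates  # evaluate the variance on the same grid

baseline = mean_ok_variance(stations, targets)
best_idx, best_cost = anneal_new_station(stations, candidates, targets)
print(baseline, best_cost, candidates[best_idx])
```

Repeating the call with the selected site appended to the station array would mimic the "add until no further improvement" loop described above.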

In addition, they suggested the exponential semivariogram model as the best-fitting model, based on the mean error (ME), root mean square error (RMSE), average standardized error (ASE), mean standardized error (MSE) and root mean square standardized error (RMSS). Shaghaghian and Abedini (2013) claim to be the first to apply a combined method using tools from geostatistics, factor analysis and clustering. In this method, their study area (southwest of Iran) is divided into smaller regions sharing similar characteristics using factor analysis and block kriging, and the rain gauges in each region are ranked by their variance using point kriging. The method was found suitable after application in their study area. Cheng et al. (2008) evaluated a network of 27 stations in northern Taiwan, using ordinary kriging variances and the acceptance probability to assess the accuracy of rainfall estimation.
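The five cross-validation statistics named above can be computed as in the sketch below. The definitions follow the convention commonly used in geostatistical software, which is assumed here rather than taken from the cited paper; the observed values, predictions and kriging standard errors are hypothetical.

```python
import numpy as np

def cross_validation_stats(observed, predicted, std_error):
    """Cross-validation summary for a fitted semivariogram model."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    std_error = np.asarray(std_error, dtype=float)

    error = predicted - observed
    standardized = error / std_error
    return {
        "ME": error.mean(),                          # ideally close to 0
        "RMSE": np.sqrt((error ** 2).mean()),        # ideally small
        "ASE": np.sqrt((std_error ** 2).mean()),     # ideally close to RMSE
        "MSE": standardized.mean(),                  # ideally close to 0
        "RMSS": np.sqrt((standardized ** 2).mean()), # ideally close to 1
    }

# Hypothetical leave-one-out results at four gauges (mm).
stats = cross_validation_stats(
    observed=[10.0, 20.0, 30.0, 40.0],
    predicted=[12.0, 18.0, 33.0, 37.0],
    std_error=[2.0, 2.0, 3.0, 3.0],
)
print(stats)
```

Candidate semivariogram models (spherical, exponential, Gaussian and so on) can then be compared by computing these statistics for each and preferring the one closest to the ideal values.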

Cheng et al. (2008) also proposed a sequential algorithm for prioritizing existing rain gauges to define a base network. Their results showed that a base network of only two thirds of the rain gauges provides the same level of accuracy as the complete network. Likewise, Shafiei et al. (2014b) combined the probabilistic approach with a geographic information system (GIS) framework to assess accuracy in terms of acceptance probability, and presented a simple equation for calculating the acceptance probability that facilitates the application of the probabilistic approach in a GIS environment. This is a useful method for analyzing the number and location of rain gauge stations and quantifying the contribution of each station. Their results show that only 21 of 33 gauge stations contribute significantly to the accuracy and that, by applying an augmentation algorithm, an optimized network of 28 rain gauges can be formed.
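One way to read the acceptance probability is as the chance that the kriging error at a point stays within a tolerance expressed as a multiple of the standard deviation of the rainfall field. The sketch below assumes a zero-mean, normally distributed kriging error; the multiplier k and the numbers used are hypothetical, and the exact equation in the cited papers may differ in form.

```python
import math

def acceptance_probability(kriging_variance, field_std, k=1.0):
    """P(|error| < k * field_std) for a zero-mean normal kriging error."""
    if kriging_variance <= 0.0:
        return 1.0  # no estimation error at this point
    kriging_std = math.sqrt(kriging_variance)
    return math.erf(k * field_std / (kriging_std * math.sqrt(2.0)))

# Hypothetical values: kriging variance 4 mm^2, field standard deviation 2 mm.
p_far = acceptance_probability(kriging_variance=4.0, field_std=2.0)
# A point closer to a gauge has a smaller kriging variance.
p_near = acceptance_probability(kriging_variance=1.0, field_std=2.0)
print(p_far, p_near)
```

Mapping this quantity over a grid and counting the share of cells above a chosen threshold gives a single number for comparing network configurations.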

Software

The choice of software depends on the method selected later in this research. The software most likely to be used is ArcGIS for visualization, interpolation and geostatistical mapping, together with MATLAB and some of its extensions if required. Python programming is also expected to be useful for automating the process for future use.

Workflow

The first step of this research is to prepare the data, either by searching for further data or by processing the available data for further improvement. As mentioned in the data part, the data required vary with the method used. In parallel, therefore, a review of methods is needed to find one that is applicable given the conditions of the study area. The next steps depend on the method chosen for evaluating the network. Overall, the main approach of this research is to:

a. Analyze the number and location of rain gauge stations.
b. Quantify the contribution of each rain gauge to the overall accuracy of rainfall estimation over the study area.
c. Relocate the rain gauges that contribute least, or remove them and add new rain gauges at the most suitable locations, to improve accuracy.

Once a suitable method has been identified and the data it requires have been prepared, the current rain gauge network is evaluated to address the objectives. This can be done using different geostatistical methods, which still need to be investigated. The most likely method is ordinary kriging, which interpolates the point rainfall data over the study area and also gives the estimation error. Once the location and number of current stations have been assessed and their contribution quantified with the chosen method, further research continues from the result of that step. Methods such as acceptance probability can be applied to find the accuracy at ungauged points. For relocating or adding rain gauges, algorithms such as simulated annealing and the sequential algorithm defined by Cheng et al. (2008) are suitable. Finally, for verification purposes, the new augmented network should also be analyzed, so that evidence is available of its improved performance.
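As a rough illustration of how this workflow could be automated in Python, the sketch below computes the ordinary kriging variance on a grid, scores each gauge by how much the mean variance rises when it is left out, and adds one gauge at the grid cell with the largest variance. The exponential semivariogram, its sill and practical range, the station coordinates and the greedy rule are all assumptions made for the example; they stand in for whatever model and algorithm the study finally adopts.

```python
import numpy as np

def exp_semivariogram(h, sill=1.0, practical_range=30.0):
    """Exponential semivariogram without nugget (assumed model)."""
    return sill * (1.0 - np.exp(-3.0 * h / practical_range))

def ok_variance(stations, targets, **vario):
    """Ordinary kriging variance at each target point."""
    stations = np.asarray(stations, dtype=float)
    targets = np.asarray(targets, dtype=float)
    n = len(stations)
    d_ss = np.linalg.norm(stations[:, None] - stations[None, :], axis=2)
    d_st = np.linalg.norm(stations[:, None] - targets[None, :], axis=2)

    # Kriging matrix: semivariogram values bordered by the unbiasedness row.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_semivariogram(d_ss, **vario)
    A[n, n] = 0.0
    b = np.ones((n + 1, len(targets)))
    b[:n] = exp_semivariogram(d_st, **vario)

    solution = np.linalg.solve(A, b)  # weights and Lagrange multiplier
    return np.sum(solution * b, axis=0)

def station_contributions(stations, targets, **vario):
    """Rise in mean kriging variance when each gauge is removed in turn."""
    stations = np.asarray(stations, dtype=float)
    base = ok_variance(stations, targets, **vario).mean()
    gains = []
    for i in range(len(stations)):
        reduced = np.delete(stations, i, axis=0)
        gains.append(ok_variance(reduced, targets, **vario).mean() - base)
    return np.array(gains)

def add_station_greedy(stations, candidates, **vario):
    """Add one gauge at the candidate with the largest kriging variance."""
    variance = ok_variance(stations, candidates, **vario)
    best = int(np.argmax(variance))
    return np.vstack([stations, candidates[best]]), best

# Hypothetical gauges and evaluation grid (coordinates in km).
stations = np.array([[10.0, 10.0], [20.0, 80.0], [70.0, 30.0], [85.0, 85.0]])
gx, gy = np.meshgrid(np.linspace(0.0, 100.0, 11), np.linspace(0.0, 100.0, 11))
grid = np.column_stack([gx.ravel(), gy.ravel()])

contrib = station_contributions(stations, grid)
augmented, picked = add_station_greedy(stations, grid)
print("contribution per gauge:", contrib)
print("new gauge at:", grid[picked])
print("mean variance before/after:",
      ok_variance(stations, grid).mean(), ok_variance(augmented, grid).mean())
```

Gauges with the smallest contribution are the natural candidates for relocation, and comparing the mean variance before and after augmentation provides the verification step mentioned above.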

References

ANILA, P. & VARGHEESE, K. 2014. Raingauge Network Augmentation Based on Geostatistical Analysis and Simulated Annealing. International Journal of Scientific & Engineering Research, 5 (7).

AWADALLAH, A. G. 2012. Selecting optimum locations of rainfall stations using kriging and entropy. International Journal of Civil & Environmental Engineering IJCEE-IJENS, 12 (1), 36-41.

AZIZ, M. K. B. M., YUSOF, F., DAUD, Z. M., YUSOP, Z. & KASNO, M. A. Redesigning rain gauges network in Johor using geostatistics and simulated annealing. AIP Conference Proceedings, 2015. AIP, 270-277.

BASTIN, G., LORENT, B., DUQUE, C. & GEVERS, M. 1984. Optimal estimation of the average areal rainfall and optimal selection of rain gauge locations. Water Resources Research, 20 (4), 463-470.

BERNE, A. & KRAJEWSKI, W. F. 2013. Radar for hydrology: Unfulfilled promise or unrecognized potential? Advances in Water Resources, 51, 357-366.

CHENG, K. S., LIN, Y. C. & LIOU, J. J. 2008. Rain‐gauge network evaluation and augmentation using geostatistics. Hydrological Processes: An International Journal, 22 (14), 2554-2564.

GOOVAERTS, P. 1997. Geostatistics for Natural Resources Evaluation (Applied Geostatistics), New York, Oxford University Press.

GOOVAERTS, P. 2000. Geostatistical approaches for incorporating elevation into the spatial interpolation of rainfall. Journal of Hydrology, 228, 113–129.

HENGL, T. 2009. A Practical Guide to Geostatistical Mapping, Amsterdam.

MORSY, M., KAMEL, A. M. & IBRAHIM, M. M. ESTIMATION OF YEARLY RAINFALL WATER AMOUNT OVER. 2017 Cairo. Al Azhar University.

PUTTHIVIDHYA, A. & TANAKA, K. 2012. Optimal rain gauge network design and spatial precipitation mapping based on geostatistical analysis from colocated elevation and humidity data. International Journal of Environmental Science and Development, 3 (2), 124.

SHAFIEI, M., GHAHRAMAN, B., SAGHAFIAN, B. & PANDE, S. 2014a. Assessment of rain-gauge networks using a probabilistic GIS based approach. Hydrology Research, 45 (4-5).

SHAFIEI, M., GHAHRAMAN, B., SAGHAFIAN, B., PANDE, S., GHARARI, S. & DAVARY, K. 2014b. Assessment of rain-gauge networks using a probabilistic GIS based approach. Hydrology Research, 45 (4-5), 551-562.

SHAGHAGHIAN, M. & ABEDINI, M. 2013. Rain gauge network design using coupled geostatistical and multivariate techniques. Scientia Iranica, 20 (2), 259-269.
