Title: Space-time analysis of meteorological parameters with geostatistical and machine learning methods.
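Seven-member Examination Committee:
1. Professor Dionysios Christopoulos (supervisor), School of Electrical and Computer Engineering
2. Professor Georgios Karatzas (School of Chemical and Environmental Engineering, Technical University of Crete)
3. Professor Panagiotis Partsinevelos (School of Mineral Resources Engineering, Technical University of Crete)
4. Professor Nikolaos Nikolaidis (School of Chemical and Environmental Engineering, Technical University of Crete)
5. Associate Professor Anastasia Baxevani (Department of Mathematics and Statistics, University of Cyprus)
6. Associate Professor Tryfonas Daras (School of Chemical and Environmental Engineering, Technical University of Crete)
7. Assistant Professor Emmanouil Varouchakis (School of Mineral Resources Engineering, Technical University of Crete)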
Technological advancements have increased the availability of spatiotemporal data. However, meteorological data are usually non-Gaussian and correlated in space and time. In this dissertation, state-of-the-art geostatistical and machine-learning methodologies were used to analyze large-scale non-Gaussian meteorological space-time data. We carried out a series of numerical investigations using 26 surface variables from the ERA5 reanalysis data sets collected for 65 grid locations on the island of Crete, Greece. The data sets correspond to multiple temporal scales (hourly to annual) and span the period from 1979 to 2019.
Four distinct approaches were implemented for the analysis of the meteorological parameters:
- The ERA5 data set was used for the estimation of the standardized precipitation index (SPI) and the standardized precipitation evapotranspiration index (SPEI) to reveal the spatiotemporal patterns of drought in Crete.
- Gaussian Anamorphosis with Hermite polynomials (GAH) was employed to transform non-Gaussian precipitation data into normally distributed variables. Ten processing scenarios were investigated and their performance with respect to spatial interpolation (based on Ordinary kriging) was evaluated. The scenarios include the application or exclusion of GAH with varying polynomial degrees, the utilization of either the exponential or Spartan variogram models, and the incorporation or omission of Monte Carlo simulations.
- Twelve machine learning (ML) techniques were compared for the classification of precipitation data into eight classes. Twenty-six (26) numerical and categorical variables were used in a spatiotemporal predictive framework for precipitation. Due to pronounced class imbalance (dominance of "no rain" events), we first divided the data into two classes that represent the absence or occurrence of precipitation events. Then, the occurrence data set was split into five different classes to characterize the intensity of precipitation events.
- Finally, we applied the Stochastic Local Interaction (SLI) model to perform temporal (precipitation, temperature and solar radiation) and spatiotemporal (precipitation and temperature) estimation of missing values (data gaps).
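The Gaussian anamorphosis step in the second approach can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the rank-based normal-score rule `rank / (n + 1)`, the least-squares fit of the Hermite coefficients, the polynomial degree of 10, and the synthetic gamma-distributed sample are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist
from numpy.polynomial import hermite_e as H  # probabilists' Hermite polynomials


def normal_scores(z):
    """Map data to standard-normal scores via their empirical ranks (assumed rule)."""
    z = np.asarray(z, dtype=float)
    n = z.size
    ranks = np.argsort(np.argsort(z)) + 1      # ranks 1..n
    probs = ranks / (n + 1.0)                  # keeps probabilities strictly inside (0, 1)
    inv_cdf = NormalDist().inv_cdf
    return np.array([inv_cdf(p) for p in probs])


def fit_anamorphosis(z, degree=10):
    """Least-squares fit of z ~ sum_k c_k He_k(y), where y are the normal scores of z."""
    y = normal_scores(z)
    coeffs = H.hermefit(y, z, degree)
    return y, coeffs


# Synthetic, skewed stand-in for precipitation (NOT ERA5 data).
rng = np.random.default_rng(0)
z = rng.gamma(shape=2.0, scale=10.0, size=500)

y, coeffs = fit_anamorphosis(z, degree=10)
z_back = H.hermeval(y, coeffs)                 # back-transform from Gaussian scores
```

In such a workflow, interpolation (for example, kriging) would be carried out on the Gaussian scores `y`, and the fitted expansion would map predictions back to the original scale. The degree is the quantity varied between scenarios in the text (20 versus 35).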
The most important conclusions derived in this dissertation are as follows:
- The dry climate of Crete was confirmed by the estimation of the SPI and SPEI drought indices. It was found that the eastern part of the island is more prone to desertification than the north-western part. Moreover, a temperature-inclusive drought index was shown to be more appropriate than a purely precipitation-based index for the study area.
- Using higher-order (35 versus 20) polynomials in GAH has little effect on the cross-validation results for the monthly total precipitation data. In addition, the incorporation of Monte Carlo simulations does not universally improve the statistical measures.
- With respect to the classification of hourly precipitation data, the method of Random Forests (Bagged Ensemble Trees) performs best for both the "Binary" ("rain" versus "no rain") and the "Only Rain" classification cases.
- SLI is a competitive method for interpolating large temporal and spatiotemporal data: it is fast, and it performed very well (compared to nearest-neighbor interpolation) across all the hourly data sets (temperature, precipitation, and solar radiation).
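For orientation, the nearest-neighbor baseline mentioned in the last conclusion can be sketched for a single time series as follows. This is only the simple reference method, not the SLI model. The function name `fill_gaps_nearest`, the use of `None` to mark gaps, and the example values are hypothetical.

```python
from typing import List, Optional


def fill_gaps_nearest(series: List[Optional[float]]) -> List[float]:
    """Fill missing values (None) with the nearest observed value in time.

    Ties between an earlier and a later neighbor go to the earlier one.
    """
    observed = [i for i, v in enumerate(series) if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    filled = []
    for i, v in enumerate(series):
        if v is not None:
            filled.append(v)
        else:
            # Index of the closest observed time step; min() keeps the first on ties.
            j = min(observed, key=lambda k: abs(k - i))
            filled.append(series[j])
    return filled


# Made-up hourly temperatures with gaps (not ERA5 data).
hourly_temp = [14.2, None, None, 15.1, None, 16.0]
filled = fill_gaps_nearest(hourly_temp)
# filled == [14.2, 14.2, 15.1, 15.1, 15.1, 16.0]
```

A gap-filling method is typically assessed by hiding known values, filling them, and comparing the estimates with the hidden values through cross-validation measures.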
The present study investigates a variety of methodological approaches for the analysis of non-Gaussian, large-scale meteorological variables. It provides an extensive analysis of precipitation, temperature, and solar radiation for the island of Crete using the ERA5 reanalysis data set. The meteorological data used involve multiple timescales. Two drought indices are evaluated and compared in order to assess the effect of warming trends on drought events. Various data processing scenarios that combine GAH, kriging interpolation and bootstrapping are studied and assessed. In addition, a comparison of twelve machine learning methods for the classification of precipitation data supported by 26 meteorological variables is conducted. Lastly, the computationally efficient SLI models are herein applied for the first time to spatiotemporal precipitation and solar radiation data.