A Study of the Homogeneity of Climatic Data for Rain, Temperature and Humidity for Nineveh Governorate

In recent years, climate change has had a growing impact on the hydrological cycle, producing continuous changes in climate on both temporal and spatial scales. Any study of climate or hydrological change must therefore first test the homogeneity of the data it uses, and this study aimed to verify the credibility and homogeneity of such data. Eight climatic stations distributed across Nineveh Governorate were selected, with climatic data comprising rainfall, maximum and minimum temperatures, and maximum and minimum humidity for the period 1990-2020. Four statistical methods were used at a significance level of 5%: the Von Neumann ratio test (VON), the Standard Normal Homogeneity Test (SNHT), the Buishand range test (BR) and the Pettitt test. The results showed that monthly rainfall was homogeneous at all stations except in three cases: February (month 2) at Tal Abta, February at Ba'aj, and November (month 11) at Al-Sheikhan. Temperature and humidity were heterogeneous at most stations: the percentage of heterogeneous months was 35% and 42% for maximum and minimum temperature, respectively, and 18% and 14% for maximum and minimum humidity, respectively. The SNHT and VON tests were the most sensitive to breaks in the series, and the Pettitt test was the least sensitive in most cases. The heterogeneous climatic data were corrected with the double mass curve method and thereby converted into homogeneous data.


INTRODUCTION
Climate change affects the hydrological cycle through factors such as rising temperatures, increased evaporation rates, and the atmosphere's greater capacity to hold moisture. These climate fluctuations also lead to extreme events with significant social and economic consequences. Precise and dependable climate data are therefore crucial for climate assessment, modeling, and forecasting, and homogeneous time series of climate data are particularly vital for hydrological and climatic studies. Accordingly, the data must be subjected to a homogeneity test before any study in the fields of water resources, hydrology, and climate change, so that the scientific findings are correct. Changes in a time series, whether in its mean or its variance, may be caused by relocation of the station, the monitoring methods, or the measurement techniques used, and these factors can lead to incorrect conclusions [1]. A heterogeneous data set does not give reliable results in statistical analyses, so the homogeneity of the time series must be verified with statistical methods in order to obtain accurate data for climate studies and weather forecasts [2]. The relative method was used to estimate missing values; it gives good results because it relies on neighboring stations, especially when the candidate station is strongly correlated with a sufficient number of neighboring stations [3].
Many such studies have been conducted in the Middle East. In [4], the reliability of the annual rainfall series in Iraq was examined for the period 1981 to 2010 at 36 stations using the SNHT, BR, Pettitt and VON tests. The results showed that 70% of the stations were homogeneous at a significance level of 5%, while only two stations were doubtful and three were rejected. In Turkey, [5] used the SNHT, Pettitt and Swed-Eisenhart tests to detect heterogeneity in the average annual temperatures from 267 meteorological stations for the period 1968-1998. After substituting the missing values and running the tests, the researchers noted that the SNHT and Pettitt tests were more sensitive in identifying heterogeneity. The authors of [1] assessed the reliability of the monthly rain series of 160 stations in Turkey for the period 1974 to 2014 using the four homogeneity tests and found that 5 (44) stations out of 160 were rejected (non-homogeneous). They also reported heterogeneity percentages of 16%, 8%, 14% and 16% for the SNHT, BR, VON and Pettitt tests, respectively. The authors of [6] applied the four homogeneity tests to the annual rain series of 20 meteorological stations in Iraq for the period 1981-2010; the results showed that 5% of the stations were doubtful, 45% suspect and 50% useful. They also showed that the years 1998 and 1999 account for 28% of the break years and 1997 for 21%, while 51% of the break years fell between 1991 and 2004. Also in Iraq, [7] studied the monthly rain series of 13 meteorological stations for the period 1970-2010 using the Pettitt and BR tests. They concluded that most of the stations were homogeneous: the Pettitt test found no breakpoint, while the Buishand test indicated a breakpoint only at the Karbala station in March 1998. The researchers corrected the heterogeneous data using the double mass curve method. After applying the four homogeneity tests to the rain series of 18 meteorological stations in Iraq for the period 1981 to 2018, [8] concluded that only two of the stations were doubtful and 16 were useful. Likewise, [9] applied the four homogeneity tests (VON, Pettitt, Buishand and SNHT) at a significance level of 5% to the rain series and the monthly and annual temperatures of 9 stations in Kurdistan, Iraq, for the period 1981 to 2020. The monthly rain series were homogeneous at most of the stations, whereas the monthly temperatures were heterogeneous at most of the stations.
This study aims to assess the homogeneity of the monthly climatic data series for the period 1990-2020 at selected meteorological stations in Nineveh Governorate, northern Iraq, using four tests: the Von Neumann ratio, BR, Pettitt and SNHT tests. The main motivation is the need for reliable data on rainfall, maximum and minimum temperatures, and maximum and minimum humidity for Nineveh Governorate. As noted above, most researchers based their homogeneity tests on annual or seasonal time series. The current study relies instead on monthly data, and it also tests climatic factors other than rainfall, namely maximum and minimum temperature and maximum and minimum humidity.

Study location and climatic data:
Nineveh Governorate is located in the northwest of Iraq, with an area of 32,308 square kilometers, between longitudes 41° 25' and 44° 15' and latitudes 34° 15' and 37° 30', as shown in Figure 1. The region is characterized by climatic diversity.
Monthly rainfall data were obtained for eight meteorological stations. Six of them (Mosul, Rabia, Tal Afar, Sinjar, Tal Abta, Ba'aj) were obtained from the General Authority for Meteorology and Seismic Monitoring/Iraqi Ministry of Transportation, and two (Sheikhan, Al-Hamdaniya) from the Nineveh Agriculture Directorate. The other climatic data, namely the maximum and minimum temperatures and the maximum and minimum humidity for six meteorological stations (Mosul, Rabia, Tal Afar, Sinjar, Tal Abta, Ba'aj), were also obtained from the General Authority for Meteorology and Seismic Monitoring.

Estimate the missing climate data
Hydrological studies require a long and complete set of climatic data. The climate records of the meteorological stations in the study area (Nineveh Governorate) contain missing data, as shown in Table (1). Missing data in time series are among the most common problems in scientific studies, so these values must be estimated with statistical methods before the study can be completed. In this study, the Arithmetic Mean and Normal Ratio Methods were used.
Note: T: temperature, RH: humidity.

Arithmetic Mean Method
This method is used to find the missing value when the annual rainfall rate at the neighboring stations is within 10% of the annual rainfall rate at the station where the missing values are to be found. It is calculated by the following equation [9]:
$$P_x = \frac{1}{M}\sum_{i=1}^{M} P_i \qquad (1)$$
where: $P_x$: the missing value to be estimated at station (x). $P_i$: rainfall value measured at nearby station i. $M$: the number of nearby stations.
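As a minimal sketch of the arithmetic mean method, the snippet below averages the neighbouring stations' values and checks the 10% condition on the annual rates. The function names and the sample values are hypothetical and serve only as illustration.

```python
def arithmetic_mean_fill(neighbor_values):
    """Estimate a missing value as the plain mean of the neighbouring stations."""
    if not neighbor_values:
        raise ValueError("at least one neighbouring station is required")
    return sum(neighbor_values) / len(neighbor_values)


def within_ten_percent(target_annual, neighbor_annuals):
    """True when every neighbour's annual rate is within 10% of the target's."""
    return all(abs(n - target_annual) <= 0.10 * target_annual
               for n in neighbor_annuals)


# Hypothetical monthly rainfall (mm) observed at three neighbouring stations.
estimate = arithmetic_mean_fill([42.0, 38.5, 45.5])
```

When `within_ten_percent` returns False, the normal ratio method is the one the text prescribes instead.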

Normal Ratio Method
This method is adopted when the annual rainfall rate at a neighboring station differs by more than 10% from the annual rainfall rate at the station for which the missing values are to be calculated, according to the following equation [9]:
$$P_x = \frac{1}{m}\sum_{i=1}^{m} \frac{N_x}{N_i} P_i$$
where: $P_x$: the missing value to be estimated at station (x). $P_i$: rainfall value measured at nearby station i. $m$: the number of nearby stations. $N_x$: annual rainfall rate at the estimated station. $N_i$: annual rainfall rate at nearby station i.
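A minimal sketch of the normal ratio method follows: each neighbour's value is weighted by the ratio of the target station's annual rate to that neighbour's annual rate. The function name and the numbers are hypothetical.

```python
def normal_ratio_fill(neighbor_values, neighbor_annuals, target_annual):
    """Estimate a missing value with the normal ratio method.

    neighbor_values  -- P_i, values observed at the nearby stations
    neighbor_annuals -- N_i, annual rainfall rate at each nearby station
    target_annual    -- N_x, annual rainfall rate at the station being filled
    """
    if not neighbor_values or len(neighbor_values) != len(neighbor_annuals):
        raise ValueError("need one annual rate per neighbouring value")
    m = len(neighbor_values)
    weighted = sum(target_annual / n_i * p_i
                   for p_i, n_i in zip(neighbor_values, neighbor_annuals))
    return weighted / m


# Hypothetical example: two neighbours that are wetter than the target station.
estimate = normal_ratio_fill([40.0, 60.0], [400.0, 600.0], 300.0)
```

In this example both neighbours contribute 30 mm after weighting, so the estimate is 30 mm.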

Homogeneity test
In this study, four homogeneity tests widely used on climatic data were applied at a significance level of 0.05: the Buishand, Von Neumann, SNHT and Pettitt tests. The 31-year records of rainfall, maximum and minimum temperatures, and maximum and minimum humidity were analyzed. One of the most important characteristics of these tests is that they complement each other. The Buishand, SNHT and Pettitt tests examine whether there is a shift in the time series and locate the breakpoint. The Von Neumann test examines whether the series departs from a random distribution, but it gives no information about the location of the break. All of these tests are based on a null hypothesis (H0) and an alternative hypothesis (H1).

Pettitt Test
This non-parametric test was introduced by [10] to detect a change point in the middle of a time series on a monthly or annual scale. Under the null hypothesis, the data are independent and randomly distributed, which means that they all follow the same distribution. Under the alternative hypothesis, a sudden change has occurred. The methodology used in this test is as follows:
1. The observations (X) are ranked from 1 to N (i.e. X1, X2, …, XN).
2. The value of Vi is estimated from:
$$V_i = N + 1 - 2R_i, \qquad i = 1, 2, \ldots, N$$
where Ri is the rank of Xi in the sample of N observations.
3. The cumulative statistic Uk and the test statistic KN are obtained from:
$$U_k = U_{k-1} + V_k, \qquad U_0 = 0, \qquad K_N = \max_{1 \le k \le N} |U_k|$$
4. The approximate significance probability is:
$$P_{OA} = 2\exp\left(\frac{-6K_N^2}{N^3 + N^2}\right) \qquad (7)$$
The null hypothesis is rejected when the value of POA is less than α, where α is the significance level.
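The procedure can be sketched in a few lines of Python. This is only an illustration that assumes the commonly used rank form of the test (V_i = N + 1 - 2R_i, U_k as the running sum of V_i, K = max|U_k|, and the approximation p = 2·exp(-6K²/(N³ + N²))); the function names and the sample series are hypothetical.

```python
import math


def ranks(values):
    """Ranks 1..N, with tied observations sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            r[order[pos]] = avg
        i = j + 1
    return r


def pettitt(values):
    """Return (k, K, p): break after the k-th value, statistic K, approximate p."""
    n = len(values)
    r = ranks(values)
    u, best_k, best_abs = 0.0, 0, -1.0
    for k in range(1, n):            # U_N is always 0, so it is skipped
        u += n + 1 - 2 * r[k - 1]    # U_k = U_{k-1} + V_k
        if abs(u) > best_abs:
            best_k, best_abs = k, abs(u)
    p = min(1.0, 2.0 * math.exp(-6.0 * best_abs ** 2 / (n ** 3 + n ** 2)))
    return best_k, best_abs, p


# Hypothetical series with an obvious upward shift after the 15th value.
series = list(range(1, 16)) + list(range(101, 116))
k, K, p = pettitt(series)
```

For this artificial series the break is placed after the 15th observation and p is far below 0.05, so the null hypothesis would be rejected.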

Standard Normal Homogeneity Test (SNHT)
This test, which was developed by [11], is one of the most important homogeneity tests and is used frequently in climate studies. It is a flexible and easy-to-use method, and its null hypothesis is the same as that of the Pettitt test. It is most sensitive to changes near the beginning or end of the time series. It is calculated as follows:
$$T(k) = k\,\bar{z}_1^{\,2} + (N-k)\,\bar{z}_2^{\,2}, \qquad k = 1, 2, \ldots, N-1$$
where:
$$\bar{z}_1 = \frac{1}{k}\sum_{i=1}^{k}\frac{Y_i-\bar{Y}}{s}, \qquad \bar{z}_2 = \frac{1}{N-k}\sum_{i=k+1}^{N}\frac{Y_i-\bar{Y}}{s}$$
$\bar{Y}$: arithmetic mean of the Yi values. $s$: standard deviation of the Yi values.
If the break is located at point K, T(k) reaches its maximum value near k = K, and the test statistic T0 is found from the following equation:
$$T_0 = \max_{1 \le k < N} T(k)$$
The null hypothesis is rejected when the value of T0 is greater than the critical value.
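A minimal sketch of the SNHT statistic is given below. It assumes the usual single-shift form, T(k) = k·z̄1² + (N-k)·z̄2² with T0 = max T(k), and it standardises with the sample standard deviation (an assumption; implementations differ on this point). The function name and data are hypothetical, and the critical value must still be taken from published tables for the series length.

```python
import statistics


def snht(values):
    """Return (k, T0): most likely break after the k-th value and the statistic T0."""
    n = len(values)
    mean = sum(values) / n
    s = statistics.stdev(values)      # assumption: sample standard deviation (n - 1)
    if s == 0:
        raise ValueError("constant series: the statistic is undefined")
    z = [(y - mean) / s for y in values]
    total = sum(z)
    running, best_k, t0 = 0.0, 0, -1.0
    for k in range(1, n):             # both segments must be non-empty
        running += z[k - 1]
        z1 = running / k              # mean of standardised values before the break
        z2 = (total - running) / (n - k)   # mean of standardised values after it
        t = k * z1 ** 2 + (n - k) * z2 ** 2
        if t > t0:
            best_k, t0 = k, t
    return best_k, t0


# Hypothetical series: ten values of 0 followed by ten values of 10.
k, t0 = snht([0.0] * 10 + [10.0] * 10)
```

Here the maximum of T(k) falls at k = 10, the true position of the shift.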

Buishand test
In the Buishand test [12], the data are assumed to be normally distributed, and under the null hypothesis they are independent and randomly distributed. This test is sensitive to breaks in the middle of the time series. It is calculated as follows:
$$S_k^{*} = \sum_{i=1}^{k}\left(X_i - \bar{x}\right), \qquad k = 1, 2, \ldots, N \qquad (13)$$
where: $X_i$: time series values. $\bar{x}$: mean of the series. $k$: the position in the series at which a break is tested. The test statistic is $Q = \max_{k} |S_k^{*}| / s$, where $s$ is the standard deviation of the series.
The null hypothesis is accepted when the value of $Q/\sqrt{N}$ is less than the critical value [12].
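The following sketch computes the Buishand statistic Q/√N from the adjusted partial sums. It assumes the usual form in which the partial sums are rescaled by the population standard deviation (divisor N); the function name and data are hypothetical, and the critical value still has to be taken from the published tables.

```python
import math
import statistics


def buishand(values):
    """Return (k, q_over_sqrt_n): position of the largest |S_k*| and Q / sqrt(N)."""
    n = len(values)
    mean = sum(values) / n
    s = statistics.pstdev(values)     # assumption: population standard deviation
    if s == 0:
        raise ValueError("constant series: the statistic is undefined")
    partial, best_k, best_abs = 0.0, 0, -1.0
    for k in range(1, n + 1):
        partial += values[k - 1] - mean   # adjusted partial sum S_k*
        if abs(partial) > best_abs:
            best_k, best_abs = k, abs(partial)
    q = best_abs / s
    return best_k, q / math.sqrt(n)


# Hypothetical series: ten values of 0 followed by ten values of 10.
k, stat = buishand([0.0] * 10 + [10.0] * 10)
```

For this series the largest partial sum occurs at k = 10, and the statistic equals 10/√20.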

Von Neumann ratio test
This test is also based on a null hypothesis [13]: under the null hypothesis, the data are independent and randomly distributed, whereas under the alternative hypothesis, the time series is not randomly distributed. Unlike the other three tests, it cannot determine the point where the homogeneity breaks down, i.e., the breakpoint. In this test, VON is the ratio of the mean squared successive difference to the variance of the series.
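As an illustration, the Von Neumann ratio can be computed as the sum of squared successive differences divided by the sum of squared deviations from the mean. This is a sketch assuming that standard form; the function name and data are hypothetical. For a random series the expected value of the ratio is 2, and a series with a break tends to give a lower value.

```python
def von_neumann_ratio(values):
    """Sum of squared successive differences over sum of squared deviations."""
    n = len(values)
    mean = sum(values) / n
    numerator = sum((values[i] - values[i + 1]) ** 2 for i in range(n - 1))
    denominator = sum((x - mean) ** 2 for x in values)
    if denominator == 0:
        raise ValueError("constant series: the ratio is undefined")
    return numerator / denominator


# Hypothetical series with a single shift: the ratio falls well below 2.
shifted = von_neumann_ratio([0.0] * 10 + [10.0] * 10)
# Hypothetical rapidly alternating series: the ratio rises above 2.
alternating = von_neumann_ratio([0.0, 10.0] * 10)
```

The ratio carries no index, which is why the test cannot locate the break.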

Homogeneity test classification
According to [15], the results obtained from the four tests are classified into three categories:
1. Useful: zero or one of the four tests rejects the null hypothesis at a significance level of 5%.
2. Doubtful: two of the four tests reject the null hypothesis at a significance level of 5%.
3. Suspect: three or four of the tests reject the null hypothesis at a significance level of 5%.
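The classification rule above can be sketched as a small function that counts how many of the four tests reject the null hypothesis. The function name and the input convention (one boolean per test, True meaning rejection) are assumptions made for illustration.

```python
def classify_series(rejections):
    """Classify a series from four test outcomes (True = null hypothesis rejected)."""
    outcomes = list(rejections)
    if len(outcomes) != 4:
        raise ValueError("expected the outcomes of exactly four tests")
    rejected = sum(bool(r) for r in outcomes)
    if rejected <= 1:
        return "useful"
    if rejected == 2:
        return "doubtful"
    return "suspect"


# Hypothetical outcomes in the order (VON, Buishand, SNHT, Pettitt).
label = classify_series([True, False, True, False])
```

In this example two tests reject the null hypothesis, so the series is labelled doubtful.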

Double mass curve analysis
The double mass curve method is used to convert inhomogeneous time series, that is, doubtful or rejected data, into homogeneous data. Errors in climate data occur either randomly or systematically. An error is random when a reading is taken incorrectly, whereas a systematic error is due to a change in the location of the station or a misunderstanding of the measuring instrument [16].
This technique is widely used to detect and correct inhomogeneous time series and convert them into homogeneous series that have a single slope. The method corrects the time series when more than one slope appears, by adjusting the data before the breakpoint to match the data after the breakpoint through multiplication or division by the slope ratio. The climatic data are corrected by calculating the cumulative total of the data of the neighboring stations (∑ av) and the cumulative total of the station to be corrected (∑ x), and then applying the following relationship [9]:
$$M_c = M_a \times \frac{c}{a}$$
where: $M_c$: adjusted climatic data. $M_a$: observed climatic data to be corrected.
$c$: the slope of the curve to which the data are adjusted. $a$: the slope of the curve at the observed data.
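A minimal sketch of the correction step follows. It assumes that the break position is already known, that each slope is estimated as the ratio of the station total to the reference total within a segment (the chord slope of the double mass curve), and that the segment before the break is adjusted to the slope after it. The function name and data are hypothetical.

```python
def double_mass_correct(station, reference, break_index):
    """Scale the values before break_index by the slope ratio c / a.

    station     -- values at the station to be corrected
    reference   -- average of the neighbouring stations for the same dates
    break_index -- index of the first value that belongs to the later segment
    """
    if len(station) != len(reference) or not 0 < break_index < len(station):
        raise ValueError("series must match in length and contain the break")
    ref_before = sum(reference[:break_index])
    ref_after = sum(reference[break_index:])
    stn_before = sum(station[:break_index])
    if ref_before == 0 or ref_after == 0 or stn_before == 0:
        raise ValueError("cannot estimate a slope from a zero total")
    a = stn_before / ref_before                   # slope of the observed segment
    c = sum(station[break_index:]) / ref_after    # slope the data are adjusted to
    ratio = c / a
    return [x * ratio for x in station[:break_index]] + list(station[break_index:])


# Hypothetical data: the station reads twice the reference before the break.
corrected = double_mass_correct([20.0, 20.0, 20.0, 10.0, 10.0, 10.0],
                                [10.0] * 6, 3)
```

After the correction the cumulative curve of this example has a single slope; in practice the homogeneity tests would then be repeated on the corrected series.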

Results and discussion
In this study, the homogeneity tests were conducted after estimating the missing values of the monthly climatic data (rainfall, maximum temperature, minimum temperature, maximum humidity, minimum humidity) for selected stations in Nineveh Governorate, northern Iraq, for the period from 1990 to 2020. The monthly time series were examined with four homogeneity tests (VON, Buishand, SNHT, Pettitt). The results for the monthly rain series showed homogeneity at most of the stations: Mosul, Tal Afar, Sinjar, Rabia and Al-Hamdaniya were homogeneous for all months of the year, while the remaining stations were doubtful in one month each, namely February (month 2) at Tal Abta, February at Ba'aj, and November (month 11) at Al-Sheikhan, as shown in Table (2). This agrees with the results of a previous study [8] on the homogeneity of the data at the Mosul and Sinjar stations, with the exception of Tal Afar for the month of May, which is due to the difference in the time series. There is also clear agreement with the results obtained by [3].
As for the monthly temperatures, the maximum temperatures were heterogeneous at most stations. The percentage of heterogeneous months reached 35%, of which 76% were doubtful and 24% were suspect. The heterogeneous months were distributed as 8%, 32%, 28% and 32% over the winter, spring, summer and autumn months, respectively, and the SNHT test was the most sensitive in detecting the breakpoint, as shown in Table (3). For the minimum temperatures, there was a significant change in the time series of most stations, with many series rejected, especially at the Mosul station. The percentage of heterogeneous months was 42%, of which 84% were rejected and 16% were doubtful. The seasonal distribution of the heterogeneous months was 6%, 27%, 37% and 30% for the winter, spring, summer and autumn months, respectively, as displayed in Table (4).
For the maximum monthly humidity, the heterogeneous months were mostly doubtful: 62% of them, which is equivalent to 8 out of 13 heterogeneous months. For the minimum humidity, the results showed that Tal Abta station had useful data and that the number of heterogeneous months was very small, 10 months out of 72, of which 80% were doubtful. The summer months were the most heterogeneous, accounting for 40% of the heterogeneous months, and the Buishand test was the most effective at revealing the breakpoint in the maximum and minimum humidity data, as shown in Tables (5, 6).
Finally, the tests were repeated on the heterogeneous months (doubtful and suspect) after adjusting them with the double mass curve analysis, and the results showed that all of the adjusted months were homogeneous. Figure (2) shows examples of the four homogeneity tests for the time series before and after the correction. For the November rain series at Al-Sheikhan station, the means before the correction were mu1 = 118.5 and mu2 = 43.7 with 1994 as the break year, and the mean after the correction was mu = 53.4. The mean maximum temperature at Sinjar station in June was mu1 = 31.5 and mu2 = 32.5 before the correction, with 2015 as the break year, and 32.17 after the correction. For the minimum temperature at Mosul station in April, the means were mu1 = 10.3 and mu2 = 12.2 with 1999 as the break year, and the mean after the correction was 11.7. For the maximum humidity at the Ba'aj station in July, the break year for the SNHT test was 2014, with means of mu1 = 66.7 and mu2 = 46.93. Finally, the mean minimum humidity at Tal Afar station in May was mu1 = 12.1 and mu2 = 6.7 with 1998 as the break year, and 7.2 after the correction.
This paper holds significant value for meteorological and hydrological researchers. The use of dependable climate data in studies leads to precise and credible scientific outcomes that aid in evaluating climate fluctuations. The Nineveh Governorate is recognized as one of the most impacted areas in Iraq due to the surge in water requirements caused by population expansion, economic progress, and intensive farming practices. Furthermore, the recent high temperatures have also affected the hydrological cycle, making this study even more relevant.

Conclusions
In this paper, homogeneity tests were applied to eight meteorological stations in Nineveh Governorate for the period from 1990 to 2020, based on the climate factors rainfall, temperature and humidity. The four tests SNHT, Pettitt, Buishand and Von Neumann were used at the 5% significance level to detect heterogeneity and breakpoints. The following conclusions were drawn from the results:
1. The tests on the monthly rainfall series showed that the data were homogeneous, especially at the Mosul, Rabia, Tal Afar, Sinjar and Al-Hamdaniya stations, with heterogeneity in only a very limited number of months at the remaining stations.
2. The temperature data were heterogeneous at most of the stations, and many months were classified as heterogeneous (doubtful or suspect), especially at Mosul station. The humidity results were more reliable than the temperature results, as heterogeneity appeared in only a few months, including the summer months, at almost all stations.
3. The VON and SNHT tests were the most sensitive in detecting heterogeneity in the monthly climatic data, although the VON test does not locate the breakpoint.
4. The Pettitt test was less sensitive in identifying heterogeneity than the other tests.
5. The SNHT is more effective than the other tests in detecting breaks at the start or end of a time series, and these results are consistent with the study conducted by [3].
6. The results of the homogeneity test of the monthly rain series are more objective and reliable than those of the seasonal time series, because monthly testing reveals heterogeneity more precisely.
Note: bold refers to inhomogeneity at the 5% significance level.

Fig. 1: Locations of meteorological stations in Nineveh Governorate

Fig. 2: Homogeneity test before and after correction in double mass curve for five climatic elements

Table 1 :
Percentage of missing data in monthly time series for climate factors in Nineveh weather stations.

Table 2 :
The homogeneity test results for the rain series. Note: bold refers to inhomogeneity at the 5% significance level.

Table 3 :
The homogeneity test results for the maximum temperature series. Note: bold refers to inhomogeneity at the 5% significance level.

Table 4 :
The homogeneity test results for the minimum temperature series.

Table 5 :
The homogeneity test results for the maximum humidity series.

Table 6 :
The homogeneity test results for the minimum humidity series.