Analysis of Rainfall Data for a Number of Stations in Northern Iraq

This study analyzes the rainfall time series of nine stations in northern Iraq (Sulaymaniyah, Darbandikhan, Dokan, Duhok, Erbil, Kirkuk, Mosul, Sinjar and Tal-Afar) from 1979 to 2014 using the multiplicative time series model. The purpose is to identify how the time series components (general trend, seasonal, periodic and random) affect rainfall, and to predict monthly rainfall at the selected stations. The rainfall characteristics of northern Iraq were also studied in terms of the daily, monthly and annual distribution of rainfall, and an Intensity-Duration-Frequency (IDF) relationship was derived for return periods of 2, 5, 10, 15, 25, 50 and 100 years. This relationship characterizes rainstorms and supports the solution of design problems in water management and treatment for basins in general, and ungauged basins in particular, such as surface runoff discharge and erosion control.


INTRODUCTION
The importance of meteorological information about rainfall lies in the transformation of rainfall into surface and subsurface flow that feeds surface water and groundwater, while the design and operation of water resources projects depend mainly on predicting the incoming water, which is governed by rainfall. As interest in water resources has grown, so has the need for hydrological and hydrogeological information. Information is now considered the basis of any scientific research that attempts to develop sound solutions to water problems anywhere in the world, and it is therefore the main element of sound decision making at the executive and administrative levels.
Economic and administrative planning both depend on the study and analysis of time series, because studying a phenomenon over a number of years or months helps to understand the changes that have occurred and to predict what will occur in the future. A time series can be defined as a set of observations arranged in order of their occurrence in time, in years, seasons, months, days or any other unit of time. Rainfall is a principal element of the hydrological cycle, so studying rainfall time series yields important information for understanding the climate. Time series analysis is an important tool for obtaining information about the components of the analyzed data, and several studies on the analysis of rainfall time series have been carried out. For example, [1] identified the periodic component and trend of the monthly rainfall time series for the Jorhat region in Northeast India.
[2] considered seasonal and periodic time series models for statistical analysis of rainfall data of Punjab, India.
The rainfall intensity-duration-frequency (IDF) relationship diagram plays an important role in water resources engineering and management. IDF relationship diagrams are widely used in assessing rainfall events, classifying climatic regimes, assisting in the design of urban drainage systems, deriving design storms, etc. [3]. Ombadi et al. applied a new methodology to develop IDF curves in ungauged regions [4]. De Paola et al. presented a study that evaluated the IDF curves and analyzed the rainfall pattern for three cities: Addis Ababa (Ethiopia), Dar es Salaam (Tanzania) and Douala (Cameroon) [5]. Hamaamin generated IDF curves from daily rainfall data to predict rainfall intensity for Sulaimani City; the results showed a good match between the rainfall intensity obtained from the empirical formula and the IDF curves [6]. Jalut built IDF curves for the Kirkuk station using a new disaggregation method to estimate rainfall intensity for various return periods [7]. Hussain proposed modified relations to estimate the design rainfall intensity for Baiji station using IDF curves and found that the rainfall intensity increases as the return period increases [8].
Al-Rafidain Engineering Journal (AREJ) Vol.25, No.2, December 2020, pp. 105-117
Adequate meteorological information is lacking about the characteristics and pattern of the rainfall time series in northern Iraq and about the impact of its four components, which hydrologists need for estimating surface runoff and sediment yield and for hydraulic design. It is therefore necessary to study the characteristics of the rainfall time series and to find the Intensity-Duration-Frequency relationship diagram at different return periods. This study aims to analyze the rainfall situation at nine stations in northern Iraq by: analyzing the general trend of the monthly rainfall data; determining the time series components (seasonal, periodic and random) that affect the monthly rainfall; predicting the monthly rainfall time series using the multiplicative decomposition method; and studying the characteristics of rainfall in terms of its daily, monthly and annual distribution and finding the Intensity-Duration-Frequency (IDF) relationship diagram at return periods of 2, 5, 10, 15, 25, 50 and 100 years. The results are intended to help hydrologists solve design problems related to the management and treatment of water resources for basins in general, and ungauged basins in particular, such as surface runoff discharge and erosion control.
The importance of the study follows from the importance of its subject. Rainfall is an important water resource and a common concern of hydrologists and designers. Quantitative analysis of the monthly rainfall time series helps hydrologists analyze the current and future rainfall situation, which is useful for making appropriate decisions in hydrological estimation and hydraulic design.

STUDY AREA
The study area is located in the northern part of Iraq and is bounded by longitudes 41°-46° E and latitudes 35°-38° N. The region includes five governorates and nine selected stations: Sulaymaniyah, Darbandikhan and Dokan in Sulaymaniyah governorate; Kirkuk in Kirkuk governorate; Erbil in Erbil governorate; Duhok in Duhok governorate; and Mosul, Tal-Afar and Sinjar in Nineveh governorate. The total selected study area is about 79947 km². The area is bounded by a mountain range to the north and east and by lowlands to the west and south inside Iraq. Mean annual precipitation ranges between 350 and 1000 mm. The study area forms the largest part of the catchment areas of the Upper Zab, Lower Zab, Adhaim and Diyala rivers.

TIME SERIES ANALYSIS
Time series analysis is a well-developed method that can be used for predicting the future. Analyzing a time series means breaking the data down to identify its four components and then projecting them forward. The process of identifying the four components is called data decomposition. The four components are the general trend (T), the seasonal changes (S), the periodic changes (C) and the random changes (I) [9,10]. In the multiplicative model the observed value Y is the product of the four components:
Y = T * S * C * I ---- (1)
The multiplicative model was adopted in this study because the mean and standard deviation of the series were not stable over time; when they are stable, the additive model is used instead [10]. The stability of the rainfall time series was tested using the Augmented Dickey-Fuller (ADF) test at a significance level of 0.025.
In light of the aims of the study, the following hypotheses were formulated: the rainfall time series in northern Iraq is affected by (1) the general trend, (2) seasonal changes, (3) periodic changes and (4) random variables. To test these hypotheses, time series analysis was used to obtain a model for estimating changes in monthly rainfall at the selected stations. This is done by isolating the effect of each factor (general trend, seasonal, periodic and random) on the time series. Data analysis and hypothesis testing were carried out as follows.
First hypothesis (rainfall time series are affected by the general trend): to test it, the general trend variable T of equation 1 was analyzed, and the effects of the seasonal (S), periodic (C) and random (I) changes were excluded. The least squares method was used to determine the trend of the time series using SPSS software. The test results were as follows:
1. The R² values of the trend for all selected stations are smaller than 0.009, which means that the general trend explains less than 0.9% of the value of the phenomenon, a very weak contribution.
2. The F ratio ranged between 0.013 and 5.34 for all stations, with a significance level of P > 0.05. This indicates that the mean of the monthly rainfall time series gives a better prediction than the linear regression model of the general trend.
3. The slope of the general trend line ranged from -0.001 to -0.039 for all stations, with a significance level of P > 0.05. This indicates that the general trend has no real impact on the value of the phenomenon.
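The least squares trend test described above can be sketched as follows. This is a minimal illustration in plain Python, not the SPSS procedure used in the study; the function name `linear_trend` and the sample rainfall values are hypothetical, and the significance level P of the F ratio is not computed here.

```python
def linear_trend(y):
    """Fit y = a + b*t by ordinary least squares with t = 1..n.

    Returns the slope b, the intercept a, R^2 and the regression F ratio
    (1 and n-2 degrees of freedom).
    """
    n = len(y)
    t = list(range(1, n + 1))
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    ss_res = sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, y))
    r2 = 1.0 - ss_res / ss_tot
    f_ratio = (ss_tot - ss_res) / (ss_res / (n - 2))
    return slope, intercept, r2, f_ratio


# Hypothetical monthly rainfall depths (mm) for two years, illustration only.
rain = [80, 70, 60, 40, 10, 0, 0, 0, 2, 20, 50, 75,
        85, 65, 62, 35, 12, 0, 0, 0, 1, 25, 45, 78]
slope, intercept, r2, f_ratio = linear_trend(rain)
print(f"slope={slope:.3f} mm/month  R^2={r2:.4f}  F={f_ratio:.3f}")
```

A small R² together with a non-significant F ratio, as reported for the stations, means the fitted line explains almost none of the variance, so the series mean is as good a predictor as the trend line.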
Based on the above results, the first hypothesis is rejected, meaning that rainfall in northern Iraq is not affected by the general trend of the time series.
Second hypothesis (rainfall time series are affected by seasonal variations): these variations are measured by the seasonal index, which is calculated as follows:
A) The centered moving average over 12 months is calculated to obtain accurate seasonal index values.
B) The seasonal ratio for each month is calculated by dividing the historical series value for that month by the centered moving average from step A.
C) The seasonal index for each month of the year is calculated as the average of all seasonal ratios from step B for that month over all years, as shown in Table 1.
D) To express the seasonal index as a percentage, the seasonal index for each month from step C is multiplied by 100%.
A seasonal index of 96% for a given month or season indicates that it lowers the value of the phenomenon by 4%; a seasonal index of 108% indicates that it raises the value by 8%.
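Steps A to D can be sketched in a few lines of Python. The sketch assumes the usual 2x12 centered moving average (half weights on the two end months) and a series that starts in January; the function names and the sample pattern are hypothetical and are not taken from Table 1.

```python
def centered_moving_average(y, period=12):
    """Step A: centered moving average with half weights on the two end points."""
    half = period // 2
    cma = [None] * len(y)
    for i in range(half, len(y) - half):
        window = y[i - half:i + half + 1]  # period + 1 values
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        cma[i] = total / period
    return cma


def seasonal_index(y, period=12):
    """Steps B and C: ratio to the moving average, averaged per calendar month."""
    cma = centered_moving_average(y, period)
    ratios = [[] for _ in range(period)]
    for i, (obs, avg) in enumerate(zip(y, cma)):
        if avg:  # skip undefined (None) or zero averages
            ratios[i % period].append(obs / avg)
    return [sum(r) / len(r) for r in ratios]


# Hypothetical monthly rainfall pattern (mm), repeated for three years.
pattern = [120, 100, 80, 60, 20, 0, 0, 0, 5, 35, 80, 100]
series = pattern * 3
index = seasonal_index(series)
percent = [100.0 * s for s in index]  # Step D
print([round(p) for p in percent])
```

The series must cover at least two full years so that every calendar month receives at least one seasonal ratio.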
..., and then the seasonal index returns to oscillating and increases gradually in months 10, 11 and 12. These changes in the seasonal index values can be explained by the difference in solar radiation from month to month. The fluctuation of the seasonal index over the months of the year reflects the effect of seasonal changes on the monthly rainfall series. Accordingly, the second hypothesis is accepted: rainfall in northern Iraq is affected by seasonal changes. The Walsh and Lawler formula was also applied to calculate seasonality index values. The results for the nine stations are between 0.6 and 0.79, which Walsh and Lawler classify as affected by seasonality.
Third hypothesis (rainfall is affected by periodic changes): this is tested after excluding the trend, seasonal and random variations according to the model in equation 1. The original data are divided by (S*T), which yields the periodic and random changes (C*I). A 3-month moving average then removes the random changes I, leaving only the periodic changes C [13].
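For reference, the Walsh and Lawler seasonality index is commonly written as SI = (1/R) * sum(|x_n - R/12|), where x_n is the mean rainfall of month n and R is the mean annual rainfall. A minimal sketch under that assumption follows; the function name and the monthly values are hypothetical.

```python
def walsh_lawler_si(monthly_means):
    """Seasonality index: sum of absolute deviations of the 12 monthly means
    from a uniform monthly share, divided by the annual total."""
    if len(monthly_means) != 12:
        raise ValueError("expected 12 monthly mean values")
    annual = sum(monthly_means)
    uniform = annual / 12.0
    return sum(abs(x - uniform) for x in monthly_means) / annual


# Hypothetical mean monthly rainfall (mm), January to December.
monthly = [120, 100, 80, 60, 20, 0, 0, 0, 5, 35, 80, 100]
print(round(walsh_lawler_si(monthly), 3))
```

The index is 0 when rainfall is spread evenly over the year and reaches its maximum of 22/12 (about 1.83) when all rain falls in a single month.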
The results showed that the periodic component at the nine stations in northern Iraq ranges from a lowest percentage of 4.7% to a highest of 920%, which indicates variation in this ratio. However, the effect of this variation appears only over short periods not exceeding 4 consecutive years, during which it averages 1.1 to 1.14; over longer periods it becomes 1, which means no effect. Therefore, the third hypothesis is accepted for periods not exceeding 4 consecutive years and rejected for periods of 5 years or more.
No test is available for the fourth hypothesis. It is therefore rejected, meaning that rainfall is considered not to be affected by random variables.
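The extraction of the periodic component described for the third hypothesis (division by S*T followed by a 3-month moving average) can be sketched as below. The function name and the input values are hypothetical; the trend is passed as one value per month, so a constant series mean can be supplied where no trend is present.

```python
def cyclical_component(y, trend, seasonal, period=12):
    """Divide each observation by T*S to get C*I, then smooth C*I with a
    centered 3-month moving average so that only C remains."""
    ci = []
    for i, obs in enumerate(y):
        denom = trend[i] * seasonal[i % period]
        ci.append(obs / denom if denom else None)  # undefined where T*S is zero
    c = [None] * len(y)
    for i in range(1, len(y) - 1):
        window = ci[i - 1:i + 2]
        if None not in window:
            c[i] = sum(window) / 3.0
    return c


# Hypothetical inputs: a constant trend of 50 mm and a two-valued seasonal index.
seasonal = [1.2, 0.8] * 6
trend = [50.0] * 24
observed = [trend[i] * seasonal[i % 12] for i in range(24)]
print(cyclical_component(observed, trend, seasonal)[1:4])
```

When the observations equal T*S exactly, as in this constructed example, the periodic component is 1 throughout, which corresponds to the "no effect" case described above.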

RESULTS AND DISCUSSIONS

Climate Classification
The Koppen classification was used to classify the climate. Koppen defined the dryness boundary as the point at which the rainfall depth is twice the temperature, according to the following equation [14]:
r = 2t
where r is the total annual rainfall (cm) and t is the mean annual temperature (°C). If 2t > r, the station is semi-dry and is given the symbol BSh; if 2t < r, the station is semi-humid and is given the symbol CSa, as shown in Table 2.
To find the value of the phenomenon in any month, the value of A for each station is multiplied by the value of S from Table 1 for that month; the result is the rainfall depth (mm) for that month, as shown in Table 3.
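The classification rule and the monthly prediction can be illustrated with a short sketch. It assumes that A denotes a station average and S the seasonal index of the month, as the text suggests; all numerical values and function names below are hypothetical and are not taken from Tables 1 to 3.

```python
def koppen_class(r_cm, t_c):
    """Apply the rule from the text: 2t > r is semi-dry (BSh), otherwise
    semi-humid (CSa). The boundary case 2t == r is not covered by the text
    and falls into the second branch here."""
    return "BSh" if 2.0 * t_c > r_cm else "CSa"


def monthly_forecast(a, seasonal_index):
    """Predicted monthly depths (mm): station value A times seasonal index S."""
    return [a * s for s in seasonal_index]


# Hypothetical stations: (annual rainfall in cm, mean annual temperature in deg C).
print(koppen_class(35.0, 20.0))   # 2t = 40 > 35
print(koppen_class(70.0, 19.0))   # 2t = 38 < 70
print(monthly_forecast(50.0, [2.0, 0.5, 0.0]))
```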

Correcting the Forecasting Results
The monthly rainfall values for the selected stations in Table 3 can be corrected to account for small fluctuations in the annual rainfall of any given year by multiplying the monthly or annual rainfall in Table 3 by the corresponding wetness index. A five-year forecast of the rainfall time series for all stations, from 2006 to 2010, was carried out using the results in Table 3 with the monthly wetness index and with the annual wetness index. With the monthly index, very accurate results were obtained compared to the observed time series, with R² values between 0.997 and 0.999. With the annual index, the R² values ranged between 0.71 and 0.78 for all stations.
Thus, the monthly wetness index is much more accurate than the annual wetness index for correcting the forecast results of the monthly rainfall data in Table 3. The R² of the corrected data in Table 3 compared to the actual monthly rainfall depths for all stations was between 0.992 and 0.996, which indicates the effectiveness of the modified method (equation 6) used in this research for predicting monthly rainfall.
Geographical Information Systems (GIS) were used to draw digital maps showing contour lines of the distribution of monthly and annual rainfall over the study area, as shown in
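The correction and its evaluation can be sketched as follows. The sketch assumes that the correction multiplies each forecast value by a wetness index and that R² is the squared Pearson correlation between corrected and observed values; the definition of the wetness index itself is not reproduced here, so the index values, like the rainfall values and function names, are hypothetical.

```python
def correct_forecast(forecast, wetness):
    """Multiply each forecast value by its wetness index (assumed correction)."""
    return [f * w for f, w in zip(forecast, wetness)]


def r_squared(predicted, observed):
    """Squared Pearson correlation between predicted and observed values."""
    n = len(observed)
    p_mean = sum(predicted) / n
    o_mean = sum(observed) / n
    spo = sum((p - p_mean) * (o - o_mean) for p, o in zip(predicted, observed))
    spp = sum((p - p_mean) ** 2 for p in predicted)
    soo = sum((o - o_mean) ** 2 for o in observed)
    return spo ** 2 / (spp * soo)


# Hypothetical forecast and observed monthly depths (mm) for four months.
forecast = [100.0, 60.0, 10.0, 40.0]
observed = [118.0, 50.0, 12.0, 35.0]
monthly_wi = [1.2, 0.8, 1.1, 0.9]   # hypothetical monthly wetness indices
annual_wi = [1.05] * 4              # hypothetical single annual index

print(round(r_squared(correct_forecast(forecast, monthly_wi), observed), 4))
print(round(r_squared(correct_forecast(forecast, annual_wi), observed), 4))
```

A single annual index only rescales the whole forecast, so it cannot change the correlation with the observations, whereas month-specific indices can.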

Relationship between Intensity-Duration-Frequency
Directly measured rainfall data are rarely used in design. Instead, statistical summaries of the measured rainfall are used, often expressed as an Intensity-Duration-Frequency relationship diagram [15]. This diagram represents the relationship between the rainfall intensity of a rainstorm and the average storm time (duration) at a given return period. Storm intensity decreases as storm duration increases, and a storm of any particular duration has a greater intensity if its return period is longer. In other words, for a given duration, storms of much higher intensity are less likely to occur (less frequent) than storms of lower intensity. In a number of design problems related to the management and treatment of water in catchments, such as surface runoff and erosion control, it is necessary to know the rainfall intensity for different durations and different return periods.
The maximum precipitation for each hourly duration at each station was calculated for every year of the record from 1979 to 2014, and then estimated for return periods of 2, 5, 10, 15, 25, 50 and 100 years using the HyfranPlus program, as shown in Fig. 3. HyfranPlus fits several statistical distributions (GEV, Gumbel, Weibull, Normal, Lognormal, Gamma, Generalized Gamma, Pearson Type 3 and Log Pearson Type 3). The results of all distributions are compared using the Akaike and Bayesian information criteria to choose the distribution that best represents the data.
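As an illustration of how return-period estimates follow from a fitted distribution, the sketch below fits a Gumbel distribution by the method of moments and evaluates its quantiles. This is not the HyfranPlus procedure and involves no model selection by information criteria; the function name and the annual maxima are hypothetical.

```python
import math
import statistics


def gumbel_quantiles(annual_maxima, return_periods):
    """Method-of-moments Gumbel fit: scale beta = s*sqrt(6)/pi and location
    mu = mean - 0.5772*beta, then x_T = mu - beta*ln(-ln(1 - 1/T))."""
    mean = statistics.mean(annual_maxima)
    std = statistics.stdev(annual_maxima)
    beta = std * math.sqrt(6.0) / math.pi
    mu = mean - 0.5772156649 * beta
    return {T: mu - beta * math.log(-math.log(1.0 - 1.0 / T))
            for T in return_periods}


# Hypothetical annual maximum rainfall depths (mm) for one duration.
annual_maxima = [30, 42, 25, 55, 38, 47, 33, 60, 29, 41]
quantiles = gumbel_quantiles(annual_maxima, [2, 5, 10, 15, 25, 50, 100])
for T, x in quantiles.items():
    print(f"T = {T:>3} years: {x:.1f} mm")
```

Dividing each quantile depth by its duration gives one point of the IDF curve for that return period.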

Generalized IDF Formula
The IDF formulas are empirical equations that relate the maximum rainfall intensity (the dependent variable) to rainfall duration and frequency (the independent variables). Several functions are commonly used to estimate rainfall intensity; in this research the Bernard equation was used [17]:
I = c * T^m / D^e
where I is the rainfall intensity (mm/hr), T is the return period in years, D is the duration in minutes, and c, m and e are regional coefficients. The results of the IDF formula are shown in Table 4.
To assess how well the IDF formula values in Table 4 fit the observed values, the chi-square test was performed. The results were consistent across all stations: the chi-square values were between 0.011 and 1.87, which is smaller than the tabulated chi-square value of 16.9 at a significance level of α = 0.05, and the R² value was 0.98 for all stations.
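A short sketch of evaluating the formula and the chi-square statistic follows. It assumes the common form of the Bernard equation, I = c * T^m / D^e, and the usual statistic sum((O - E)^2 / E); the coefficients and observed intensities are hypothetical and are not the values of Table 4.

```python
def bernard_intensity(T, D, c, m, e):
    """Rainfall intensity from return period T (years) and duration D (minutes)."""
    return c * T ** m / D ** e


def chi_square(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - x) ** 2 / x for o, x in zip(observed, expected))


# Hypothetical regional coefficients, for illustration only.
c, m, e = 150.0, 0.2, 0.7
durations = [30, 60, 120, 360]                      # minutes
expected = [bernard_intensity(10, D, c, m, e) for D in durations]
observed = [22.0, 13.0, 8.5, 4.0]                   # hypothetical intensities (mm/hr)

stat = chi_square(observed, expected)
print([round(x, 2) for x in expected], round(stat, 3))
```

A statistic below the tabulated critical value, as reported for the stations, means the differences between formula and observations are not significant at the chosen level.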

Hydrological application:
The quantity and fate of the seasonal water resources in the seasonal valleys throughout Iraq are largely unknown. Wadi Al-Milh, one of the seasonal valleys in Nineveh, northern Iraq, was therefore selected; its outlet is located at 42.97° E and 36.48° N, as shown in Figure 5. Data on Wadi Al-Milh were collected: the soil type was found to be clay (22.4% sand, 24.6% silt, 53% clay), and the hydrological classification of the soil is D. The land-use map was prepared with the ERDAS IMAGINE V.9.1 program using the supervised classification method, which produced six categories according to the USGS system, as in Figure 6. The dominant land-use class in Wadi Al-Milh was found to be moderate pasture land (range land). The Digital Elevation Model (DEM) of the study area was obtained using the Global Mapper program. Taking the Mosul station as representative of the Wadi Al-Milh area, and using the IDF formula of the Mosul station from Table 4 for a return period of 100 years, the rain depth was found to be 30 mm and the design rainfall intensity 5 mm/hr for this valley at a duration of 6 hours. With this information as input to the WMS program, the design hydrograph of Wadi Al-Milh was obtained as in Figure 7, and the morphological properties as shown in Table 5. The design hydrograph (Figure 7) and morphological properties (Table 5) from WMS were used to estimate the sediment yield of Wadi Al-Milh using the MUSLE method (Modified Universal Soil Loss Equation).
Williams and Berndt [18] developed an equation to estimate the sediment yield, in tons, produced by surface erosion during an individual rainstorm:
Ys = 0.907 * K * CF * PP * Re * LS ---- (11)
where Ys is the sediment yield (tons); K is the soil erodibility factor, a function of soil texture, equal to 0.3 for Wadi Al-Milh; CF is the cover factor, which lies between 0.15 and 0.25 and was taken as 0.2 for Wadi Al-Milh; PP is the erosion control factor, taken as 1 for Wadi Al-Milh (for more details see [19]); and Re is the runoff energy factor, calculated as:
Re = 13 * (Qp * V)^0.56 ---- (12)
where Qp is the peak discharge (m³/sec) and V is the volume of runoff (m³). LS is the length-slope factor, calculated from a relationship in which A is the basin area (m²), Lt is the total length of the main channels of the basin (m), and n is estimated from Table 6. The value of LS for the Wadi Al-Milh basin is 0.23. Thus, the maximum sediment yield for a single rainstorm with a return period of 100 years and a duration of 6 hours is 4763 tons.
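Equations 11 and 12 can be checked with a short sketch. The factor values K, CF, PP and LS and the 4763 tons result are those stated for Wadi Al-Milh; the peak discharge and runoff volume are not given in the text, so the sketch back-calculates the runoff energy factor implied by the reported yield rather than assuming them. The function names are hypothetical.

```python
def runoff_energy(qp, v):
    """Equation 12: Re = 13 * (Qp * V)^0.56, Qp in m3/sec and V in m3."""
    return 13.0 * (qp * v) ** 0.56


def sediment_yield(k, cf, pp, re, ls):
    """Equation 11: Ys = 0.907 * K * CF * PP * Re * LS, in tons."""
    return 0.907 * k * cf * pp * re * ls


# Factor values stated for Wadi Al-Milh.
K, CF, PP, LS = 0.3, 0.2, 1.0, 0.23

# Runoff energy factor implied by the reported yield of 4763 tons.
implied_re = 4763.0 / (0.907 * K * CF * PP * LS)
print(f"implied Re = {implied_re:.0f}")
print(f"Ys = {sediment_yield(K, CF, PP, implied_re, LS):.0f} tons")

# Design rain depth check: 5 mm/hr over 6 hours.
print(f"depth = {5.0 * 6.0:.0f} mm")
```

The product 0.907 * 13 is about 11.8, the coefficient of the single-storm MUSLE equation in metric units.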

Discussion:
Studying any time series requires decomposing it into its components (trend, seasonal, periodic and random) in order to understand the behavior of the phenomenon over time and to predict its future values from past ones.
The hypothesis tests showed that rainfall in northern Iraq is mainly affected by the seasonal variable shown in Table 1, that the effect of the periodic variable is very small, and that the general trend and the random variable have no effect. This led to the adoption of equation 6 for prediction, which uses the average instead of the general trend.
Correcting the prediction results in Table 3 with the monthly and annual wetness indices showed, from a comparison of R² values, that the monthly wetness index is more accurate than the annual wetness index. The R² value was calculated between the corrected prediction results and the observed time series for all selected stations in northern Iraq for the period 2006 to 2010.
The Intensity-Duration-Frequency curves of the selected stations were plotted.
To validate the tests of the four hypotheses about the time series components (general trend, seasonal, periodic and random changes) at the selected stations in northern Iraq, that is, to confirm that the components do not change over time, the time series of each station was divided into sub-periods of 5 and 10 years. An ANOVA test was carried out in SPSS to compare the means of the 5-year and 10-year sub-periods with the mean of the whole series and with one another. SPSS reports the significance level of the F ratio for the between-groups effect; a non-significant F ratio indicates that the group means do not differ. SPSS also applies the Levene test, which tests the hypothesis that the variances of all groups are equal. If the homogeneity hypothesis is rejected, SPSS provides an alternative to the F ratio, the Welch F ratio. The ANOVA test showed that the significance level P of the F ratio, and of the Welch F ratio where homogeneity was rejected, was P > 0.05 for all 5-year and 10-year sub-periods compared with the whole series and with one another. This indicates that there is no significant change between the 5-year and 10-year sub-periods and the whole series, i.e. there is no significant change in the time series components of the selected stations in northern Iraq over time.
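The between-groups F ratio used in this comparison can be sketched as follows. This is the classical one-way ANOVA statistic only; it does not reproduce the Levene test or the Welch F ratio from SPSS, and the significance level P would additionally require the F distribution. The function name and the sub-period values are hypothetical.

```python
def anova_f(groups):
    """One-way ANOVA F ratio: between-group mean square over within-group
    mean square, with k-1 and n-k degrees of freedom."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (mu - grand_mean) ** 2
                     for g, mu in zip(groups, means))
    ss_within = sum((x - mu) ** 2 for g, mu in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical annual rainfall totals (mm) split into three 5-year sub-periods.
periods = [
    [410, 380, 450, 395, 430],
    [420, 370, 440, 405, 415],
    [400, 390, 460, 385, 425],
]
print(round(anova_f(periods), 4))
```

An F ratio close to zero, as in this constructed example, means the sub-period means are nearly identical relative to the scatter within each sub-period.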

CONCLUSIONS
The conclusions of this research are as follows:
1. The general trend of the rainfall time series has no real impact on the value of the phenomenon, which means that rainfall at the selected stations in northern Iraq is not affected by the general trend of the time series.
2. There are significant changes in the values of the seasonal index. The fluctuation (rising and falling) of the seasonal index over the months of the year indicates the effect of seasonal changes on the monthly rainfall series, i.e. rainfall in northern Iraq is affected by seasonal variations.
3. Rainfall in northern Iraq is affected by periodic changes, but only for periods not exceeding 4 consecutive years. Random variables have no effect on the rainfall time series.
4. The stations of Mosul, Tal-Afar, Sinjar and Kirkuk are considered semi-dry according to the Koppen climate classification, while the Erbil, Duhok, Darbandikhan, Sulaymaniyah and Dokan stations are semi-humid (mild winters and hot, dry summers). This classification was compatible with the Thornthwaite classification.
5. The monthly wetness index is much more accurate than the annual wetness index for correcting the forecast results of the monthly rainfall data.
6. The R² values between the predicted monthly rainfall after correction (using equation 6) and the actual monthly rainfall for all selected stations were between 0.992 and 0.996, which indicates the effectiveness and accuracy of the modified method (equation 6) used in this research for predicting monthly rainfall.
7. The maximum sediment yield of Wadi Al-Milh (hydrological soil classification D, range land use, curve number 84) for a single rainstorm with a return period of 100 years and a duration of 6 hours is 4763 tons.