Spatial-temporal analysis and projection of extreme particulate matter (PM10 and PM2.5) levels using association rules: A case study of the Jing-Jin-Ji region, China

The Jing-Jin-Ji region of Northern China has experienced serious extreme PM concentrations, which could exert considerable negative impacts on human health. However, only a few studies have focused on extreme PM concentrations. Therefore, joint regional PM research and air pollution control has become an urgent issue in this region. To characterize PM pollution, PM10 and PM2.5 hourly samples were collected from 13 cities in the Jing-Jin-Ji region for one year. This study made an initial attempt to analyze extreme PM data using the Apriori algorithm to mine quantitative association rules in PM spatial and temporal variations and intercity influences. The results indicate that 1) the association rules of intercity PM are distinctive and do not completely rely on their spatial distributions; 2) extreme PM concentrations frequently occur in southern cities, which present stronger spatial and temporal associations than northern cities; 3) the spatial and temporal associations of intercity PM2.5 are stronger than those of intercity PM10. © 2015 Elsevier Ltd. All rights reserved.


Introduction
Air pollution issues have received increasing attention in many developing countries due to frequent haze events, which have become a severe threat to people, especially in China (Hu et al., 2013). Historically, serious disasters related to air pollution have been recorded, such as the 1930 Meuse Valley fog (63 people killed), the 1948 Donora smog (20 people killed and 7,000 more ill) and the 1952 London smog (more than 4000 people killed and more than 100,000 ill). These heavy smog events were mainly attributed to accumulated industrial air pollutant emissions and adverse climatic conditions during those periods. As urbanization, industrial development, vehicle usage and other environmentally hazardous processes increase, more fossil fuels are burned, resulting in increased carbon dioxide, nitrogen dioxide, sulfur dioxide, ozone and particulate matter (PM) emissions. Among those emissions, PM (a mixture of solid particles and liquid droplets) is one of the most harmful air pollutants to human health (Hu et al., 2013; Qin et al., 2014; Elbayoumi et al., 2014), especially under extreme PM conditions. The World Health Organization (WHO) reported that a short-term PM increase of 10 μg/m³ can be associated with increased coughing, lower respiratory symptoms and hospital admissions involving respiratory problems (WHO, 2000). Exposure to fine particulate matter (PM2.5), which can penetrate deep into the lungs, is strongly correlated with increased rates of mortality, morbidity, and respiratory and cardiovascular problems, particularly among children (Elbayoumi et al., 2014; Pope et al., 2009).
To decrease emissions, a detailed PM analysis is necessary for researchers and governmental departments, who seek to further understand the related health effects and formulate effective air pollutant emission control measures to target PM reductions (Yu et al., 2013). In recent years, numerous models have been developed to identify the interactions among PM and its emission sources (Chen et al., 2010; Guo, 2011; Winijkul et al., 2015; Ancelet et al., 2014; Hasheminassab et al., 2014; Villalobos et al., 2015; Huertas et al., 2014). Moreover, a dynamic factor analysis (DFA) was developed by extracting the major variables and the associated temporal variations from multivariate time series data (Yu et al., 2013). Kuo et al. (2011) applied the DFA to analyze PM2.5 data and spatially identify PM2.5 temporal variations in southern Taiwan. Additionally, the nonlinear properties of PM data have been investigated due to the nonlinearities that are inherent in atmospheric systems (Tsonis et al., 1994; Yu et al., 2013). Hu et al. (2013) proposed that examining the spatial and temporal distributions of PM concentrations was useful for designing effective strategies to reduce air pollution, for the following reasons: spatial analyses can help to identify spatial PM distribution patterns, and temporal analyses of PM time series data are helpful for detecting variation periods and trends. The majority of the pre-existing research focused on original PM data, while few studies analyzed extreme PM levels, which pose the greatest threat to human health.
To take effective measures to control PM levels, it is necessary to understand the extreme spatial and temporal characteristics of PM. Thus, this study focuses on air pollutants collected from Feb. 2014 to Feb. 2015 in the Jing-Jin-Ji region (encompassing 13 cities), which has always been the most heavily air-polluted region in China. Because the frequency and severity of the extreme PM values have increased in those cities, joint regional PM research and air pollution control in the Jing-Jin-Ji region has become an urgent issue. The objective of this study is to conduct a comprehensive spatial-temporal analysis and projection of extreme PM in the Jing-Jin-Ji region, as an initial application of an Apriori-based association mining technique.
The main contributions of this paper are: 1) an initial examination of the spatial and temporal association rules of intercity PM over Jing-Jin-Ji using an Apriori algorithm; 2) a quantitative estimate of the extreme PM frequency and maximum projection units; 3) a pioneering attempt to mine the cross spatial-temporal associations of PMs (PM2.5 and PM10) in the Jing-Jin-Ji region.
This paper is organized as follows: Section 2 introduces the Apriori algorithm based association rules. The study region and data analysis are presented in Section 3. A case study and discussion are presented in Section 4. Finally, we present our conclusions and discussion in Section 5.

Apriori algorithm-based association rules
Data mining techniques are typically applied to extract information from large databases and present the data in an easily understandable format (Witten and Frank, 2005). As specific data mining tools, Apriori algorithm-based association rules are adopted to examine the interesting relationships between two or several items that occur synchronously, in the context of information management, decision making, process control and other applications (Guo et al., 2014;Cheng and Li, 2015).
In association rule mining, the database can be stored in the form of transactions, D, and each transaction, T, is composed of nonempty sub-items of the item set I (Soysal, 2015). An association rule is an expression of the form X ⇒ Y, where X ⊂ I, Y ⊂ I, X ∩ Y = ∅, and X and Y are the antecedent and consequent of the rule, respectively (Guzzi et al., 2014). Two metrics, support and confidence, are widely used to measure the strength of an association rule. The support(X ⇒ Y) represents the rate of X ∪ Y contained in D and is given by Eq. (1) (Guo et al., 2014):

$$\mathrm{support}(X \Rightarrow Y) = \frac{|\{T \in D : X \cup Y \subseteq T\}|}{|D|} \tag{1}$$

The support measure evaluates the statistical importance of the rule within D. The higher the value, the more important the rule. Confidence represents the rate of Y among the transactions in D that contain X and is given by Eq. (2) (Guo et al., 2014):

$$\mathrm{confidence}(X \Rightarrow Y) = \frac{\mathrm{support}(X \Rightarrow Y)}{\mathrm{support}(X)} \tag{2}$$

where support(X) denotes the rate of X contained in D. The confidence measure evaluates the level of confidence of the association rule.
In addition to these two measures, the lift measure is also applied to identify patterns and trends. The lift(X ⇒ Y) is given by Eq. (3) (Soysal, 2015):

$$\mathrm{lift}(X \Rightarrow Y) = \frac{\mathrm{confidence}(X \Rightarrow Y)}{\mathrm{support}(Y)} \tag{3}$$

The lift measures the dependence of Y on X, which can be interpreted as the degree to which confidence is lifted from a global existence (support(Y)) to a local association (confidence(X ⇒ Y)) (Soysal, 2015). More detailed information on this method can be found in Cheng and Li (2015) and Soysal (2015).
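As a minimal sketch of Eqs. (1)-(3), the three measures can be computed directly from a list of transactions. The function names and the toy transactions below are illustrative assumptions, not the authors' implementation or data; each toy transaction is imagined as the set of cities flagged as extreme in one hour.

```python
from typing import FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], items: Iterable[str]) -> float:
    """Share of transactions that contain every item in `items`."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    wanted = frozenset(items)
    hits = sum(1 for t in transactions if wanted <= t)
    return hits / len(transactions)


def confidence(transactions: List[FrozenSet[str]],
               x: Iterable[str], y: Iterable[str]) -> float:
    """support(X and Y together) / support(X); 0.0 if X never occurs."""
    x, y = frozenset(x), frozenset(y)
    sx = support(transactions, x)
    if sx == 0:
        return 0.0
    return support(transactions, x | y) / sx


def lift(transactions: List[FrozenSet[str]],
         x: Iterable[str], y: Iterable[str]) -> float:
    """confidence(X => Y) / support(Y); 0.0 if Y never occurs."""
    sy = support(transactions, y)
    if sy == 0:
        return 0.0
    return confidence(transactions, x, y) / sy


# Hypothetical toy data: five hourly transactions.
toy = [
    frozenset({"Xingtai", "Handan"}),
    frozenset({"Xingtai", "Handan", "Baoding"}),
    frozenset({"Xingtai"}),
    frozenset({"Baoding"}),
    frozenset({"Beijing"}),
]

rule_support = support(toy, {"Xingtai", "Handan"})      # 2 of 5 transactions
rule_confidence = confidence(toy, {"Xingtai"}, {"Handan"})
rule_lift = lift(toy, {"Xingtai"}, {"Handan"})
```

A lift above 1 means the consequent occurs more often alongside the antecedent than it does overall, which is the sense in which confidence is "lifted" relative to support(Y).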
The association rules are expected to discover the presence of pair conjunctions, which appear in a data set. In time series analyses, intra-transactional association rules are adopted to reveal correlations between multiple time series, but rarely to project time series trends (Qin and Shi, 2006). However, association rules do not typically consider temporal relations. To reveal comprehensive and robust relations within extreme PM data from Jing-Jin-Ji, we performed spatial-temporal association rule mining in this study. Fig. 1 presents a simple process for the proposed rule method.
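One plausible reading of the process in Fig. 1 is that antecedent items are taken at hour t and the consequent item at hour t + k, so that k = 0 reduces to a purely spatial analysis. The sketch below builds such transactions from per-city extreme flags. The function name, the "t" / "t+k" item labels and the toy flags are assumptions for illustration, and the authors' exact construction may differ.

```python
from typing import Dict, FrozenSet, List, Tuple

Item = Tuple[str, str]  # (city, "t") for antecedents, (city, "t+k") for the target


def lagged_transactions(extreme: Dict[str, List[bool]],
                        target: str, k: int) -> List[FrozenSet[Item]]:
    """Pair every other city's flag at hour t with the target's flag at hour t + k."""
    if k < 0:
        raise ValueError("k must be non-negative")
    n = len(extreme[target])
    transactions = []
    for t in range(n - k):
        # Antecedent candidates: other cities that are extreme at hour t.
        items = {(city, "t") for city, flags in extreme.items()
                 if city != target and flags[t]}
        # Consequent candidate: the target city being extreme k hours later.
        if extreme[target][t + k]:
            items.add((target, "t+k"))
        transactions.append(frozenset(items))
    return transactions


# Hypothetical flags for two cities over four hours.
toy_flags = {
    "A": [True, True, False, False],
    "B": [False, True, True, False],
}
spatial = lagged_transactions(toy_flags, target="B", k=0)
one_hour_ahead = lagged_transactions(toy_flags, target="B", k=1)
```

With k = 1 the series is shortened by one hour, because the last hour has no later observation of the target city to pair with.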

Study region and data analysis
Jing-Jin-Ji is the national capital region of China, and also the largest urbanized region in Northern China. Economic hinterlands exist around Beijing, Tianjin and the Hebei province, along the coast of the Bohai Sea (Jing-Jin-Ji, 2014). This metropolitan region includes 13 major cities: Baoding, Beijing, Cangzhou, Chengde, Handan, Langfang, Hengshui, Qinhuangdao, Shijiazhuang, Tangshan, Tianjin, Xingtai and Zhangjiakou. The specific location of the Jing-Jin-Ji region, as well as the locations of the major cities, is shown in Fig. 2(I).
This emerging region has experienced a tremendous increase in urbanization in recent decades, accompanied by an increase in primary air pollutant emissions and degraded air quality (Feng et al., 2015; Air Pollution in China, 2015). To address the heavy air pollution in Jing-Jin-Ji, China's new air pollution action plan noted that in the Jing-Jin-Ji region, the annual average PM2.5 concentrations should decrease by 26%, 19% and 15%, respectively, in 2017 (CAAC, 2014). In addition, the Beijing, Tianjin and Hebei (also named Ji) governments formulated and circulated their own implementation plans: the Beijing clean air action plan, the Tianjin 2012-2020 air pollution control measures plan and the Hebei implementation scheme and action plan for air pollution prevention and control (Yang et al., 2014).
In our study, the hourly concentrations of air pollutants (PM10, PM2.5, NO2, SO2, O3 and CO) were collected in the Jing-Jin-Ji region from Feb. 2014 to Feb. 2015, covering 13 cities and 389 days. Because of power outages, faulty transmission networks and other reasons, some data were missing and had to be addressed prior to analysis and modeling. Missing values were filled using the cubic spline Hermite interpolation (CSHI) technique (Qin et al., 2014).
A brief statistical summary of PM10 and PM2.5 is given in Fig. 2(II-V). Fig. 2(II) presents the overall frequency distributions of PM10 and PM2.5 in the Jing-Jin-Ji region over the study period. The highest frequency of PM10 occurred in the interval from 50 to 150 μg/m³, with a value of 0.2324, which includes 28,206 data points (the sample size, N, was 121,368). For PM2.5, the highest frequency occurred between 20 and 60 μg/m³, with a value of 0.3328, which includes 40,391 data points. This distribution also shows the relatively extreme PM (PM10 and PM2.5) levels, which are much larger than the 75 μg/m³ and 150 μg/m³ air quality standards for PM2.5 and PM10, respectively. According to the air quality index (AQI) of China, air pollution can be grouped into six levels: excellent, good, lightly polluted, moderately polluted, heavily polluted and severely polluted. Typically, classifications of moderately polluted or worse can cause health irritation, meaning that individuals with breathing or heart problems should reduce or avoid outdoor exercise.
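The sample size and the two reported frequencies can be cross-checked with a short calculation, assuming every city contributes one value per hour over the 389 days:

```latex
\begin{aligned}
N &= 13 \times 389 \times 24 = 121{,}368, \\
\frac{28{,}206}{121{,}368} &\approx 0.2324, \qquad
\frac{40{,}391}{121{,}368} \approx 0.3328 .
\end{aligned}
```

The per-city series length implied by this check is 389 × 24 = 9336 hourly values.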
The corresponding concentration limits for PM10 and PM2.5 are 250 and 115 μg/m³, respectively. Herein, PM data are defined as extreme PM if they are larger than or equal to these concentration limits. Fig. 2(III) provides the proportion of extreme PM in the 13 cities of the Jing-Jin-Ji region. The pie charts illustrate that Xingtai suffered from the most severe PM pollution, with the highest extreme PM10 and PM2.5 frequencies of 33.73% and 43.94%, respectively. Xingtai is followed by another 5 cities that experienced considerably high extreme PM concentrations, namely Baoding, Shijiazhuang, Hengshui, Handan and Tangshan, with extreme PM10/PM2.5 frequencies of 31.15/41.21%, 25.24/37.57%, 21.67/33.39%, 21.32/36.30% and 16.73/31.59%, respectively. Conversely, the least PM pollution occurred in Zhangjiakou, accounting for only 2.98% of extreme PM10 and 3.10% of extreme PM2.5, which are better than the values from Chengde, Qinhuangdao and Beijing. In the majority of the cities, the PM10 and PM2.5 concentrations followed similar trends. Taking Beijing, Tianjin and Shijiazhuang as examples, the relationship between PM10 and PM2.5 (see Fig. 2(IV)) is strongly linear (r > 0.88). Additionally, Fig. 2(V) illustrates the PM boxplots, which display the spatial and temporal PM data variations. Specifically, Fig. 2(V-1) and Fig. 2(V-2) display the overall PM10 and PM2.5 variations for each month, respectively. In January, April and May, the PM concentrations are relatively small, exhibiting stable variations in the Jing-Jin-Ji region. Comparatively, the most severe PM problems occurred in September, October and November, exhibiting high concentrations and variations. From April to November, PM values showed a gradually rising trend, which means that air quality worsened. Following that period, PM values dropped and remained stable over the next three months. Over the study period (Feb. 2014 to Feb. 2015), each city shared similar variations in PM trends (see Fig. 2(V-3) and Fig. 2(V-4)). Outliers are marked in red. All study cities experienced extreme PM concentrations, which can also be seen in Fig. 2(III).

Fig. 1. Spatial-temporal association rule mining. I: process of identifying X and Y, where n = 9336 (the length of the PM time series), p = 13 (the number of study cities) and k is the projection step. When k = 0, only spatial analysis is investigated; II: pseudo-code of the Apriori algorithm for rule mining.
Because the frequency and severity of extreme PM values have worsened in those cities, joint regional PM research and air pollution control has become an urgent issue in the Jing-Jin-Ji region.
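The extreme-PM definition used in this section (values at or above 250 μg/m³ for PM10 and 115 μg/m³ for PM2.5) can be sketched as a simple thresholding step. The function names and the toy hourly values are hypothetical; only the two limits come from the text.

```python
from typing import Dict, List

# Concentration limits stated in the text, in micrograms per cubic metre.
EXTREME_LIMITS: Dict[str, float] = {"PM10": 250.0, "PM2.5": 115.0}


def flag_extreme(series: List[float], pollutant: str) -> List[bool]:
    """Mark each hourly value that is at or above the pollutant's limit."""
    limit = EXTREME_LIMITS[pollutant]
    return [value >= limit for value in series]


def extreme_share(series: List[float], pollutant: str) -> float:
    """Fraction of hourly values flagged as extreme."""
    flags = flag_extreme(series, pollutant)
    if not flags:
        raise ValueError("series must not be empty")
    return sum(flags) / len(flags)


# Hypothetical hourly PM2.5 readings for one city.
toy_pm25 = [40.0, 115.0, 90.0, 230.0]
toy_flags = flag_extreme(toy_pm25, "PM2.5")
toy_share = extreme_share(toy_pm25, "PM2.5")
```

Applied to a full city series, `extreme_share` would correspond to the per-city proportions summarized in Fig. 2(III), and the Boolean flags are the kind of binary items that association rule mining operates on.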

Results and analysis
In this section, we present the major results of the association rules from the 13 cities in the Jing-Jin-Ji region. Spatial associations can be found in Section 4.1, and spatial-temporal rules are described and discussed in Section 4.2. We set the minimum support = 0.1 and minimum confidence = 0.7 in this paper.

Spatial associations among 13 cities in Jing-Jin-Ji region
Table 1 shows the PM2.5 spatial associations among the 13 cities in the Jing-Jin-Ji region; that is, the association cities (AC, the set of cities in this region that can influence a given city) were mined and listed in this table. We use AC_{Beijing} to represent Beijing's AC. It is worth pointing out that, for Baoding and Xingtai, not all bordering cities belong to their AC, as shown in Fig. 3: Beijing (which borders Baoding) and Handan (which borders Xingtai) are not association cities of Baoding and Xingtai, respectively. In addition, the PM2.5 rules of the 13 cities in the Jing-Jin-Ji region are not totally dependent on their spatial positions. For example, Cangzhou, which is near Tianjin, Langfang, Baoding and Hengshui, does not belong to any city's AC. This situation also occurs in Beijing and Handan. To clearly demonstrate the frequency with which city A_i belongs to other cities' AC, we propose the following definition.
Definition 4.1.1 If support(city A_i ⇒ city A_j) > ε and confidence(city A_i ⇒ city A_j) > η, then city A_i belongs to AC_{A_j}. Thus, the frequency that city A_i belongs to other cities' AC (FeqAC) is defined by Eq. (4):

$$\mathrm{FeqAC}(\text{city } A_i) = \sum_{j=1,\, j \neq i}^{N} \mathbf{1}_{AC_{A_j}}(\text{city } A_i) \tag{4}$$

where 1_{AC_{A_j}}(x) is an indicator function (the indicator function for AC, 1_{AC}(x), equals 1 if x ∈ AC and 0 if x ∉ AC), ε and η represent the minimum support and minimum confidence, and N is the number of cities. It is clear that cities whose FeqAC values are 0 have lower confidence and lift than other cities. This is particularly true of Beijing and Cangzhou, whose confidence values are below 0.8. Hengshui also did not obtain significant confidence and lift values because FeqAC(Hengshui) is less than that of Baoding, Langfang, Shijiazhuang, Tangshan, Tianjin and Xingtai, as illustrated in Fig. 4 (the vertical coordinate of Fig. 4 only represents values of FeqAC, confidence and lift). Specifically, the confidence values of these four cities, Beijing, Cangzhou, Handan and Hengshui, are less than 0.9, and their lift values are below 5, which indicates that those cities are significantly different from Baoding, Langfang, Shijiazhuang, Tangshan, Tianjin and Xingtai (when setting confidence = 0.9 and lift = 5, the cities are divided into two parts).
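A minimal sketch of the FeqAC count follows, assuming FeqAC is the unnormalized number of other cities whose AC contains the given city (one reading of Eq. (4); a version normalized by N would differ only by a constant factor). The function name and the toy AC sets, which use placeholder city labels rather than the results in Table 1, are hypothetical.

```python
from typing import Dict, Set


def feq_ac(city: str, ac: Dict[str, Set[str]]) -> int:
    """Count the other cities whose association-city set contains `city`."""
    return sum(1 for other, members in ac.items()
               if other != city and city in members)


# Hypothetical AC sets keyed by target city (placeholder labels).
toy_ac: Dict[str, Set[str]] = {
    "A": {"B", "C"},
    "B": {"C"},
    "C": set(),
    "D": {"B", "C"},
}
counts = {city: feq_ac(city, toy_ac) for city in toy_ac}
```

In this toy example, cities A and D never appear in another city's AC, so their count is 0, which is the situation the text describes for Beijing and Cangzhou.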
In terms of PM10, spatial associations among the 13 cities of the Jing-Jin-Ji region are listed in Table 2. Compared with Table 1, the PM10 rules are weaker than those of PM2.5. From the perspective of geographical position, Fig. 5 provides evidence that the rules among the 13 cities are not totally dependent on city position because Xingtai and Handan do not border Baoding and Shijiazhuang, respectively, but influence the PM10 concentrations there. This is displayed via the white rectangles in Fig. 5, which contain arrows from Xingtai to Baoding and from Handan to Shijiazhuang.
Based on these figures and tables, it is apparent that PM2.5 and PM10 exhibit many differences. When using PM2.5 to search for relationships among cities in the Jing-Jin-Ji region, we found that there are 3 independent cities, meaning that no city is associated with them and they do not belong to any city's AC. However, when mining the association rules of PM10, there are 8 cities that do not have any relations with other cities, namely Zhangjiakou, Chengde, Qinhuangdao, Beijing, Tianjin, Tangshan, Cangzhou and Hengshui. Among them, Zhangjiakou, Chengde and Qinhuangdao do not have any relationships with other cities regarding rules of PM2.5 or PM10. Considering variations between PM2.5 and PM10, there are two major distinctions: 1) the FeqAC values of PM2.5 and PM10 are significantly different; and 2) the number of elements in each city's AC decreases from PM2.5 to PM10. Both distinctions might result from the complex source materials and physical characteristics of PM2.5 and PM10. These distinctions also imply that extreme PM10 occurrences are more independent than extreme PM2.5 occurrences, and that the intercity PM2.5 relationship among the 13 cities is stronger than that of PM10.

Spatial-temporal association rules
After analyzing the spatial relationships among cities in the Jing-Jin-Ji region, spatial-temporal association rules are further mined. These rules can reveal the possible PM movement paths, which indicate how and when city A's PM influences city B's PM. In this section, AC_{Beijing,1} is the set of Beijing's association cities with a projection units value of 1.

Spatial-temporal associations of PM2.5
First, the spatial-temporal PM2.5 association rules among these cities are discussed from the perspective of max projection units. The AC and the support, confidence and lift of the rules are shown in Table 3.
In Table 3, Xingtai exhibits the highest value of max projection units, equaling 24. This indicates that extreme PM2.5 events in this city can be deduced from its AC's condition within 24 h with a confidence of 0.7593. There are three cities whose max projection units value is 0, which indicates that they do not have any temporal relationships with other cities. From a geographic perspective, those three cities are located in the northern portion of the region, reflecting that extreme PM2.5 events typically occur in the southern portion of the region. Given that the max projection units of Baoding, Handan, Shijiazhuang and Xingtai are over 10, and that both Beijing and Tianjin are core cities, those 6 cities are used to demonstrate the specific association rules in this region.
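The text does not spell out how max projection units is computed. One plausible reading, used in the sketch below, is the largest projection step k at which at least one mined rule for the target city still meets the minimum support (0.1) and minimum confidence (0.7). The function name, the input layout and the toy numbers are assumptions for illustration only, as is treating the two thresholds as inclusive.

```python
from typing import Dict, List, Optional, Tuple

Rule = Tuple[float, float]  # (support, confidence) of one mined rule


def max_projection_units(rules_by_k: Dict[int, List[Rule]],
                         min_support: float = 0.1,
                         min_confidence: float = 0.7) -> Optional[int]:
    """Largest k with at least one rule passing both thresholds, else None."""
    qualifying = [
        k for k, rules in rules_by_k.items()
        if any(s >= min_support and c >= min_confidence for s, c in rules)
    ]
    return max(qualifying) if qualifying else None


# Hypothetical rule statistics for one target city at k = 0..3.
toy_rules = {
    0: [(0.20, 0.90)],
    1: [(0.15, 0.80), (0.08, 0.95)],
    2: [(0.12, 0.65)],   # confidence too low
    3: [(0.05, 0.90)],   # support too low
}
toy_max_k = max_projection_units(toy_rules)
```

Under this reading, a result of 0 means that only the spatial (k = 0) rules qualify, which matches how the text interprets cities whose max projection units value is 0.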
In Table 4, from the perspective of AC_{Beijing,i} (i = 0, 1, …, 5), there are two groups, {Baoding, Langfang, Shijiazhuang, Tangshan, Tianjin, Xingtai} and {Baoding, Handan, Hengshui, Langfang, Shijiazhuang, Xingtai}. The former's projection units values range from 0 to 2, while the latter's range from 3 to 5. From Table 4, it is apparent that Tianjin and Tangshan do not influence Beijing's PM2.5 when the projection units value is greater than 2, while Handan and Hengshui affect Beijing once the projection units value reaches 3. This reflects that Beijing is not influenced by high PM2.5 concentrations that occurred three or more hours earlier in the cities located to the east of Beijing. However, extreme PM2.5 in the cities to the south of Beijing influences Beijing's PM2.5 significantly, implying that high PM2.5 occurrences in those southern cities are a key predictor of Beijing's extreme PM2.5, especially within 3-5 h. Tianjin's spatial-temporal rules are listed in Table 5. The projection units value of Tianjin is larger than that of Beijing, which suggests that Tianjin's high PM2.5 events can be deduced further in advance than Beijing's. In this table, AC_{Tianjin,i} (i = 0, 1, …, 7) is divided into three groups, when i is equal to 2, 5 and 7, respectively. From the geographic positions and projection units of these three groups, a similar conclusion can be drawn: cities located to the south of Tianjin will influence Tianjin's pollution situation in 6-7 h if high PM2.5 events occur currently.
Detailed information for the other cities is shown in Fig. 6, which illustrates the mining results. In this figure, the horizontal and vertical axes of each subgraph represent the projection hours and city_NUM, respectively; the city_NUM key is shown at the bottom of the figure. Each point in a subgraph represents an association city. This means that the AC for each projection hours value, K, can be obtained by transforming the city_NUMs of the points into city names using the information at the bottom of the figure. Specifically, [1 5 7 10 11 12] means {Baoding, Handan, Langfang, Tangshan, Tianjin, Xingtai}. Fig. 6(a) illustrates Shijiazhuang's rules. It is clear that AC_{Shijiazhuang,i} (i = 0, 1, …, 10) consists of four groups, which are represented by solid blue, red and green circles and hollow blue circles. The first change happens when the projection units value is 6: Hengshui is excluded from AC_{Shijiazhuang,6}, while Handan is included. Then, when the projection units values range from 7 to 9, Tangshan and Tianjin do not influence Shijiazhuang's future PM2.5. For Handan, AC_{Handan,i} (i = 0, 1, …, 10) has only one type, which is {Baoding, Langfang, Shijiazhuang, Tangshan, Tianjin, Xingtai} (see Fig. 6(b)). In addition, the highest lift among those 10 spatial-temporal rules indicates that the lift values do not monotonically decrease as the projection units values increase. Fig. 6(c) illustrates three main AC_{Baoding,i} (i = 0, 1, …, 22) groups and four single groups, which are represented by hollow blue circles. When the projection hours values range from 5 to 8, Beijing's PM2.5 concentration shows an influence on Baoding's. As the number of elements in AC_{Baoding,i} (i = 5, 6, 7, 8) decreases, the ratio of southern cities in AC_{Baoding,i} (i = 5, 6, 7, 8) increases. Overall, when the projection units value is greater than 14, Tangshan and Tianjin, the northern cities in the Jing-Jin-Ji region, do not affect Baoding. However, Handan and Hengshui, both southern cities, begin to influence Baoding's PM2.5. This means that southern cities have an increasing effect on Baoding as the projection units increase. Fig. 6(d) illustrates the spatial-temporal rules in Xingtai, which has the highest max projection units among the 13 cities in the region. It is different from other cities because there are five main groups in Xingtai, which are represented by blue, red and green circles and red and blue rectangles, whose number of elements is 2. There are two major changes in Fig. 6(d). Both changes suggest that the number of elements in AC_{Xingtai,i} decreases, while the ratio of southern cities in AC_{Xingtai,i} increases. Thus, the main spatial-temporal PM2.5 rules have been demonstrated, and the next step is to illustrate the PM10 rules.

Table 3. Overall spatial-temporal PM2.5 associations among 13 cities in the Jing-Jin-Ji region.

Spatial-temporal associations of PM10
In Table 6, Hengshui exhibits the highest max projection units, equaling 6, which is less than the highest max projection units in Table 3. Max projection units values greater than 0 occur in only four cities, which suggests that the intercity PM10 relationship among the 13 cities in this area is weaker than that of PM2.5. From this table, Xingtai will experience a similar situation within 5 h if Baoding and Langfang suffer extreme PM10. For the other two major cities, Beijing and Tianjin, the results indicate that their PM10 pollution does not correlate with other cities in the Jing-Jin-Ji region.
For Baoding, there are two AC groups, which are {Langfang, Xingtai} when the projection units values range from 0 to 2, and {Hengshui, Xingtai} when the projection units value is 3. This indicates that southern cities have a greater effect on Baoding as projection units increase because Hengshui is located further south than Langfang is. A similar situation occurs in Shijiazhuang because there are also two AC groups, one of which, {Handan, Xingtai}, appears when the projection units values range from 0 to 2. The other, {Hengshui, Xingtai}, appears only when the projection units value is 3. Both Hengshui and Xingtai have only one AC group. The former's group is {Shijiazhuang, Xingtai} and the latter's is {Hengshui, Shijiazhuang}.

Cross spatial-temporal associations of PM2.5 and PM10
To this point, we have discussed the spatial-temporal PM2.5 and PM10 associations. However, it remains unknown whether a specific city's PM10 will be affected by other cities' extreme PM2.5, and vice versa. To answer this question, spatial-temporal association rules are mined based on cross-PMs (see Tables 7 and 8). From Table 7, it is clear that there are only two cities whose PM10 values exhibit relations with the PM2.5 values of other cities. This indicates that the intercity PM10 relationships are stronger than the cross-PM (PM2.5 → PM10) relations. However, for a single city, the max projection units values of Baoding and Xingtai in Table 7 are larger than those in Table 6, which means that PM10 influences PM10 less strongly than PM2.5 does in both cities. In Fig. 7, the detailed cross spatial-temporal associations (PM2.5 → PM10) of Baoding and Xingtai are illustrated. From this figure, it is clear that southern cities have a greater effect on Baoding and Xingtai as the projection units value increases.
The results from the other cross case are shown in Table 8. In this table, the max projection units value of Shijiazhuang is equal to that in Table 3, which means that the PM10 of Hengshui and Xingtai can influence the succeeding 10 h of PM2.5 values in Shijiazhuang. However, the max projection units values for the other cities are less than those in Table 3. Therefore, the cross spatial-temporal rules of Shijiazhuang are discussed specifically. Table 9 shows the cross spatial-temporal association rules (PM10 → PM2.5) in Shijiazhuang. Clearly, there are three groups in the column "AC", namely {Baoding, Hengshui, Xingtai}, {Handan, Xingtai} and {Hengshui, Xingtai}. From a geographic perspective, southern cities increasingly influence Shijiazhuang as the projection units value increases because Baoding does not belong to the AC after the projection units value reaches 4.

Conclusions and discussion
This study presents a spatial-temporal analysis and projection of extreme PM concentrations around the Jing-Jin-Ji region of China during the period from Feb. 2014 to Feb. 2015. The Apriori algorithm-based association rule mining technique is used to discover the spatial-temporal relations between a target city and other potential reference cities. The findings and their explanations are summarized as follows:
(1) The intercity PM associations do not completely rely on spatial distributions, which means that some cities that border one another are not associated, and vice versa. For example, Cangzhou is surrounded by Tianjin, Langfang, Baoding and Hengshui, but is not associated with them. The same holds for Beijing and Handan. The low support and confidence values in Beijing imply that PM concentrations therein are poorly associated with other reference cities. This might be explained by the fact that PM pollution in Beijing mainly comes from intra-city emissions, such as fossil fuel combustion, probably due to the significant increase in vehicles in recent years. This rising trend is expected to continue.
(2) Over the Jing-Jin-Ji region, extreme PM events frequently occur in the southern cities, and the spatial and temporal PM association rules are stronger for southern cities than for northern cities. In addition, PM in southern cities has a greater influence on northern cities as projection units values increase. This result can be explained by the topography and wind direction in this region. From the perspective of topography, the Taihang Mountains are located to the west of the Jing-Jin-Ji region. Therefore, strong winds from the west are extremely rare, as estimated statistically from historical weather data (Lishi Tianqi, 2015). This indicates that PM can move in three possible directions, namely north to south, south to north and east (the sea) to west (the mountains), if wind is responsible for PM transport. According to the recorded data (Lishi Tianqi, 2015), the average duration of winds from the south in southern cities is longer than that of winds from the north in northern cities. This indicates that PM from southern cities has a high probability of being transported northward by the wind. This conclusion from topography and wind direction agrees with the result that PM in southern cities influences that of northern cities more. Therefore, a substantial reduction in PM emissions in the southern cities of the Jing-Jin-Ji region is of critical significance for reducing PM concentrations and improving the air quality across this region.
(3) The spatial and temporal intercity PM2.5 associations are much stronger than those of PM10. This is because PM2.5 is suspended in the air for a longer period of time and is easily transported between cities. PM2.5 in Jing-Jin-Ji can be attributed to multiple sources and is mainly dispersion-governed. PM10 is likely of local origin and more affected by vehicle traffic. It was estimated that the contribution of traffic to coarse mode urban ambient concentrations was larger on weekdays than on weekends, due to higher traffic flows (Barmpadimos et al., 2011). Gugamsetty et al. (2012) suggested that soil dust and vehicle emissions were the major sources of PM10, and the dust caused by traffic would inevitably increase PM10 concentrations. The higher intercity PM2.5 associations also indicate that spatial and temporal PM2.5 projections can be easily and reliably conducted. When extreme PM2.5 events occur in certain reference cities, the target city will suffer extreme PM2.5 with a probability greater than 0.7. However, the same cannot be said for PM10, due to poor intercity PM10 associations.
(4) The maximum projection units are also estimated, through which an extreme PM incident can be predicted from 1 up to the maximum projection units in advance. In addition, as the projection steps increase, the association cities for a target city will change, and their association strengths rarely decline.
We believe that this is an initial study to identify the spatial and temporal association rules of extreme PM for the Jing-Jin-Ji region, a core but severely polluted area around the capital of China. This investigation was performed using an Apriori algorithm, which was demonstrated to be an effective method for extreme PM data mining. The mining results agreed with the real-world situation, suggesting that they were reasonable and reliable. These results may reveal helpful information for researchers and policy makers regarding PM pollution. The mining results can be applied to formulate effective strategies for reducing extreme PM incidents and improving air quality in the region. In addition, this research provides each city's max projection units. These values are beneficial for building PM2.5 and PM10 warning systems, which can reduce the harmful effects of pollution and enable relevant parties to take the necessary precautions.