
Influence of the Data-Fusion to Clustering Lifetime in Wireless Sensor Networks [Sensors & Transducers (Canada)]
[April 22, 2014]


(Sensors & Transducers (Canada) Via Acquire Media NewsEdge) Abstract: Wireless sensor networks (WSN) are wireless networks composed of spatially distributed autonomous devices that use sensors to cooperatively monitor physical or environmental conditions, such as temperature, sound, vibration, pressure, motion or pollutants, at different locations. This paper studies how fusing partially correlated data affects clustered WSNs. According to the results, a node's energy consumption depends largely on its location: some nodes consume more energy even though they are comparatively far from the sink node. The results further show a trade-off between the total energy consumption of the network and the network lifetime, i.e. even when the total energy consumption reaches its minimum, the network lifetime is not necessarily the longest. Copyright © 2013 IFSA.



Keywords: WSN, Data fusion, Energy consumption, Network lifetime.

(ProQuest: ... denotes formulae omitted.)

1. Introduction

WSN has broad application prospects, with significant scientific and practical value in fields such as military affairs & national defense, industry & agriculture, urban administration, biomedicine, environmental monitoring, emergency & disaster rescue and remote control of dangerous zones. It has therefore drawn close attention from the military, academic and industrial communities of many countries and, as one of the recognized frontier research fields of the 21st century, is considered one of the technologies that will have a great effect on this century [3, 4].


Compared with traditional Ad Hoc networks, routing in WSN is designed mainly to reduce node energy consumption and extend the network lifetime, whereas the routing protocols of traditional wireless Ad Hoc networks are designed primarily to provide high-quality service under mobile conditions. Therefore, the mature technologies widely applied in traditional Ad Hoc networks cannot be applied directly in WSN [1, 2]. In recent years, layered WSN has become a hot research issue because it extends the network lifetime effectively.

In WSN, a clustering algorithm can improve network scalability. After the network is divided into several clusters, a cluster-head node is chosen in each cluster. The other nodes in each cluster transmit their data to the cluster-head node, where the data are fused and then transmitted to the sink node, thus saving energy.

There is a large body of literature on clustering algorithms in Ad Hoc networks [5-8] and in WSN [9, 10]. These protocols either ignore the correlation among the data transmitted by the nodes, or assume ideal data fusion, i.e. perfectly correlated data, so that the data packets in one cluster can be compressed into one. In a real WSN, however, the achievable degree of data fusion depends closely on the data correlation. It is therefore necessary to explore the features of clustering when the data are only partially correlated.

2. Node Characteristics and Network Clusters

A WSN is usually composed of sensor nodes, a sink node and a task management node; its structure is shown in Fig. 1. Data monitored by a sensor node are relayed hop by hop through other sensor nodes, may be processed by several nodes on the way, reach the sink node after several hops, and finally reach the task management node via the Internet and/or satellite. Users configure and manage the sensor network through the task management node, assign monitoring tasks and gather the monitored data.

Suppose the network consists of N nodes in total, including one sink node, scattered randomly with density λ over a domain of area A. If the area A and the node number N are sufficiently large, the nodes can be treated as a Poisson point process with intensity λ. Suppose the sink node is located at the center of the network and all nodes, including the sink node, are fixed. The study also applies to a sensor network composed of a number of clusters, each with its own sink node governing a fixed set of nodes; in that case, every cluster should be analyzed independently.

Before the whole network is clustered, the following hypotheses are made about the sensor nodes and the network under investigation:
(1) All nodes have the same transmission power and thus the same wireless transmission radius R.
(2) The data received by the nodes in the network can all be treated as a Poisson process with intensity ...
(3) When two nodes that lie outside each other's wireless transmission radius need to exchange data, other nodes can be used as relays.
(4) Within any cluster, when a non-cluster-head node is at distance d from the cluster-head node, data transmission between them takes k = [d/R] hops.
(5) Each relay consumes one unit of energy when transmitting one bit of data.
(6) Transmission is error-free, so no retransmission is needed.
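
Hypotheses (4) to (6) can be illustrated with a minimal sketch. It assumes that the bracket in k = [d/R] means rounding up to the next integer, which the text does not state explicitly; the function names are hypothetical and chosen only for this illustration.

```python
import math


def hop_count(d: float, R: float) -> int:
    """Hops needed to cover distance d when one hop spans at most R.

    Assumption: the bracket in k = [d/R] is read as rounding up.
    """
    if d < 0 or R <= 0:
        raise ValueError("d must be non-negative and R must be positive")
    return math.ceil(d / R)


def transmission_energy(d: float, R: float, bits: int) -> int:
    """Energy units spent moving `bits` bits over distance d.

    Hypothesis (5): one unit of energy per bit per hop.
    Hypothesis (6): error-free links, so no retransmissions are counted.
    """
    return hop_count(d, R) * bits
```

Under these assumptions, a node at distance 2.5R from its cluster head needs three hops, so sending 8 bits costs 24 energy units.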

The clustering algorithm is as follows. Every node in the network is selected as a cluster-head node with probability p. A node selected in this way (referred to as an "active cluster-head") transmits a "Cluster-head Statement" to the nodes within its wireless transmission radius to announce its status; the statement is relayed so that all nodes within k hops of the cluster-head node can receive it. A node that has not been selected as cluster head and lies within k hops of at least one cluster-head node joins the cluster of the cluster-head node closest to it. The "Cluster-head Statement" is transmitted for at most k hops. Therefore, if a node does not receive a "Cluster-head Statement" within a period t (here, t is the time needed for data transmission from any node within k hops to the cluster-head node), it concludes that it is not within k hops of any active cluster-head and becomes a "passive cluster-head". Moreover, since no node within a cluster is more than k hops from its cluster-head node, the cluster-head node can transmit one group of fused data to the sink node in every time unit t.
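
The selection and joining steps can be sketched as follows. This is an illustrative simplification, not the authors' implementation: it assumes a node is "within k hops" of a cluster head when the straight-line distance is at most k·R, it omits the timer t, and the function name and return layout are hypothetical.

```python
import math
import random


def form_clusters(nodes, p, k, R, rng):
    """Assign every node to a cluster head.

    nodes: list of (x, y) positions; p: cluster-head probability;
    k: maximum hop count of the statement; R: transmission radius.
    Assumption: "within k hops" is approximated by distance <= k * R.
    Returns (active, passive, head_of); head_of[i] is the index of the
    head that node i reports to.
    """
    # Step 1: each node independently becomes an active cluster head.
    active = [i for i in range(len(nodes)) if rng.random() < p]
    active_set = set(active)
    passive = []
    head_of = {}
    for i, pos in enumerate(nodes):
        if i in active_set:
            head_of[i] = i
            continue
        # Step 2: collect the active heads whose statement reaches node i.
        reachable = []
        for h in active:
            dist = math.dist(pos, nodes[h])
            if dist <= k * R:
                reachable.append((dist, h))
        if reachable:
            # Join the closest reachable active head.
            head_of[i] = min(reachable)[1]
        else:
            # Step 3: no statement received, so become a passive head.
            passive.append(i)
            head_of[i] = i
    return active, passive, head_of
```

A call such as `form_clusters(nodes, p=0.1, k=2, R=1.0, rng=random.Random(0))` returns the two kinds of cluster heads and the membership map; passing the random generator explicitly keeps a run reproducible.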

3. Routing Protocol Used in the Model

When data are transmitted from a non-cluster-head node to the cluster-head node, or from the cluster-head node to the sink node, the Minimum Hop Routing (MHR) protocol is adopted. Suppose every node in the network detects data and transmits them to its cluster-head node. After receiving the data packets from all nodes of the cluster, the cluster-head node fuses the data with a distortion rate of D and then transmits the fused data packet to the sink node through MHR. Consider one of the worst cases: every node transmits a data packet of the same size with a distortion rate of D0, and the packets are fused only at the cluster-head node. The distortion rate D0 of every data packet then becomes the average distortion rate, which can be worked out after the data fusion at the cluster-head node.

4. Energy Consumption in Data Transmission

For convenience of analysis, the following hypotheses are made:
(1) The sensing area is a circle of radius kR, where R is the transmission radius of a sensor node and k is a positive integer.
(2) The nodes are distributed as a Poisson process with intensity λ.
(3) If the distance from a sensor node to the sink node is r, the expected energy consumed in transmitting one bit of data from the node to the sink node is Ec(r).
(4) If the distance from a sensor node to the sink node is r, the number of data packets (including self-produced packets) to be transmitted by the node is Nc(r).

Fig. 2 shows the topology used for the analysis. If the node density is high enough that a suitable next hop is always available, the data transmission from node r to the sink node through MHR takes a hop count of ... (here, |r| is the distance between node r and the sink node). As shown in Fig. 2, the sensing area can be regarded as a set of layers formed by concentric circles, spaced R apart and centered on the sink node.

Suppose node r lies in layer k and within the wireless transmission radius of node x in layer k-1, and let domain A(x, r) be the intersection of node r's wireless transmission disk with layer k-1. Then every node in domain A(x, r) (the shaded portion in Fig. 2) can be the next-hop target of node r. If the number of nodes in domain A(x, r) is denoted N_A(x, r), then N_A(x, r) equals λ·A(x, r) on average. If the probability that n nodes lie in domain A(x, r) is written p{N_A(x, r) = n}, the following equation is obtained: ... (1) If p(x, r) is the probability of choosing node x as the next hop of node r for data packet transmission, the following equation is obtained: ... (2) Angles β and α in Fig. 2 must first be found by the law of cosines before the area of domain A(x, r) can be worked out: ... (3) ... (4) Here, |x| is the distance from node x to the sink node. The area of the shaded domain A(x, r) then follows from: ... (5) The number of data packets to be transmitted by node x is given by the following integral, whose integration domain A(x, s) is the intersection of node x's wireless transmission disk with layer k.

... (6) In the above formula, ... Nc(x) is the number of data packets transmitted by node x, and 1 is added because node x also has its own detected data packet to transmit.

5. Total Energy Consumption

Data Dependency

In WSN, the data detected by neighboring nodes are correlated to some degree, and they are distorted to some extent when fused at the cluster-head node or the sink node, as discussed below.

In the detection region of one sub-cluster of the sensor network, let X = {X_i, i = 1, 2, 3, ..., n} be the actual data detected in n regions by n sensor nodes, let X̂ be the fused data value of X, and let d(X, X̂) be the distortion function between the two, measured by the mean square error (MSE), i.e. d(X, X̂) = ||X − X̂||². The upper limit of the total distortion after data fusion is denoted D, i.e. the average distortion is limited by E(d(X, X̂)) < D, and the rate distortion function can be expressed by the following equation (7): ... (7) In the above equation, I(X; X̂) is the mutual information between X and X̂. Once the information source is defined (i.e. the quantity to be detected by the sensor nodes is fixed), the distortion measure is defined and a certain distortion is permitted, the lower the source transmission rate needed, the better.

If the distortion function is the MSE, the Gaussian source is the worst case compared with other sources, because it requires the most information to describe the source vector [11]. For convenience of description, X is taken to be a multivariate Gaussian random vector.

Suppose that, in the network, the dependency between the data detected by two nodes is a decreasing function of their Euclidean distance, i.e. the closer two nodes are, the stronger the dependency of the data they detect. If the distortion function d(X, X̂) is the MSE, then, by information theory and under the total distortion limit E(d(X, X̂)) < D, the rate distortion function R(D) of Gaussian sources with variance σ² can be expressed by the following equation (8): ... (8) In the above equation, N is the number of nodes in the whole network and σ_n² is the variance of each Gaussian source, ordered as σ_1² > σ_2² > ... > σ_N². D_n and D are, respectively, the distortion of each Gaussian source and the total distortion after data fusion, which are related as follows: ...

Through equation (8), the rate distortion function (i.e. the number of bits in the data packets) can be worked out. Consider the following condition: the data dependency decreases as a Gaussian function of distance, i.e. the farther apart two nodes are, the weaker the dependency of their detected data, which can be written as f(p_i, p_j) = σ² w^(||p_i − p_j||²), with p_i and p_j being two nodes in two-dimensional space, ||p_i − p_j|| the Euclidean distance between them and w the dependency between two sample data packets at unit distance, which is less than 1. Here, σ² is a constant (σ² = 1 is adopted here), namely the variance of each sample data packet within the measurement region.
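
The kind of calculation described above can be sketched with the textbook reverse water-filling solution for independent Gaussian components under an MSE constraint. This is offered as an illustration under stated assumptions, not as the omitted equation (8): the component variances are taken as given, the dependency function is assumed to be σ²·w raised to the squared Euclidean distance, and all function names are hypothetical.

```python
import math


def dependency(p_i, p_j, w, sigma2=1.0):
    """Assumed covariance of two samples: sigma2 * w ** (squared distance)."""
    return sigma2 * w ** (math.dist(p_i, p_j) ** 2)


def reverse_water_filling(variances, D):
    """Rate (bits) and per-component distortions for total distortion D.

    Textbook result for independent Gaussian components with MSE:
    D_n = min(theta, var_n), with theta chosen so that sum(D_n) = D, and
    R(D) = sum of 0.5 * log2(var_n / D_n) over components with D_n < var_n.
    """
    if D <= 0:
        raise ValueError("D must be positive")
    if D >= sum(variances):
        # Every component may be discarded entirely: zero rate.
        return 0.0, list(variances)
    lo, hi = 0.0, max(variances)
    for _ in range(200):  # bisection on the water level theta
        theta = (lo + hi) / 2
        if sum(min(theta, v) for v in variances) < D:
            lo = theta
        else:
            hi = theta
    theta = (lo + hi) / 2
    distortions = [min(theta, v) for v in variances]
    rate = sum(0.5 * math.log2(v / d)
               for v, d in zip(variances, distortions) if d < v)
    return rate, distortions
```

For two unit-variance components and D = 0.5, each component receives distortion 0.25 and the total rate is 2 bits. For correlated data, the variances passed in would be the eigenvalues of the covariance matrix built from `dependency`; that step is left out here to keep the sketch free of external libraries.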

To study the total network energy consumption, two quantities must first be determined: 1) the average total energy needed for communication between the non-cluster-head nodes and the cluster-head node; 2) the average total energy needed for communication between the cluster-head node and the sink node. As stated above, the average of ... is taken as the average hop count of data transmission from a non-cluster-head node to the cluster-head node, or from the cluster-head node to the sink node, in a cluster.

Now, let C0 represent a cluster-head node, Π0 the set of all non-cluster-head nodes that transmit data to C0, x_i one member of Π0, f(x_i) one property of node x_i, such as its distance to the cluster-head node C0, and S_f the sum of that property f(x_i) over all non-cluster-head nodes in the cluster, which is as follows: ... (9) In the above equation, 1{.} is the indicator function. If A(C0) is taken as the sensing region of a cluster, the expected value of S_f can be worked out following [11]: ... (10) In the above equation, the distance f(x) is the hop count r/R of the data packet transmission in the network. If the radius of the network sensing region is R_net, equation (10) becomes: ... (11)

Let C1 denote the total energy consumed in data transmission from non-cluster-head nodes to cluster-head nodes in the network. It was supposed above that there are N nodes in the network, each selected as cluster head with probability p, and that the network is divided into n0 sub-clusters with E(n0) = Np. The expected value of C1 can then be worked out as follows: ... (12)

Let C2 denote the total energy consumed in data transmission from cluster-head nodes to the sink node, with r_i the distance from the i-th cluster-head node to the sink node, c_i the i-th cluster and R_D(c_i) the minimum amount of information needed to represent the data detected by all nodes in the i-th cluster when the distortion is D. The expected value of C2 can then be worked out as follows: ... (13)

Suppose N_i is the number of nodes in cluster c_i, D_i is the total distortion in cluster c_i and D0 is the distortion of each node in cluster c_i; then D_i = N_i D0. E[r_i/R] can be worked out from the specific distribution of the network, and the rate distortion term E(R_D(c_i)) depends on the specific dependency function and probability density function (pdf) in cluster c_i.

To sum up, the average total energy C consumed in the whole network is as follows: ... (14)

6. Analysis of Network Lifetime

In WSN, network lifetime and node energy consumption are closely related. Network lifetime can be defined as the time the network operates until the energy of the first node is used up. Suppose the communication (i.e. data packet transmission) among the nodes in the network is evenly distributed. Then all the nodes at the same radius from the central sink node will use up their energy at nearly the same time. As can be inferred from the above, the nodes in the first layer will use up their energy first. Moreover, they make up a comparatively small share of the nodes (about 1/R_net). Obviously, once the energy of the nodes in the first layer is used up, communication between the other nodes and the central sink node becomes impossible, which brings the network life to an end. Therefore, the life of the whole network is most closely tied to the life of the nodes in the first layer.

Based on the above analysis, the peak energy consumption in the network can be predicted. The total energy consumption e of each node can be divided into two parts. The first is the energy consumed within the cluster of which the node is cluster head or member, referred to as e_in. Across the whole network this part is the same for every node, because every node has the same probability of being selected as cluster head (edge effects are ignored). The second is the average energy consumed in data transmission to the central sink node, denoted e_out(r). As stated above, this part depends on the distance r to the central sink node. Suppose there are n_in nodes in each sub-cluster. Then e_in is the total energy consumed in a cluster divided equally among its nodes, which is as follows: ... (15) ... (16) And the following can be worked out based on equation (11): ... (17) In the above analysis, it has been supposed that each node detects data packets in each time unit with probability 1. In actual detection, a node may detect an event only with a certain probability and then transmit data packets. Obviously, if every node detects data packets with probability p0, the consumed energy should be multiplied by p0.

In the sub-cluster model, the second part of the energy consumption comes from the transmission of compressed data packets from the cluster-head node to the central sink node. According to the foregoing supposition, each node is selected as cluster-head node with probability p, and each cluster-head node has E(R_D(c_i)) bits of data to transmit, which means that the energy consumption curve is multiplied by the factor p·E(R_D(c_i)). Moreover, Ec(r) was defined above as the expected energy consumed in transmitting one bit of data to the sink node from a node at distance r from it. Thus, equation (18) can be established: ... (18) Based on equations (17) and (18), equation (19) can be derived: ... (19) As stated above, the nodes whose energy is used up first are located at the boundary of the first layer, where r = 1. By the definition of network lifetime, the network lifetime is proportional to the reciprocal of the maximum energy consumption: ... (20) Equation (20) is defined as the network lifetime factor. By evaluating equations (14) to (20), the relationship between the total energy consumption and the network lifetime can be predicted.
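
The reciprocal relationship between peak consumption and lifetime, and the reason it can diverge from the total, can be illustrated with a small sketch. The per-node energy values below are hypothetical numbers chosen for illustration, not results from the paper, and the function names are likewise assumptions.

```python
def lifetime_factor(per_node_energy):
    """Reciprocal of the largest per-node energy consumption per time unit."""
    peak = max(per_node_energy)
    if peak <= 0:
        raise ValueError("peak consumption must be positive")
    return 1.0 / peak


def total_energy(per_node_energy):
    """Sum of the per-node energy consumption per time unit."""
    return sum(per_node_energy)


# Hypothetical configurations: A concentrates the load on one node,
# B spreads a slightly larger total evenly.
config_a = [4.0, 1.0, 1.0, 1.0]
config_b = [2.0, 2.0, 2.0, 2.0]
```

Here configuration A has the lower total (7 against 8) but the shorter lifetime factor (0.25 against 0.5), which is the kind of trade-off the analysis describes: minimum total energy does not imply maximum lifetime.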

7. Simulation and Analysis

The simulation environment is a square region of 20R × 20R in which 2,500 nodes are uniformly distributed. The distortion D0 is 0.01, the probability p of a node being selected as cluster-head node ranges over [0.002, 0.5], and the data dependency w ranges over (0, 1).
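
A minimal sketch of such a setup is given below. It assumes the sink sits at the center of the square, assigns each node to a layer by rounding its distance to the sink up to a multiple of R, and uses hypothetical function names; it reproduces only the node layout, not the authors' simulator.

```python
import math
import random
from collections import Counter


def place_nodes(n, side, rng):
    """Scatter n nodes uniformly over a side x side square."""
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]


def layer_counts(nodes, sink, R):
    """Count nodes per layer, where layer j covers distances ((j-1)R, jR].

    Assumption: a node lying exactly on the sink is placed in layer 1.
    """
    counts = Counter()
    for pos in nodes:
        layer = max(1, math.ceil(math.dist(pos, sink) / R))
        counts[layer] += 1
    return counts


R = 1.0
rng = random.Random(1)  # fixed seed so the layout is reproducible
nodes = place_nodes(2500, 20 * R, rng)
counts = layer_counts(nodes, sink=(10 * R, 10 * R), R=R)
```

Printing `sorted(counts.items())` shows how few nodes fall in the first layer compared with the outer ones, which is why those nodes bound the network lifetime in the analysis above.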

Relationship between Network Lifetime and Nodes' Probability p of Being Selected as Cluster-head Node

Fig. 3 shows the relationship between network lifetime and the probability p of a node being selected as cluster-head node, comparing the analysis of this paper with the sub-cluster simulation results under MHR. The analysis predicts the network lifetime well, and the two sets of results match closely. The energy consumed by the nodes in the first several layers is determined by the number of data packets they transmit; therefore, even errors in path length do not seriously affect the energy consumption of the nodes in the first two layers.

As analyzed above, when the data dependency w increases, the network lifetime increases accordingly, as shown in Fig. 3. Fig. 3 also shows that when the size of each sub-cluster increases (i.e. the probability p of a node being selected as cluster head decreases), the network lifetime lengthens, but the growth slows down beyond a critical value. This is because the simulated network is limited in scale, so there are only limited data packets for the experiment when there are only a few clusters.

For different values of the cluster-head selection probability p and of the data dependency w, the relationship between network lifetime and total energy consumption is illustrated in Fig. 4. The 19 points on each curve correspond to the p values [0.002, 0.005, 0.007, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.12, 0.15, 0.2, 0.25, 0.3, 0.5], with p increasing in the direction of the arrow in the figure.

In Fig. 4, the simulation results agree well with the analysis of this paper, which suggests that a reasonable trade-off can be achieved between energy consumption and network lifetime. In practical applications, however, if the energy of the sensor nodes cannot be replenished, network lifetime deserves the greater emphasis.

Fig. 4 also shows that, for all values of w, the curves of energy consumption against network lifetime are similar in shape, except that the optimal cluster-head selection probability p is a function of the data dependency w. For example, when w equals 0.99 and p is reduced from 0.05 to 0.005, the network lifetime increases by about three times while the energy consumption increases by less than two times. When w equals 0.5 and p is reduced from 0.05 to 0.005, the network lifetime increases by less than 80 % while the energy consumption increases by about 70 %.

8. Conclusion and Prospects

When data packets are transmitted through MHR, the energy consumption of the sensor nodes depends largely on the nodes' locations. Some nodes that are relatively distant from the sink node consume more energy than nodes close to it. As for the energy consumption of the network, the minimum number of bits to be transmitted within a sub-cluster is described by the rate distortion function under a given distortion limit. According to this research, there is a trade-off between the energy consumption of the whole network and the network lifetime, i.e. the minimum total network energy consumption does not necessarily mean the longest network lifetime.

Currently, energy-saving mechanism research has become one of the key directions of WSN research. The following aspects are worth deeper exploration.

1) Cross-layer design: For example, the MAC layer and the routing layer can be considered jointly, and the nodes' working conditions can be determined from the network topology information. Since more information can be obtained across layers than from the MAC layer alone, better global performance and further energy savings can be expected.

2) Adoption of different nodes: In the clustering algorithm, the cluster-head node performs data fusion and data packet transmission, so it consumes more energy. Moreover, energy consumption depends on the node's location in the network. In this sense, higher-energy nodes should be placed at the positions that consume relatively more energy, whereby the network lifetime can be effectively prolonged.

3) Optimization of handshake signals: Handshake signals such as RTS/CTS are widely applied in sensor networks. However, they add overhead and can collide with one another. Reducing and optimizing the handshake signals can therefore both save energy and improve network performance.

Acknowledgements

Foundation Items: The Natural Science Foundation of Jiangsu Province (BK2012584); The Natural Science Foundation of Changzhou (CJ20110025).

References

[1]. Cui Li, Ju Hailing, Li Tianpu, et al., Progress in WSN research, Computer Research and Development, Vol. 42, No. 1, 2005, pp. 163-174.

[2]. Li Jianzhong, Gao Hong, Progress in WSN research, Computer Research and Development, Vol. 45, No. 1, 2008, pp. 1-15.

[3]. Sun Limin, Li Jianzhong, Chen Yu, et al., Wireless sensor network, Tsinghua University Press, May 2005.

[4]. Ren Fengyuan, Huang Haining, Lin Chuang, Wireless sensor network, Software Journal, Vol. 14, No. 7, 2003, pp. 1282-1291.

[5]. A. D. Amis, R. Prakash, T. H. P. Vuong, D. T. Huynh, Max-min D-cluster formation in wireless ad hoc networks, in Proceedings of the IEEE INFOCOM, March 2000, pp. 32-41.

[6]. B. Liang, Z. Haas, Virtual backbone generation and maintenance in ad hoc network mobility management, in Proceedings of the IEEE INFOCOM, March 2000, pp. 1293-1302.

[7]. C. F. Chiasserini, I. Chlamtac, P. Monti, A. Nucci, Energy efficient design of wireless ad hoc networks, in Proceedings of European Wireless, February 2002.

[8]. M. Chatterjee, S. K. Das, D. Turgut, WCA: A weighted clustering algorithm for mobile ad hoc networks, Journal of Cluster Computing, No. 5, 2002, pp. 193-204.

[9]. S. Bandyopadhyay, E. J. Coyle, An energy efficient hierarchical clustering algorithm for wireless sensor networks, in Proceedings of the IEEE INFOCOM, Vol. 3, March/April 2003, pp. 1713-1723.

[10]. S. G. Foss, S. Zuyev, On a Voronoi aggregative process related to a bivariate Poisson process, Advances in Applied Probability, Vol. 28, No. 4, 1996, pp. 965-981.

[11]. A. Scaglione, Routing and data compression in sensor networks: Stochastic models for sensor data that guarantee scalability, in Proceedings of the International Symposium on Information Theory ISIT'2003, June 29 - July 4, 2003.

[12]. Lin Shen, Xiangqun Shi, A dynamic cluster-based key management protocol in wireless sensor networks, International Journal of Intelligent Control and Systems, Vol. 13, No. 2, 2008, pp. 146-151.

[13]. F. Tang, M. Gao, M. Li, et al., Secure routing for wireless mesh sensor networks in pervasive environments, International Journal of Intelligent Control and Systems, Vol. 12, No. 4, 2007, pp. 293-306.

[14]. M. Yu, A. Malvankar, W. Su, An environment monitoring system architecture based on sensor networks, International Journal of Intelligent Control and Systems, Vol. 10, No. 3, 2006, pp. 201-209.

1 Shen LIN, 2 Wang LIQIN
1 School of Electrical and Information Engineering, Jiangsu University of Technology, 213001, China
2 Department of Automatic Control Engineering, Changzhou College of Information Technology, 213164, China
E-mail: [email protected]
Received: 22 October 2013 / Accepted: 22 November 2013 / Published: 30 November 2013
(c) 2013 International Frequency Sensor Association
