TMCnet News

DRM Algorithm for Irregular 2D Mesh [Sensors & Transducers (Canada)]
[April 22, 2014]

DRM Algorithm for Irregular 2D Mesh [Sensors & Transducers (Canada)]


(Sensors & Transducers (Canada) Via Acquire Media NewsEdge) Abstract: In this paper, routing algorithm has been mainly studied. The main work and contribution have been generalized as follows: Through the research of Internet and NoC's (Network on Chip) deterministic and adaptive routing, DRM (Dynamic Routing Mesh) algorithm which is combined deterministic and adaptive routing algorithm has been designed, which not only ensures connectivity of any couple of communication nodes but also just requires only two virtual channels. Furthermore, the DRM routing algorithm is for solving the irregular 2D mesh topology, and the method has been designed which is used for the choice of path. Simulation results show that, DRM routing algorithm compares with Boppana and Virtual Network algorithm has certain performance advantages. Copyright © 2013 IFSA.



Keywords: DRM algorithm, 2D mesh, Virtual Network algorithm, Performance advantages.

(ProQuest: ... denotes formulae omitted.) 1. Introduction Recently, the technology called NoC has aroused wide public concern. The reason for NoC has been designed is that with the steady growth of integration of SoC (System on Chip), and the shortcomings of the traditional bus architecture have been exposed gradually, such as bus bandwidth, global synchronization and so on. So NoC has been designed to solve the problem which is caused by development of SoC. Routing algorithm determines the path by which the packets are sent. The choice of the path will produce a significant impact on network throughput, delay, quality of service and so on [1]. Mapping algorithm is decided that each processing unit in the NoC position according to the different purpose of optimizing, and it will have a significant impact on NoC system's power consumption, delay, area, load balancing, etc [2].


NoC main research directions are: 1) Router structure is based on the NoC communication unit, and the respects of research are mainly to discuss the technical details of the cache in the router and switch structure to ensure that the specific needs of the NoC, such as power, area, etc.; 2) The NoC topology is researched, such as 2D mesh, spin, 2D Torus, Folded Torus, etc.; 3) The NoC mapping algorithm is researched, and the mapping algorithm is a decision to the actual IP core to given topological location mapping algorithm of the program; 4) The NoC exchange mechanism is researched; 5) The routing algorithm based on its performance as reflected in the different focus is divided into for the quality of service QoS routing algorithm and fault-tolerant routing algorithm [3-6].

Since 2000, many research institutions have proposed a variety of innovative on-chip network architecture, research institutions, including Bologna, KAIST, KTH, LIP-6, MIT, UCSD, Manchester, Stanford, Tampere, Technion, Philips Research Laboratories, ST Microelectronics, and VTT Technical Research Center. Some chip manufacturers (Intel, AMD, IBM, etc.) are also committed to research in this area, and Intel Corp. researchers have developed the world's first programmable processor with supercomputer performance on-chip network architecture, and each chip contains 80 cores and 100 million transistors emitting 98 watts heat [5]. The size of the equivalent is a nail size, and the power consumption is lower than the current vast majority of household appliances. This is the outcome of the company's innovative "Tera-scale computing" research project, and the research project aims to provide one trillion floating point performance for future PC and server, and one trillion operations per second. At the same time, IBM Power4 and Power5 are a typical switching network on-chip communication network chip [6]. The latest IBM Cell processor also uses a new type of ring network. On-chip network is the chip industry trends, and such chips will play a key role in the computer, which can access the Internet and creates new applications for education and collaboration, and promote based on personal computers, servers and handheld devices innovative high-definition entertainment. Therefore, accelerating the development of on-chip network technology in China's chip industry will be starting to a huge boost.

2. Description of the Problem On-chip design of network routing algorithms depends on the topology of the set, and most of the NoC research and design draw on the parallel computer architecture in the static network structure. In the current topology applications, 2D mesh [7] for its simple structure, scalability, ease of implementation and analysis, etc., has been widely used in the NoC area, such as Nostum, AEthereal, SOCN, SoCBus and Hemes. At the same time, many algorithms based on on-chip network algorithm research are based on the 2D mesh topology structure. The development of this structure is for the NoC, and the application tasks are assigned to the appropriate IP, then IP is mapped to the corresponding topology.

In the specific practical application, the various IP cores designed according to the needs may be the different functions they realized its size, which is not the same, namely, the IP cores of various sizes. To solve this problem, irregular 2D mesh structure is proposed and widely applied to the NoC design.

In the irregular 2D mesh structure, any one of the IP core allows you to override one or several grid cells of 2D mesh grid, which constitutes the irregular 2Dmesh topology. The irregular 2D mesh structure Noc is shown in Fig. 1. Among them, black and white nodes represent the irregular 2D mesh communication nodes to the lower left comer of Fig. 1 Cartesian coordinates for the origin of coordinates, and the graph nodes are located in the first quadrant of the coordinate system, and node (2,4) for IP core module is communication node, and the nodes located in the left want the right node communication, which is sure to bypass the isolated nodes in Fig. 1, which is non-existent communication node, and its location are the IP core occupied.

In this topology, the routing algorithm of the rulebased 2D mesh topology will not be used due to some problems, such as: 1) Deterministic routing algorithm: hardware design is simple for less network traffic with a small delay, which can prevent deadlock and livelock, but connectivity can not be guaranteed. 2) Random routing algorithm: to ensure connectivity, but the algorithm will cause a lot of redundancy in the network transmission and network resources, which will be a lot of overhead. 3) Adaptive routing algorithm: adaptive routing algorithm based on the shortest path can not guarantee its connectivity, and its turn restriction may lead in some cases that can not bypass the turn model-based adaptive algorithm, which is difficult to ensure connectivity. Meanwhile, the current deadlock routing algorithm is also very little for the irregular 2D Mesh network. Therefore, this paper proposes the irregular 2D Mesh routing algorithm for various shapes irregular 2D Mesh, and this algorithm can maintain routing connectivity and deadlock-free, and resource utilization and IP communication hops are optimized.

3. Relevant Definitions and Theorems Definitionl. Interconnection network I is a strongly connected directed graph I = G (N, C), where N is the set of all nodes in I, C is the collection of all nodes in the output channel, and the tail node and the head node of the channel ci e C are expressed as the tail ( c,. ) and head ( c,. ).

Definition 2. Given interconnection network I, routing algorithm R, and two channels ci ,Cj e C , if there is a path, the message immediately enters Cj after leaving c,., which called c,. to cy channel dependence exists.

Definition 3. Given interconnection network I and routing algorithm R, R channel dependency graph is a directed graph D = G (C, E), where C is a collection of all the channels I, and E is all channels dependence of channel pairs in I, that is, c(., cy e C, if ci to c} exists channel dependence, the channel <ci,c}> e E.

Theorem 1. The consecutive routing function R of Internet I is deadlock-free necessary and sufficient condition: there exists a routing subfunction Q, and it is connected and there is no loop in its extended channel graph.

Inference 1. A (deterministic) routing function R of Internet I is deadlock-free if and only if there is no loop in the channel graph.

Consider a unidirectional ring with four nodes, nodes denoted , i = 0, 1, 2, 3 has a one-way channel connection between each pair of adjacent nodes. Make ci, i = 0, 1, 2, 3 for the nodes in the output channel. In this case, define the routing function is: If the current node nt is equal to the destination node n} , and the message is stored. Otherwise, use ci, Vy * i. The network is shown in Fig. 2, and ci channels exists the circular correlation. In fact, node n0 has a packet destination node n2, which can retain c,, and then request c2 . Node nx has a packet destination node n3 , and you can reserve c, , and then request c2 . Node n2 has a message destination node is n0, and you can keep c2, and then request c3. Node n3 a packet destination node is nx ,and you can keep c3, and then request c0. It is easy to see that the deadlock in the packet containing the above-mentioned configuration, because each packet retains their own channel and waits for another packet occupies the channel. Thus, the sequence of XY dimensional routing through two of the four turn makes the formation of ring to be prohibited to the effective prevention of deadlock.

However, in the adaptive routing algorithm, each node of the packets usually has multiple choices, even if the selected channel may be other packets occupy other routing can also be used. And therefore do not have to eliminate all the cyclic correlation, as long as each packet can always find a path all the channels are not involved in the cycle.

Consider a unidirectional ring with four nodes, nodes denoted as ni, i = 0, 1, 2, 3, in addition to the connection outside a channel between the nodes n3 and n0, and each pair of adjacent nodes have two channels connected. So that cAi, i = 0, 1, 2, 3 and cm, i = 0, 1, 2 the node ni is the output channel. The routing function is as follows: if the current node nj is equal to the destination node /?,. in the packet stored. Otherwise use cAi, Vy -ti or cHi, Vy > i. The network is shown in Fig. 3.

From the above charts can be found by joining the virtual channel to get rid of ring formation, thus avoiding the deadlock. Therefore, even if some channels between the circular correlation but can not be all, you also avoid deadlock.

4. Algorithm is Described This article for irregular 2D mesh proposed algorithm is called the DRM algorithm, and the algorithm is designed mainly for the solution of two problems: 1) Ensure that any two nodes connected in a deadlock, livelock conditions; 2) How to select the bypass path; the irregular mesh routing needs to bypass the larger IP blocks and to maintain its connectivity, and sometimes in the middle of the source node and destination node, one or more largescale IP module, bypass program four cases (Fig. 4), respectively, as follows: bypass in the positive x direction(module A ), bypass in the negative x direction(Module B), bypass in the positive y direction (module C), bypass in the negative y direction (module D), and ABCD represents a single or multiple IP blocks composed of equivalent IP module, and module G is composed of the module E, F in shown in Fig. 4.

DRM routing algorithm proposed in the paper does four bypass turning program shown in Fig. 4, and the solid line on the Y-direction arrow is on behalf of the virtual channel yl and y2, and dashed arrows represent only allows virtual channel y2 shown in Fig. 5.

DRM routing algorithm is described in two ways, on the one hand, the arbitration function F of the bypass direction, for example, module A selects from above bypass or from below bypass, and the function F will determine the shortest distance to bypass program; The bypass turn the program is shown in Fig. 5.

4.1. Detour Directions to Select the Function F First, large IP blocks around the IP module, communication module is set to mark the module, as shown in Fig. 1 white module, class module contains the coordinates and size of large IP information, and set the identifier Z = -1, +1,-3, +3, 2, respectively, the IP module in the y-, y +, x-, x + at the edge of the network, which can not bypass and the around can bypass. Set 2D mesh topology x direction of the O-dimensional, y is 1-dimensional, si corresponding to the communication module is coordinate(x,.,y,), and source address coordinate is (xr,yr) , the destination address coordinate is (xd,xd) , surrounded by the tag IP module size is m * n corresponding to a O-dimensional and 1-dimensional. The following discussion of function F for the four bypass program is shown in Fig. 4.

1) Suppose sr need to bypass the larger IP blocks from the x + direction k arrive sd , with Fig. 4 module A, the IP module communication module coordinate (xs,ys), size m*n, identifier Z=2. Set Fa to bypass the direction selection function, Lx is the bypass of the distance from the y + direction, L2 bypass from the y-direction the distance.

... (1) Come to the bypass program by the comparing of function (1) value and 0: ...

2) Assume that Module B and Z = 2 in Fig. 4: ... (2) Come to the bypass program by the comparing of function (2) value and 0: 3) Assume that Module C and Z = 2 in Fig. 4: ... (3) Come to the bypass program by the comparing of function (3) value and 0: ...

4) Assume that Module D and Z = 2 in Fig. 4: ... (4) Come to the bypass program by the comparing of function (4) value and 0: 5) Suppose that Z = -1, +1, -3, +3, only a detour direction without calculation.

4.2. Deadlock-free Bypass Program Set channel for the virtual channel number, and the virtual channel number is 0, lin the y direction of DRM algorithm. X + channel = null represents the current node in the IP module in the X+ direction.

The O-dimensional piece of data is the relative displacement as follows: ... (5) The 1-dimensional piece of data is the relative displacement as follows: ... (6) In introducing DRM, first introduce the opt - y [8] algorithm for the two-dimensional network, Linder and Harden presented a double-y algorithm, and the algorithm requires the first dimension and second dimension on each physical channel has two virtual channels. If the dimensions are swapped, then the new algorithm will require the X dimension and Y dimension of the physical channel that has two virtual channel of a river. Glass and Ni in order to turn the model-based analysis of the double-y routing algorithm, to eliminate unnecessary restrictions and launched the largest adapt mad-y routing algorithm. Compared with the double-y routing algorithm, the mad-y algorithm improves adaptability. The mad-y algorithm does not introduce channel between central available under the relevant conditions of adaptive degrees, but the ring does not necessarily lead to deadlock, so mad-y algorithm can be improved. The improved mad-y algorithm proposed algorithm applies to two-dimensional grid by Schwiebert and Jayasimha [9], and the algorithm requires the first dimension and second dimension on each physical channel, which has one and two virtual channels, and the different turn situation only allows to use yl(ydimensional displacement corresponding to 0 virtual channel), y2 (y-dimensional displacement corresponding to 1 virtual channel ), opt-y algorithm by Theorem 1 can prove that no deadlock exists [10]. DRM algorithm will apply the idea of opt-y algorithm in the irregular 2D mesh turn rules and inherit deadlock-free opt-y algorithm, which will be their turn to rule the four turn of the classification selected in Fig. 5 programs. Specific description is as follows: ...

5. Algorithm Simulation 5.1. Simulation Platform This article will use the proposed algorithm and 2D Mesh network fault resistance algorithm proposed by Boppana [11], and the improved algorithm proposed in [48] based on the irregular 2D mesh algorithm (in this article called Virtual Network algorithm) is compared. Mesh networks proposed by Boppana resistance fault algorithm requires four virtual channels to deal with the tum in the literature algorithm needs three virtual channels to form four kinds of virtual network in order to satisfy the deadlock-free turning. Therefore, in order to be fair, the simulation will be using four virtual channels for simulation.

This article uses the OPNET network simulation software for simulation [12], and the simulation uses 8 * 8 2D Mesh network structure, and each router node has P +1 input ports, and each port has a V +1 buffer queue corresponds to V +1 virtual channels. At the same time there are P +1 output ports and no output cache. In the simulation setting p = 3, respectively, comes from the input and output of the four adjacent nodes, each router node exchanges data between a pair of input and output channels, which is a router node local resources node channel.

In the simulation, set the cache of each channel for a packet size. Grouping the microchip can be divided into 1) Head microchip: micro-chip includes the routing information; 2) Data microchip: a packet is cut during the head microchip behind the bear with a lot of data packet information data; 3) End microchip: the last piece of data packets transmit to mark the end microchip.

The resource node module is responsible for producing the local business and injected into the source and the cache waiting to be routed, the destination node for the local node group is sent to the local processing module. Piece of data to a destination node routing process, after an intermediate node, first of all to be routed computing, virtual channel allocation, if the requested queue is empty and the queue is the non-locked state (queue in the transmission head microchip behind the data is locked), which can be transmitted through arbitration allocation, and the arbitration mechanism adopts the rotation priority arbitration (RR) arbitration mechanism, and this mechanism can be well guaranteed fairness, effectively prevent the occurrence of starving to death. It will all apply for this queue to be selected as the grouping of the output queue, and RR arbitration mechanism is selected, and the data transmission advances to the next router node as determined by the routing function.

The exchange mechanism of network simulation using wormhole switching mechanism, the link rate is set to 10 kbps. Packet length distribution adopts typical length distribution SP (Size and Percent). From the IP packet length in the actual network statistics [13], using the four most typical of the length of the result and the ratio: 40 byte length grouping account for 56 % of all group accounted for 23 % of the length of 1500 bytes, 576 bytes length accounted for 16.5 %, and 52 bytes in length accounted for 4.5 %.

The simulation flow model uses three traffic patterns of the simulated network traffic distribution in the literature: uniform flow patterns, hot spot flow patterns and tomado flow patterns. In uniform flow patterns, the probability of each node sends packets to other nodes; In the hot spot flow patterns, some nodes are set to hot spots compared to the other nodes and it receives more data in the simulation 15 % of the traffic than other nodes. The tomado flow refers to the node i packet sent to the node (i + k/2-1) modk, and k=8 is the network diameter in this simulation.

In the performance comparison, the source node in the network to the destination node, the average endto-end delay, normalized network throughput is measured. End-to-end delay of the so-called source node to destination node refers to the packet generation time and the packet reaching the destination node, which will be marked difference between the time of the network. The normalized network throughput is the throughput of the network and the ratio between the maximum throughputs of the network theory. Thus, the smaller the average end-to-end delay is, the greater the normalized throughput is , which indicates that the network performance is the better.

5.2. Simulation Results and Analysis First, under uniform flow as shown in Fig. 6, when the network load is low, the delay and throughput of three algorithms are very close, but with the improvement of network load, the gap between the three is gradually emerging, and DRM algorithm by virtue of the path selection mechanisms and the efficient utilization of virtual channels shows a greater advantage in the case of a higher degree of network busy.

Single hot spot model shown in Fig. 7, this simulation is set in the vicinity of the large IP blocks, which more reflects the DRM algorithm for the detour to optimize the reflected path distribution of advantages in this model due to Boppana algorithm and Virtual Network algorithm for imaginary letter to underutilization as well as further enhance the gap with the DRM algorithm on the delay and throughput for lack of optimization shortcomings of the path allocation.

Tomado flow pattem as shown in Fig. 8, due to the delay performance and throughput performance of DRM algorithm also is ahead of the Boppana algorithm and Virtual Network algorithm, which is mainly due to the DRM algorithm by virtue of its excellent path selection mechanism and efficiency of the virtual channel utilization. But the DRM algorithm in non-bypass state routing idea is still dimensional sequence routing, thus making a considerable degree of congestion will also appear in the high busy state of the network. But the simplicity of hardware design and implementation of the algorithm makes the algorithm has a very high costeffective.

More need to explain, in NoC design, size and power consumption are one of its design considerations. According to the research, the proportion of the area and power consumption of the buffer in all NoC components are the highest, and fewer buffer means lower power consumption and less chip area. The numbers of virtual channels needed in each algorithm are shown in Table 1. DRM algorithm, compared with the Boppana algorithm and Virtual Network algorithm uses fewer buffers. Boppana algorithm provides four virtual channel requirements, and Virtual Network algorithm provides three virtual channel requirements, and each virtual channel will use a packet size of the cache, the DRM algorithm only requires two virtual channels realized. Therefore, the DRM routing algorithm to achieve low cost and low power NoC application provides a better solution.

6. Conclusion The irregular 2D mesh topology inherits the simple structure, good scalability, ease of implementation and analysis of the changed 2D mesh. At the same time, it better suit the actual situation in a number of practical applications, and the NoC system usually requires integrated a number of different functions, different sizes, different components of the communication needs, and the topology of the rules is not adapted to this type of NoC application, so the irregular 2D Mesh network is applied to irregular NoC system. However, this topology is applied at the same time, and new problems result in the rules of 2D mesh topology routing algorithm, which can not guarantee the connectivity between any two communication nodes in this topology. By reading the relevant literature, the topology structure of the irregular 2D mesh routing algorithm has been proposed for this, but all aspects of its performance and more demand on the virtual channel limit the usefulness of these algorithms. In this paper, DRM algorithm is proposed, in the bypass of large IP blocks, the algorithm introduces a new path options to ensure that irregular IP module orbiting the better route. Effectively reduce the delay, while the algorithm compared with other algorithms uses fewer virtual channels. Each virtual channel will need to allocate a packet size of the memory as a storage space, so less virtual channel means the needs of the smaller storage module, which will effectively reduce the power and area of the NoC system. This paper compared with other algorithms is more practical, more efficient virtual channel utilization and better path to choose to improve the throughput and delay performance, which is verified by simulation.

Acknowledgement The work was supported by the fund of Collaborative Innovation Center of Tongling University.

References [1]. L. Benini, G. De Micheli, Network on chips: a new SoC paradigm, IEEE Computer, Vol. 35, Issue 1, 2002, pp. 70 - 78.

[2]. A. Jantsch, H. Tenhunen, Networks on Chip, Boston, Kluwer Academic Publishers, 2003, pp. 3-18.

[3]. T. Bjerregaard, S. Mahadevan, A Survey of Research and Practices of Network-on-Chip, ACM Computing Surveys, Vol. 38, Issue 1,2006.

[4]. W. J. Dally, B. Towles, Route packets, not wires: Onchip interconnection networks, in Proceedings of Design Automation Conference, 2001, pp. 684-689.

[5]. P. Guerrier, A. Greiner, A Generic Architecture for On-Chip Packet-Switched Interconnections, in Proceedings of the Conference on Design, Automation and Test in Europe, 2000, pp. 250-256.

[6]. Karim F. et al, An Interconnect Architecture for Networking Systems on C hips, IEEE Micro, Vol. 22, Issue 5, October 2002, pp. 36-45.

[7]. M. K. F. Schafer, T. Hollstein, Deadlock-free routing and component placement for irregular mesh-based networks-on-chip, in Proceedings of the IEEE International Conference on Computer-Aided Design, November 2005, pp. 238-245.

[8]. S. Yalamanchili, L. Ni, Interconnection Networks An Engineering Approach, Publishing House of Electronics Industry, 2004, pp. 58-148.

[9]. D. Wu, B. M. Al-Hashimi, M. T. Schmitz, Improving routing efficiency for network on chip through contention-aware input selection, in Proceedings of the Conference on Asia South Pacific Design Automation, 2006.

[10]. L. Schwieber, D. N. Jayasimha, Optimal fully adaptive wormhole routing for meshes, in Proceedings of the Supercomputing'93, November 1993, pp.782-791.

[11]. M. Dehyadgari, M. Nickray, A. Afzali-Kusha, Evaluation of Pseudo Adaptive XY Routing Using an Object Oriented Model for NoC, in Proceedings of the 17lh International Conference on Microelectronics, 2005, pp. 204-208.

[12]. V. Rantala, T. Lehtonen, J. Plosila, Network on Chip Routing Algorithms, Technical Report, No 779, University of Turku, August 2006.

[13]. T. T. Ye, L. Benini, G. De Micheli, Packetization and routing analysis of on-chip multiprocessor networks, Journal of Systems Architecture, 2004, pp. 81-104.

Wang LIPIN, Wu HUI Engineering Technology Research Center of Optoelectronic Technology Appliance, An Hui Province, Tongling, 244000, AnHui, China E-mail: [email protected] Received: 10 October 2013 /Accepted: 22 November 2013 /Published: 30 December 2013 (c) 2013 International Frequency Sensor Association

[ Back To TMCnet.com's Homepage ]