Any opinions, findings, conclusions, or recommendations expressed in this dissertation are those of the authors and do not necessarily reflect the views of UKDiss.com.

Network-on-Chip (NoC) Architecture Total Power Dissipation

Info: 20932 words (84 pages) Dissertation
Published: 11th Dec 2019

Tags: Engineering, Energy

ABSTRACT

Although technology continues to advance, power dissipation remains a limiting problem in the field of NoC. This power is dissipated mainly in the additional elements of the communication system, namely the routers and network interfaces. In this project, we develop data encoding and decoding schemes to reduce the power dissipated by the links of a NoC. The proposed schemes were implemented in Verilog HDL and allow us to save power and energy without significant performance degradation.

In this project, we propose two different schemes to reduce the power in the links of the NoC. Total link power dissipation is reduced by lowering both the coupling switching activity and the normal (self) switching activity through different encoding schemes. In the proposed schemes, care is taken that adjacent bits do not carry opposite values, which reduces the coupling switching activity. Similarly, the bits sent over a given link are encoded so that toggling (a present value opposite to the previous one) on that link is avoided, which reduces the normal switching activity.

CHAPTER 1

INTRODUCTION

1.1. INTRODUCTION

Every technology faces problems such as power dissipation and energy consumption. In VLSI technology, wire densities increase to support ever-smaller transistor geometries, which leads to energy and power problems. As the delay between on-chip units rises, the high latency of cross-chip communication can limit overall performance. On-chip packet-switched interconnects, generally known as Network-on-Chip (NoC) architectures, satisfy the requirement for scalable bandwidth. The basic idea behind NoC-based system implementation comes from traditional large-scale multiprocessors and distributed computing networks.

To suit a typical SoC, the building blocks of the network interconnect, such as the switching logic and the packet definition, should be lightweight so that they yield easily implemented solutions. Adopting such network-like interconnections is another way to overcome these communication limits and the wiring delay of future technologies.

As mentioned, the basic concept of this type of interconnection comes from the evolution of modern computer networks. By applying a network-like communication system that inserts routers between the communicating objects, the required wiring can be shortened. A switch-based interconnect therefore provides scalability and freedom from the restrictions of complex wiring. NoCs will replace SoC buses, following the same path as data communications, once the economics prove that the NoC either reduces SoC manufacturing cost, time to market, time to volume, and design risk, or increases SoC performance. The NoC approach has a clear advantage over traditional buses, most notably in system throughput. Hierarchies of crossbars or multilayer buses have characteristics somewhere between traditional buses and NoCs; however, they still fall far short of the NoC with respect to performance and complexity. The success of a NoC design depends on research into the interfaces between the processing elements of the NoC and the interconnection fabric.

The established interconnection methods of a SoC have several weak points: slow bus response time, scalability problems, and energy and bandwidth limitations. A bus interconnect shared by a large number of components can cause slow interface times, because the bus has to be distributed among all of them.

In addition, the bus interconnect has the drawback that its power consumption is high, because all objects in the communication system are connected to it. Moreover, it is impractical to keep increasing the number of connected elements, because the bandwidth of a bus is limited. The communication subsystem therefore increasingly affects the traditional design objectives, including cost, performance, power dissipation, and energy consumption.

In this project, we propose a low-power data encoding scheme. In general, system-on-chip (SoC) based systems have many disadvantages in terms of power dissipation and clock rate when data is transferred from one on-chip unit to another. A unit operating at a higher clock rate is not well supported by a bus network operating at a lower one. Alternative schemes have been proposed for high-speed data transfer, but they are limited to SoCs. Unlike a bus-based SoC, a network-on-chip (NoC) system offers many advantages for data transfer. The scheme used here for on-chip data transfer is a transition-based encoder: its operation depends on the transitions of the input data, and it can operate at high frequencies. The proposed system yields lower dynamic power dissipation than the existing system because it reduces both the switching activity and the coupling switching activity. Although power dissipation depends on many factors, dynamic power dissipation is the component in which a substantial improvement can reasonably be made.

1.2. EXISTING SYSTEM AND ITS DRAWBACKS

Data encoding techniques are mainly classified into two groups. In the first group, the encoding techniques focus on lowering the power due to the self-switching activity of individual bus lines while ignoring the power dissipation due to their coupling switching activity.

In this group, bus invert (BI) and INC-XOR were proposed for the case in which random data patterns are transmitted over these lines. T0, working-zone encoding, and T0-XOR, on the other hand, were suggested for correlated data patterns. Application-specific approaches have also been proposed. This group of encodings is not suitable for deep submicron technology nodes, where the coupling capacitance makes up a major part of the total interconnect capacitance of the link. The drawbacks of the existing system are:

  • The area of the chip increases
  • Line-to-line spacing also increases
  • Power consumption increases

1.3. PROPOSED SYSTEM

In the proposed system, the main goal is to reduce the power dissipation in the network links. The main causes of this power dissipation are the normal switching activity and the coupling switching activity. To reduce both, we propose data encoding techniques that are applied to the data before it is injected into the network links, so that opposite values (0 and 1) do not appear side by side on two neighboring lines.

Besides designing the data encoding schemes, we also design a simpler decoder circuit.

CHAPTER 2

LITERATURE SURVEY

2.1. INTRODUCTION

Most of the energy in traditional SoCs is dissipated in data buses and long interconnects, due to the dynamic power consumed while charging and discharging internal node capacitances and inter-wire capacitances. Crosstalk is dominated by the inter-wire capacitance during charging and discharging.

Serial links in network-on-chip (NoC) architectures can provide savings in power dissipation, reduced wire area, reduced noise, simpler layout and timing verification, and controlled throughput by adjusting the frequency of the serializer. They also eliminate line drivers and buffers.

The disadvantages of serial communication, such as inter-symbol interference between successive symbols and the need for high-speed operation, can be handled by proper encoding and asynchronous protocols.

As a consequence, the performance of a NoC design relies greatly on the interconnection paradigm. Although the networking technology used in computer networks is already well developed, it is almost impossible to apply it to the chip-level communication environment without modification or reduction. For that reason, many researchers are trying to develop appropriate network architectures for on-chip communication. To be suitable for a NoC architecture, the basic functionality should be lightweight, because the components that implement the NoC must be small enough to serve as basic building blocks of a SoC. To achieve low power, many parameters must be considered, such as clock rate, operating voltage, and power management.

                                     Fig.2.1: Fundamental concept of NOC

As shown in Fig. 2.1, the NI is augmented with an encoder (E) and a decoder (D) block. With the exception of the header flit, the encoder encodes the outgoing flits of the packet so as to minimize the power dissipated by the inter-router point-to-point links that form the routing path of the current packet. The header flit is excluded because it has to be processed by the routers along the routing path. Similarly, all incoming flits at the network interface (with the exception of the header flit) are decoded by the decoder block.

Bus invert (BI) and INC-XOR have been proposed for the case in which random data patterns are transmitted over the network lines. Unfortunately, these encoding schemes are not suitable for deep submicron technology, where the coupling capacitance makes up a major part of the total interconnect capacitance. As a result, the coupling switching activity accounts for a large part of the total power consumption. Because the techniques above ignore the contribution of the coupling switching activity, they are inefficient at reducing the total power consumption in the network lines. The second category of techniques focuses on reducing the power dissipation by reducing the coupling switching, but at the cost of additional control lines. For example, in one such technique the width of the data bus increases from 32 to 55 bits. Increasing the width of the data bus also makes the decoder circuit more difficult to design. Another method works as follows: the data are both odd inverted and even inverted, and the transmission uses whichever inversion reduces the switching activity more. In this project, compared with the existing system, we use a simpler encoder and decoder while achieving a higher reduction in switching activity. In one self-switching technique, the number of transitions from 0 to 1 between two consecutive flits (the flit that has just traversed the link and the one that is about to) is counted. If that number is larger than half the link width, the flit is inverted to reduce the number of 0 to 1 transitions when it is transferred over the link. This type of scheme concentrates on self-switching and disregards coupling switching.
Note that in current silicon technology the coupling capacitance is considerably larger than the self-capacitance, and it should be taken into account in any scheme proposed for link power reduction.
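
The self-switching technique described above (count the 0 to 1 transitions and invert when they exceed half the link width) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation from the cited work: the function names, the list-of-bits flit representation, and the separate `inv` flag are hypothetical, and the extra transition on the inversion line itself is ignored.

```python
def count_rising(prev, cur):
    """Number of lines that go from 0 (previous flit) to 1 (current flit)."""
    return sum(1 for p, c in zip(prev, cur) if p == 0 and c == 1)


def encode_self_switching(prev_sent, flit):
    """Return (sent_flit, inv). The flit is inverted when more than half
    of the lines would otherwise make a 0 -> 1 transition."""
    width = len(flit)
    if count_rising(prev_sent, flit) > width / 2:
        return [1 - b for b in flit], 1
    return list(flit), 0


def decode_self_switching(sent, inv):
    """Undo the inversion at the receiver when the inv flag is set."""
    return [1 - b for b in sent] if inv else list(sent)


# Hypothetical 8-bit example: five lines would rise, so the flit is inverted.
prev = [0, 0, 0, 0, 0, 0, 0, 0]
flit = [1, 1, 1, 1, 1, 0, 0, 0]
sent, inv = encode_self_switching(prev, flit)
recovered = decode_self_switching(sent, inv)
```

In this example the inverted flit causes three rising transitions instead of five. The sketch says nothing about coupling between neighboring lines, which is exactly the limitation noted above.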

2.2. BACKGROUND TECHNOLOGIES

As VLSI technologies continue to scale, wire densities increase to support ever-smaller transistor geometries, and on-chip wires present growing latency and energy problems. In particular, the high latency of cross-chip communication can limit total performance by increasing the delay between on-chip units. The resulting requirement for scalable bandwidth can be satisfied by on-chip packet-switched micro-networks of interconnects, generally known as Network-on-Chip (NoC) architectures. The basic idea is borrowed from traditional large-scale multiprocessors and distributed computing networks. The scalability of NoCs and their support for efficient on-chip communication lead to NoC-based system implementations. To suit typical SoCs or multi-core processors, the basic modules of the network interconnect, such as the switching logic, the routing algorithm, and the packet definition, should be lightweight so that they result in easily implemented solutions. Adopting network-like interconnections, as the NoC architecture does, is another approach to overcoming such communication limits and the wiring delay of future technologies. As mentioned before, the basic concept of this kind of interconnection comes from the evolution of modern computer networking. By applying a network-like communication system that inserts routers and links between the communicating objects, the required wiring can be shortened. NoCs will replace SoC buses, following the same path as data communications, once the economics prove that the NoC either reduces SoC manufacturing cost, time to market, time to volume, and design risk, or increases SoC performance. The NoC approach has a clear advantage over traditional buses, most notably in system throughput.
Hierarchies of crossbars or multilayer buses have characteristics somewhere between traditional buses and NoCs; however, they fall short of the NoC with respect to performance and complexity. The success of a NoC design depends on research into the interfaces between the processing elements of the NoC and the interconnection fabric. The established interconnection methods of a SoC have weak points: slow bus response time, energy limitations, scalability problems, and bandwidth limitations. In addition, the bus interconnect has the defect that its power consumption is high, because all objects in the communication system are connected to it. Moreover, it is impossible to keep increasing the number of connected elements, because the bandwidth of a bus is limited.

2.3. RELATED WORKS

The number of internal features and on-chip peripherals has increased over the years, and chips with 1000 cores are expected in the coming years. The main focus of our project is reducing the power dissipated in the network links, so here we briefly review techniques for reducing link power dissipation. Earlier techniques include shielding, increasing the line-to-line spacing, and inserting repeaters in the data lines. Their disadvantage is a large area and time overhead, and the hardware also becomes more expensive. The next approach is data encoding, which reduces the switching transitions in order to reduce power dissipation. These data encoding schemes fall into two types. The first type focuses on reducing the power due to the self-switching transitions of the bus lines and ignores the power dissipated by coupling switching transitions. In this group, bus invert (BI) and INC-XOR have been proposed.

Alternatively, Gray code, working-zone encoding, and T0-XOR have been proposed for the case of correlated data patterns.

The techniques in this first category are not suitable for deep submicron technology nodes, where the coupling capacitance makes up a major part of the total interconnect capacitance. There, the power due to coupling switching transitions becomes a large fraction of the total link power. The second category focuses on reducing the power dissipated through coupling switching. One method is based on the Odd/Even Bus-Invert technique: if the number of switching transitions exceeds half of the line width, the odd inversion is performed. In another method, the number of transitions from 0 to 1 between two data packets is counted; if this number is larger than half the link width, the inversion is performed, which reduces the number of 0 to 1 transitions when the packets are transferred over the links. A further technique targets the coupling switching: the encoder counts the Type I transitions with a weighting coefficient of one and the Type II transitions with a weighting coefficient of two. If the weighted count is larger than half of the link width, the inversion is performed, reducing the power consumption on the links. Another proposed data encoding technique encodes the bits before they are injected into the network, with the goal of minimizing both the self-switching and the coupling switching in the links, which are the main causes of link power dissipation. It classifies the encoding into three schemes based on the four transition types. Scheme 1 uses odd inversion; Scheme 2 uses both odd inversion and full inversion; and Scheme 3 uses odd, full, and even inversion. With these inversions, the power dissipation on the network-on-chip (NoC) links is reduced.
In our project, we present a Gray encoding technique, which focuses on reducing errors during transmission from transmitter to receiver and on reducing the power dissipation in the links.

The main causes of power dissipation are the normal switching transitions and the coupling switching transitions. A normal switching transition is a change between 0 and 1 on the same network line from one transfer to the next. The value on the line must then change every time a new bit is transferred, so the link consumes more power and the power dissipation is high. A coupling switching transition, in turn, involves two neighboring lines on which the values 0 and 1 are transferred side by side. Extra power is needed to keep these opposite values from disturbing each other, so the power dissipation increases. The basic idea of this project is to encode the data packets before sending them into the network. The encoding ensures that opposite values are not transferred side by side, which reduces the switching activity in the links traversed by the data packets. Self-switching activity and coupling switching activity are together responsible for the link power dissipation. We adopt an end-to-end scheme, whose main advantage comes from the pipelined nature of wormhole switching: the same sequence of flits passes through all the links of the routing path, so an encoding decision made at the source applies to every link along the path. For this purpose, encoder and decoder blocks are added to the NI.

 

2.4. SCHEME 1: ENCODER DESIGN

In this scheme, we focus on minimizing the link power dissipation by reducing the Type-1 and Type-2 coupling transitions between adjacent lines. The self-switching activity of the individual lines appears in the exact power model but is neglected in the simplified inversion condition.

In the network links, there are four types of transitions between a pair of adjacent lines: Type-1, Type-2, Type-3, and Type-4. A Type-1 transition occurs when one of the lines switches (from 0 to 1 or from 1 to 0) while the other remains unchanged. A Type-2 transition occurs when one line switches from 0 to 1 and the other switches from 1 to 0. A Type-3 transition occurs when both lines switch simultaneously in the same direction. A Type-4 transition occurs when neither line changes.
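
As an illustration, the four transition types can be told apart with a small Python helper. The function name and the tuple representation (bit on line j, bit on line j + 1) are assumptions made for this sketch.

```python
def classify_pair(prev_pair, cur_pair):
    """Return the transition type (1 to 4) of two adjacent lines.

    prev_pair and cur_pair hold (bit on line j, bit on line j + 1)
    at times t - 1 and t respectively.
    """
    first_changed = prev_pair[0] != cur_pair[0]
    second_changed = prev_pair[1] != cur_pair[1]
    if first_changed != second_changed:
        return 1  # exactly one line switches
    if not first_changed:
        return 4  # neither line switches
    # Both lines switch: opposite directions give Type 2, same direction Type 3.
    return 2 if cur_pair[0] != cur_pair[1] else 3


examples = {
    "00 -> 01": classify_pair((0, 0), (0, 1)),
    "01 -> 10": classify_pair((0, 1), (1, 0)),
    "00 -> 11": classify_pair((0, 0), (1, 1)),
    "10 -> 10": classify_pair((1, 0), (1, 0)),
}
```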

The effective switched capacitance varies from type to type, and hence the coupling transition activity, Tc, is a weighted sum of the contributions of the different types of coupling transitions. Therefore

$$T_c = L_1 T_1 + L_2 T_2 + L_3 T_3 + L_4 T_4$$

Here, Tj is the average number of Type-j transitions and Lj is the corresponding weight. We use L1 = 1, L2 = 2, and L3 = L4 = 0. For random input data, the occurrence probabilities of Type-1 and Type-2 transitions are 1/2 and 1/8, respectively. The value of L1T1 is therefore larger than that of L2T2, which suggests that minimizing the number of Type-1 transitions may lead to a significant power reduction.

 

$$P = \left[ T_{0\to1}\,(C_s + C_l) + (T_1 + 2T_2)\,C_c \right] V_{dd}^2\, F_{ck}$$

The load capacitance, Cl, can be neglected. Then,

$$P \propto T_{0\to1}\,C_s + (T_1 + 2T_2)\,C_c$$
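
A minimal Python sketch of this simplified model follows, assuming flits are given as lists of bits and power is expressed in units of Cs. The function name and the default ratio Cc/Cs = 4 (the value the text assumes for the odd invert condition) are illustrative choices, not part of the original design.

```python
def link_power_proxy(prev, cur, cc_over_cs=4.0):
    """Relative link power T0->1 * Cs + (T1 + 2 * T2) * Cc, in units of Cs."""
    # Self-switching term: lines that go from 0 to 1.
    t01 = sum(1 for p, c in zip(prev, cur) if p == 0 and c == 1)
    t1 = t2 = 0
    for j in range(len(cur) - 1):
        first_changed = prev[j] != cur[j]
        second_changed = prev[j + 1] != cur[j + 1]
        if first_changed != second_changed:
            t1 += 1  # Type 1: exactly one of the two lines switches
        elif first_changed and cur[j] != cur[j + 1]:
            t2 += 1  # Type 2: both lines switch in opposite directions
    return t01 + (t1 + 2 * t2) * cc_over_cs


# Alternating pattern flipped: three Type-2 pairs dominate the cost.
worst = link_power_proxy([0, 1, 0, 1], [1, 0, 1, 0])
# All lines rise together: only Type-3 pairs, so no coupling cost.
together = link_power_proxy([0, 0, 0, 0], [1, 1, 1, 1])
```

With these inputs the first case costs 26 units and the second only 4, which shows how strongly the coupling term weighs on the total.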

In Scheme 1, we concentrate on reducing the number of Type-1 transitions (by converting them to Type-3 and Type-4 transitions) and Type-2 transitions (by converting them to Type-1 transitions). The scheme compares the current data with the previously encoded data and decides whether inverting the current data will reduce the link power. Only one of two options is used in this scheme: odd inversion or no inversion.

Power Model: If the flit is odd inverted before being transmitted on the network links, the dynamic power on the link is

$$P' \propto T'_{0\to1}\,C_s + \left( L_1 T'_1 + L_2 T'_2 + L_3 T'_3 + L_4 T'_4 \right) C_c$$

 

Here, T'0→1 is the self-transition activity, and T'1, T'2, T'3, and T'4 are the coupling transition activities of Types 1, 2, 3, and 4, respectively, after odd inversion. Table 1 shows, for each transition, the difference between transmitting the flit as it is and transmitting it after odd inversion. The data are organized as follows. The first bit is the value of the generic jth line of the link, and the second bit is the value of its (j + 1)th line. For each partition, the first line gives the values at time t − 1 and the second line gives the values at time t. As Table 1 shows, if the flit is odd inverted, Type-2, Type-3, and Type-4 transitions convert to Type-1 transitions. A Type-1 transition converts to a Type-2, Type-3, or Type-4 transition. In particular, the Type-1 transitions marked T1***, T1*, and T1** in the table convert to Type-2, Type-3, and Type-4 transitions, respectively, which matches the weights they receive in the expression for P' that follows. Also, we have

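$$T'_{0\to1} = T_{0\to0}(\text{odd}) + T_{0\to1}(\text{even})$$

where odd/even refers to the odd/even lines. Therefore,

$$P' \propto \left[ T_{0\to0}(\text{odd}) + T_{0\to1}(\text{even}) \right] C_s + \left[ L_1 (T_2 + T_3 + T_4) + L_2 T_1^{***} + L_3 T_1^{*} + L_4 T_1^{**} \right] C_c$$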

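Thus, if P > P', it is convenient to odd invert the flit before transmission to reduce the link power dissipation. With the weights 1, 2, 0, and 0 for Types 1 to 4, and noting that Cc/Cs = 4, we obtain the following odd invert condition

$$\tfrac{1}{4} T_{0\to1}(\text{odd}) + T_1 + 2T_2 > \tfrac{1}{4} T_{0\to0}(\text{odd}) + T_2 + T_3 + T_4 + 2T_1^{***}$$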

The above inequality decides exactly whether the odd inversion has to be performed. Because the terms T0→1(odd) and T0→0(odd) are weighted by a factor of only 1/4, they can be neglected for link widths greater than 16 bits, which gives the approximate odd invert condition

$$T_1 + 2T_2 > T_2 + T_3 + T_4 + 2T_1^{***}$$

 

Using the approximate odd invert condition lowers the efficiency of the encoding scheme, owing to the error introduced by the approximation, but it simplifies the hardware implementation of the encoder. Now, defining

$$T_x = T_3 + T_4 + T_1^{***}$$

$$T_y = T_2 + T_1 - T_1^{***}$$

the approximate condition becomes

$$T_y > T_x$$

Assuming a link width of w bits, the total number of transitions between adjacent lines is w − 1, and hence

$$T_y + T_x = w - 1$$

$$T_y > \frac{w - 1}{2}$$

This is the condition used to determine whether the odd inversion has to be performed.
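
As a worked example with hypothetical counts (chosen only for illustration), take a link of w = 8 lines, so there are w − 1 = 7 adjacent pairs, with T1 = 3, T2 = 2, T3 = 1, T4 = 1, and T1*** = 1:

$$
\begin{aligned}
T_x &= T_3 + T_4 + T_1^{***} = 1 + 1 + 1 = 3 \\
T_y &= T_2 + T_1 - T_1^{***} = 2 + 3 - 1 = 4 \\
T_x + T_y &= 7 = w - 1 \\
T_y &= 4 > \tfrac{w - 1}{2} = 3.5
\end{aligned}
$$

The condition holds, so the flit would be odd inverted. This agrees with the approximate condition before the substitution: T1 + 2T2 = 7 exceeds T2 + T3 + T4 + 2T1*** = 6.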

Existing Encoding Architecture:

We assume a link width of w bits. If no encoding is performed, the body flits are grouped in w bits by the Network Interface (NI) and sent over the link. In the proposed method, one bit of the link is used as the inversion bit, which indicates whether the flit traveling over the link has been inverted. Consequently, the NI packs the body flits in w − 1 bits. The encoding logic E, which is integrated into the NI, is responsible for deciding whether the inversion should take place and for performing it if needed. To make this decision, the current flit is compared with the previously encoded flit. The current flit, whose w bits consist of the w − 1 payload bits and a "0" bit, is the first input of the encoder, while the previously encoded flit is the second input.

The w − 1 bits of the incoming body flit are denoted Xi, i = 0, 1, ..., w − 2, and the w − 1 bits of the previously encoded body flit are denoted Yi, i = 0, 1, ..., w − 2. The wth bit of the link carries the inversion flag "inv", which shows whether the data was inverted: it is high (inv = 1) if the inversion takes place and low (inv = 0) otherwise. In the encoder logic, each Ty block outputs a 1 if it detects any of the transition types counted in Ty, meaning that odd inverting that pair of bits reduces the link power. The Ty block can be built from a simple circuit. The next stage of the encoder, a majority voter block, determines whether the condition Ty > (w − 1)/2 is satisfied. If it is, the last stage inverts the odd bits. The decoder simply inverts the odd bits of the received flit when the inversion bit is high.

Fig.2.2: Encoder Architecture for Scheme 1
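
The Scheme 1 encoder and decoder described above can be modeled behaviorally in Python. This is a sketch under assumptions that are not stated in the text: lines are indexed from 0, "odd" lines are those with an odd index, the link width w is even, and the inv bit sits on the last line (index w − 1), so odd inversion of the w-bit word sets inv to 1 automatically. All function names are hypothetical.

```python
def transition_type(prev_pair, cur_pair):
    """Type (1 to 4) of the transition on two adjacent lines."""
    first_changed = prev_pair[0] != cur_pair[0]
    second_changed = prev_pair[1] != cur_pair[1]
    if first_changed != second_changed:
        return 1
    if not first_changed:
        return 4
    return 2 if cur_pair[0] != cur_pair[1] else 3


def odd_invert(word):
    """Invert the bits on the odd-indexed lines (assumed convention)."""
    return [bit ^ 1 if i % 2 == 1 else bit for i, bit in enumerate(word)]


def ty_flags(prev_word, word):
    """One flag per adjacent pair: 1 when the pair counts toward Ty, that is
    a Type-2 transition, or a Type-1 transition that odd inversion does not
    turn into Type 2 (Ty = T2 + T1 - T1***)."""
    flipped = odd_invert(word)
    flags = []
    for j in range(len(word) - 1):
        prev_pair = (prev_word[j], prev_word[j + 1])
        before = transition_type(prev_pair, (word[j], word[j + 1]))
        after = transition_type(prev_pair, (flipped[j], flipped[j + 1]))
        flags.append(1 if before == 2 or (before == 1 and after != 2) else 0)
    return flags


def encode(prev_word, payload):
    """Append inv = 0, then odd invert when Ty > (w - 1) / 2 (majority vote)."""
    word = list(payload) + [0]
    assert len(word) % 2 == 0 and len(word) == len(prev_word)
    if sum(ty_flags(prev_word, word)) > (len(word) - 1) / 2:
        word = odd_invert(word)  # also drives the inv line (last bit) high
    return word


def decode(word):
    """Undo the odd inversion when inv is high and strip the inv bit."""
    if word[-1] == 1:
        word = odd_invert(word)
    return word[:-1]


prev_word = [0, 1, 0, 1, 0, 1, 0, 0]      # previously encoded flit, inv = 0
payload = [1, 0, 1, 0, 1, 0, 1]           # w - 1 = 7 payload bits
encoded = encode(prev_word, payload)
decoded = decode(encoded)
```

For this input all seven adjacent pairs count toward Ty, so the flit is odd inverted and sent as all ones; the decoder recovers the original payload from the inv bit alone.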

In Scheme 1, we focus on reducing the number of Type-1 and Type-2 transitions. The scheme compares the current data with the previous data to decide whether odd inversion or no inversion of the current data leads to a reduction in link power.

In Scheme 2, both Type-1 and Type-2 transitions are taken into account when deciding between odd (half) inversion and full inversion, depending on the amount of switching reduction.

 

CHAPTER 3

PROPOSED SYSTEM

3.1. INTRODUCTION

We focus on techniques intended to minimize the power dissipated by the links of the network on chip. Links dissipate power alongside the routers and network interfaces (NIs), and their contribution is expected to increase as technology scales. In particular, we present a set of data encoding schemes that operate at flit level and on an end-to-end basis, which allow us to minimize both the switching activity and the coupling switching activity. The proposed encoding schemes are presented and discussed at the algorithmic level and assessed by means of simulation on synthetic and real traffic scenarios. The analysis takes into account several aspects and metrics of the design of the communication subsystem, including gate area, power dissipation, and energy consumption. The results show that the proposed encoding schemes save power and energy without significant performance degradation and with an area overhead in the NI.

In this project, we present the proposed encoding technique, whose purpose is to reduce power dissipation by lowering the coupling transition activity on the links of the interconnection network. Let us first describe the power model, which contains the different components of the power dissipated by a link. The dynamic power dissipated by the interconnect and its drivers is

$$P = \left[ T_{0\to1}\,(C_s + C_l) + T_c\,C_c \right] V_{dd}^2\, F_{ck}$$

where

T0→1 is the number of 0 to 1 transitions on the bus in two consecutive transmissions

Tc is the number of correlated switchings between physically adjacent lines

Cs is the line-to-substrate capacitance

Cl is the load capacitance

Cc is the coupling capacitance

Vdd is the supply voltage

Fck is the frequency of the clock signal

The coupling transitions between two adjacent lines can be organized into four types: Type-1, Type-2, Type-3, and Type-4. A Type-1 transition occurs when one of the lines switches from zero to one or from one to zero while the other line remains unchanged. A Type-2 transition occurs when one line switches from one to zero while the other switches from zero to one. A Type-3 transition occurs when both lines switch simultaneously in the same direction. A Type-4 transition occurs when neither line changes.

The effective switched capacitance varies from type to type, and hence the coupling transition activity, Tc, is a weighted sum of the contributions of the different types of coupling transitions. Therefore

$$T_c = L_1 T_1 + L_2 T_2 + L_3 T_3 + L_4 T_4$$

Here, Tj is the average number of Type-j transitions and Lj is the corresponding weight. We use L1 = 1, L2 = 2, and L3 = L4 = 0. For random input data, the occurrence probabilities of Type-1 and Type-2 transitions are 1/2 and 1/8, respectively. The value of L1T1 is therefore larger than that of L2T2, which suggests that minimizing the number of Type-1 transitions may lead to a significant power reduction.

$$P = \left[ T_{0\to1}\,(C_s + C_l) + (T_1 + 2T_2)\,C_c \right] V_{dd}^2\, F_{ck}$$

Cl can be neglected, so

$$P \propto T_{0\to1}\,C_s + (T_1 + 2T_2)\,C_c$$

Here, we evaluate the occurrence probability of the different types of transitions. Let flit (t − 1) and flit (t) denote the flit that was previously transferred over the link and the flit that is about to pass through the link, respectively. We consider only two adjacent bits of the physical channel. Sixteen different combinations of these four bits can occur. The first bit is the value of the generic ith line of the link, and the second bit is the value of its (i + 1)th line. The numbers of combinations for Types 1, 2, 3, and 4 are 8, 2, 2, and 4, respectively. For random data, each of the sixteen combinations is equally likely. Therefore, the occurrence probabilities for Types 1, 2, 3, and 4 are 1/2, 1/8, 1/8, and 1/4, respectively.
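
The counts and probabilities above can be checked by enumerating all sixteen combinations. The short Python sketch below does this; the helper name and the dictionary of weights (1, 2, 0, 0 for Types 1 to 4, as given earlier) are written out here for illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def transition_type(prev_pair, cur_pair):
    """Type (1 to 4) of the transition on two adjacent lines."""
    first_changed = prev_pair[0] != cur_pair[0]
    second_changed = prev_pair[1] != cur_pair[1]
    if first_changed != second_changed:
        return 1
    if not first_changed:
        return 4
    return 2 if cur_pair[0] != cur_pair[1] else 3


# Enumerate the two bits at time t - 1 and the two bits at time t.
counts = Counter(
    transition_type((a_prev, b_prev), (a_cur, b_cur))
    for a_prev, b_prev, a_cur, b_cur in product((0, 1), repeat=4)
)
probabilities = {t: Fraction(n, 16) for t, n in counts.items()}

# Expected weighted coupling activity per adjacent pair for random data.
weights = {1: 1, 2: 2, 3: 0, 4: 0}
expected_tc_per_pair = sum(weights[t] * p for t, p in probabilities.items())
```

Under these weights a random pair of adjacent lines contributes 3/4 of a unit of coupling activity on average, of which 1/2 comes from Type-1 transitions and 1/4 from Type-2 transitions.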

Energy consumption and power dissipation are today recognized as among the most important design optimization objectives. In this project, we propose encoding and decoding schemes. In Scheme 3, we exploit the fact that Type-1 transitions behave differently under odd and even inversion, and we choose the inversion that saves more power.

The fundamental idea of the proposed approach is to encode the flits before they are injected into the network, with the aim of minimizing the self-switching activity and the coupling switching activity in the links traversed by the flits. Self-switching activity and coupling switching activity are responsible for the link power dissipation.

Our plan is to convert Type-1 and Type-2 transitions into Type-3 and Type-4 transitions as far as possible, because Type-3 and Type-4 transitions result in less coupling switching and normal switching.

Wormhole flow control, also called wormhole switching or wormhole routing, is a system of simple flow control in computer networking based on known fixed links. It is a subset of the flow control methods called Flit-Buffer Flow Control.

Table 3.1: Impact of odd inversion based on transform of types of transitions

In short, the data are encoded before they are injected into the network links, so that both the self-switching transitions and the coupling switching transitions are minimized on every link the flits traverse.

Modules which are implemented in our methodology are as follows:

  1. Module-C
  2. Module-A
  3. T4_2 module
  4. Decoder for Scheme-2
  5. Scheme-3 Decoder
  6. Ones_count

 SCHEME 2: ENCODER AND DECODER DESIGN

Encoder Scheme 2 makes use of both odd inversion and full inversion. The full inversion operation converts Type-2 transitions into Type-4 transitions. The scheme compares the present data lines with the previously encoded data lines before deciding whether to perform odd inversion, full inversion, or no inversion.

SCHEME2 POWER MODEL:

Let P, P′, and P″ be the power dissipated by the data link when the flit is transmitted with no inversion, odd inversion, and full inversion, respectively. The odd inversion results in power reduction when P′ < P″ and P′ < P. The power P″ is given by

$$P'' \propto T_1 + 2T_4^{**}$$

Neglecting the self-switching activity, the condition P′ < P″ becomes

$$T_2 + T_3 + T_4 + 2T_1^{***} < T_1 + 2T_4^{**}$$

which can be rewritten as

$$2\left(T_2 - T_4^{**}\right) < 2T_y - w + 1$$

The odd inversion condition is therefore

$$2\left(T_2 - T_4^{**}\right) < 2T_y - w + 1, \qquad T_y > \frac{w-1}{2}$$

Similarly, full inversion reduces the power relative to no inversion (P″ < P) when

$$T_2 > T_4^{**}$$

Therefore, the full inversion condition is

$$2\left(T_2 - T_4^{**}\right) > 2T_y - w + 1, \qquad T_2 > T_4^{**}$$
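The two Scheme 2 conditions can be turned into a small decision routine. This is only a sketch of the comparison logic, not the hardware of Module A; the function name and the string return values are illustrative, and the arguments are assumed to be the counts T_y, T_2, and T_4** over the w − 1 adjacent line pairs.

```python
def scheme2_action(ty, t2, t4ss, w):
    """Pick 'odd', 'full' or 'none' from the Scheme 2 conditions.

    ty, t2, t4ss: counts T_y, T_2 and T_4** for the current flit.
    w: link width in bits.
    """
    lhs = 2 * (t2 - t4ss)
    rhs = 2 * ty - w + 1
    # Odd inversion: 2(T2 - T4**) < 2Ty - w + 1 and Ty > (w - 1)/2.
    # The second test is written as 2*Ty > w - 1 to stay in integers.
    if lhs < rhs and 2 * ty > w - 1:
        return "odd"
    # Full inversion: 2(T2 - T4**) > 2Ty - w + 1 and T2 > T4**.
    if lhs > rhs and t2 > t4ss:
        return "full"
    return "none"

print(scheme2_action(ty=6, t2=1, t4ss=0, w=8))
print(scheme2_action(ty=2, t2=5, t4ss=0, w=8))
print(scheme2_action(ty=1, t2=0, t4ss=3, w=8))
```

The first inequality of each condition is strict in opposite directions, so the two branches cannot both be true for the same flit.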

Proposed Encoding Architecture:

The operating principle of this encoder is analogous to that of the encoder implementing Scheme I. The encoding architecture, which is based on the odd invert and full invert conditions, is shown in Fig. 2.3. The last bit (bit w) is the inversion flag, denoted “inv”. It shows whether the data was inverted: if inversion takes place the bit is high (inv=1), otherwise it is low (inv=0). In this encoder, in addition to the Ty block of the Scheme I encoder, we use the T2 and T4** blocks. These blocks indicate whether an inversion based on the transition types T2 and T4** should take place for link power reduction. The next stage of the encoder is formed by a group of 1s blocks, which count the number of 1s in their inputs. The output of each of these blocks has a width of log2 w. The first (topmost) 1s block gives the number of transitions for which odd inversion of the bit pair reduces the link power. The second 1s block gives the number of transitions for which full inversion of the bit pair reduces the link power. Finally, the bottom 1s block gives the number of transitions for which full inversion of the bit pair increases the link power.

Fig.2.3: Encoder Architecture For Scheme 2

Based on the number of 1s for each transition type, Module A decides whether an odd or a full invert action should be performed for power reduction.

In our project, we design the internal modules of the Scheme 2 encoder: Module-A, T2, Ty, T4**, and the ones counter. The design diagrams of these modules are shown below.

Fig.2.4: Ty module

Fig.2.5: T2 Module

Fig.2.6: T4** Module

Fig.2.7: Ones Module

Fig.2.8: Module-A

This module selects the odd, even, full, or no inversion according to the outputs “10,” “01,” “11,” or “00,” respectively. The outputs “01,” “11,” and “10” show that the even, full, and odd inversion conditions, respectively, are satisfied. In this project, Module C was designed based on the conditions given.

The output of the encoder is given as input to the decoder. The encoder and decoder are designed mainly to reduce the normal switching activity and the coupling switching activity in the links of the network. The decoder determines which action was taken in the encoder. If two inversion bits were used, the overhead of the decoder hardware could be substantially reduced.

The internal diagram of the Scheme 2 decoder is shown in the figure below.

Fig.2.9: Decoder of Scheme II

                          

 

 

 

3.2. SCHEME 3: ENCODER AND DECODER DESIGN

In the proposed encoding Scheme 3, we perform even inversion in addition to the operations of Scheme II. The reason for including even inversion is that odd inversion converts some Type-1 transitions (T1***) into Type-2 transitions, whereas under even inversion it is a different subset (T1*) that becomes Type 2, as can be observed from the table given below. If the flit is even inverted, the transitions indicated as T1**/T1*** in the table are converted to Type-4/Type-3 transitions. Therefore, there may be a chance to reduce the link power by even inversion as well. The scheme compares the present data lines with the previously encoded data lines before performing odd inversion, full inversion, even inversion, or no inversion.

Table 3.2: Impact of even inversion on transition types changes

 

 

 

3.2.1. LOGIC DIAGRAM

The encoder architecture of Scheme 3 is shown in the figure below.

Fig.3.2.1: Encoder architecture of the scheme3

Its operation is similar to that of the encoder implementing Scheme II. The encoding architecture, which is based on the odd invert, full invert, and even invert conditions, is shown in Fig.3.2.1.

The last bit (bit w) is the inversion flag, denoted “inv”. It shows whether the data was inverted: if inversion takes place the bit is high (inv=1), otherwise it is low (inv=0). The first stage of the encoder determines the transition types, while the second stage is formed by a group of 1s blocks that count the number of ones in their inputs. In the first stage, we have added the Te blocks, which determine whether any of the transition types T2, T1**, and T1*** is detected for each pair of their input bits. For these transition types, the even invert action yields link power reduction. Again, we have four Ones blocks to count the detected transitions for the Ty, Te, T2, and T4** blocks.

The output of the top 1s block gives the number of transitions for which odd inversion of the bit pair reduces the link power. The output of the second 1s block gives the number of transitions for which even inversion of the bit pair reduces the link power. The third 1s block gives the number of transitions for which full inversion of the bit pair reduces the link power. Finally, the bottom 1s block gives the number of transitions for which full inversion of the bit pair increases the link power.

In Scheme 3, the Te blocks handle even inversion: they flag the bit pairs for which even inversion reduces the switching activity.

The results of the Ones blocks are applied as inputs to Module C. Based on the number of 1s for each transition type, this module decides whether the odd, even, full, or no invert action, corresponding to the outputs “10”, “01”, “11”, or “00”, respectively, should be performed for power reduction.

In this project, we design the internal modules of the Scheme 3 encoder: Module-C, T2, Ty, T4**, and the ones counter. The design diagrams of these modules are shown below.

Power Model: Let P, P′, P″, and P‴ be the power dissipated by the data link when the data bits are sent with no inversion, odd inversion, full inversion, and even inversion, respectively. Following the same reasoning as for Scheme I, the condition P‴ < P can be written as

$$T_1 + 2T_2 > T_2 + T_3 + T_4 + 2T_1^{*}$$

Defining

$$T_e = T_2 + T_1 - T_1^{*}$$

the condition P‴ < P becomes

$$T_e > \frac{w-1}{2}$$

The condition P‴ < P′ is

$$T_2 + T_3 + T_4 + 2T_1^{*} < T_2 + T_3 + T_4 + 2T_1^{***}$$

which reduces to

$$T_e > T_y$$

The condition P‴ < P″ is

$$T_2 + T_3 + T_4 + 2T_1^{*} < T_1 + 2T_4^{**}$$

With

$$T_r = T_3 + T_4 + T_1^{*}, \qquad T_e = T_2 + T_1 - T_1^{*}$$

and a link width of w bits, the total number of transitions between neighboring lines is w − 1, so that

$$T_e + T_r = w - 1$$

and the condition P‴ < P″ becomes

$$2\left(T_2 - T_4^{**}\right) < 2T_e - w + 1$$

Even inversion reduces the power when P‴ < P, P‴ < P′, and P‴ < P″:

$$T_e > \frac{w-1}{2}, \qquad T_e > T_y, \qquad 2\left(T_2 - T_4^{**}\right) < 2T_e - w + 1$$

Full inversion reduces the power when P″ < P, P″ < P‴, and P″ < P′:

$$2\left(T_2 - T_4^{**}\right) > 2T_y - w + 1, \qquad T_2 > T_4^{**}, \qquad 2\left(T_2 - T_4^{**}\right) > 2T_e - w + 1$$

In the same way, odd inversion reduces the power when P′ < P, P′ < P‴, and P′ < P″:

$$2\left(T_2 - T_4^{**}\right) < 2T_y - w + 1, \qquad T_y > \frac{w-1}{2}, \qquad T_e < T_y$$
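The three sets of conditions can be collected into one selection routine that returns the two-bit code used for Module C in this chapter (“10” odd, “01” even, “11” full, “00” none). This is a sketch of the decision logic only, under the assumption that the four counts are already available from the Ones blocks; the function name is illustrative, and the full-inversion test uses the P″ < P‴ direction of the inequality.

```python
def scheme3_action(ty, te, t2, t4ss, w):
    """Return the two-bit action code for Scheme 3.

    ty, te, t2, t4ss: counts T_y, T_e, T_2 and T_4** for the current flit.
    w: link width in bits.
    """
    lhs = 2 * (t2 - t4ss)
    odd_rhs = 2 * ty - w + 1
    even_rhs = 2 * te - w + 1
    # Odd: P' < P'', P' < P, P' < P'''
    if lhs < odd_rhs and 2 * ty > w - 1 and te < ty:
        return "10"
    # Even: P''' < P'', P''' < P, P''' < P'
    if lhs < even_rhs and 2 * te > w - 1 and te > ty:
        return "01"
    # Full: P'' < P', P'' < P''', P'' < P
    if lhs > odd_rhs and lhs > even_rhs and t2 > t4ss:
        return "11"
    return "00"

print(scheme3_action(ty=6, te=3, t2=1, t4ss=0, w=8))
print(scheme3_action(ty=3, te=6, t2=1, t4ss=0, w=8))
print(scheme3_action(ty=2, te=2, t2=5, t4ss=0, w=8))
```

Because all comparisons are strict, a tie such as T_e = T_y selects neither odd nor even inversion in this sketch; how the hardware resolves ties is not stated in the text.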

Proposed Encoding Architecture:

 

The functionality of the encoder is analogous to that of the encoders implemented in Schemes I and II. The proposed encoding architecture, which is based on the even invert condition, the full invert condition, and the odd invert condition, is shown in Fig. 3.2.1.

The last bit (bit w) is the inversion flag, denoted “inv”. It shows whether the data was inverted: if inversion takes place the bit is high (inv=1), otherwise it is low (inv=0). In the first stage, we have incorporated the Te blocks, which determine whether any of the transition types T2, T1**, and T1*** is detected for each pair of their input bits. For these transition types, the even invert action yields link power reduction. Again, we have four Ones blocks to count the detected transitions for the Ty, Te, T2, and T4** blocks. The decoder for Scheme III may be designed with the same procedure used for the Scheme II decoder.


3.3. ENCODER MODULES

3.3.1. TY MODULE

Fig.3.3.1: Ty Module

This is the internal diagram of the Ty module. The module uses AND, NOT, and OR gates to produce the odd inversion value. It takes four inputs: X0, X1, Y0, and Y1. The inputs are applied to the AND gates, and the outputs of the AND gates are applied to the inputs of the OR gate. The output of the OR gate is the Ty value. One such Ty module is used for each pair of adjacent lines.

 

 

 

 

 

 

3.3.2. T2 MODULE

Fig.3.3.2: T2 Module

This is the internal diagram of the T2 module. The module uses AND, NOT, and OR gates to produce the full inversion value. It takes four inputs: X0, X1, Y0, and Y1. The inputs are applied to the AND gates, and the outputs of the AND gates are applied to the inputs of the OR gate. The output of the OR gate is the T2 output. One such T2 module is used for each pair of adjacent lines.

 

 

 

 

 

 

 

3.3.3. T4** MODULE

Fig.3.3.3: T4** Module

This is the internal diagram of the T4** module. The module uses AND, NOT, and OR gates to produce the full inversion value. It takes four inputs: X0, X1, Y0, and Y1. The inputs are applied to the AND gates, and the outputs of the AND gates are applied to the inputs of the OR gate. The output of the OR gate is the T4** output. One such T4** module is used for each pair of adjacent lines.

 

3.3.4. Te MODULE

Fig.3.3.4: Te Module

This is the internal diagram of the Te module. The module uses AND, NOT, and OR gates to produce the even inversion value. It takes four inputs: X0, X1, Y0, and Y1. The inputs are applied to the AND gates, and the outputs of the AND gates are applied to the inputs of the OR gate. The output of the OR gate is the Te output. One such Te module is used for each pair of adjacent lines.


The internal diagram of the Scheme 3 decoder is shown in the figure below.

Fig.3.3.5: Decoder of Scheme III

A. OR GATE

The OR gate is a digital logic gate that implements logical disjunction; it behaves according to the truth table below. A HIGH output (1) results if one or both of the inputs to the gate are HIGH (1). If neither input is HIGH, a LOW output (0) results. In another sense, the OR function effectively finds the maximum of two binary digits, just as the complementary AND function finds the minimum.

Input      Output
A   B      A OR B
0   0      0
0   1      1
1   0      1
1   1      1

Table 3.3.1: OR Gate

There are three symbols for OR gates: the American (ANSI or ‘military’) symbol, the IEC (‘European’ or ‘rectangular’) symbol, and the deprecated DIN symbol.

Fig.3.3.5: OR Gate

 

B. AND GATE

The AND gate is a basic digital logic gate that implements logical conjunction; it behaves according to the truth table below. A HIGH output (1) results only if both inputs to the AND gate are HIGH (1). If neither or only one input to the AND gate is HIGH, a LOW output results. In another sense, the AND function effectively finds the minimum of two binary digits, just as the OR function finds the maximum. Therefore, the output is always 0 except when all the inputs are 1s.

Input      Output
A   B      A AND B
0   0      0
0   1      0
1   0      0
1   1      1

Table 3.3.2: AND Gate

There are three symbols for AND gates: the American (ANSI or ‘military’) symbol, the IEC (‘European’ or ‘rectangular’) symbol, and the deprecated DIN symbol.

Fig.3.3.6: AND Gate

The AND gate with inputs A and B and output C implements the logical expression C = A · B.

 

C. NOT GATE

The NOT gate (also often called an inverter) is a logic gate that takes one input signal. In logic there are usually two states, 0 and 1. The gate sends 1 as output if it receives 0 as input, and sends 0 as output if it receives 1 as input.

Generally, below 0.5V is 0, and 4–5V is 1.

The inverter can be made of a discrete transistor with other components, or several inverters may be packaged in an integrated circuit.

Input A    Output NOT A
0          1
1          0

Table 3.3.3: NOT Gate

There are three symbols for the NOT gate: the MIL/ANSI symbol, the IEC symbol, and the DIN symbol.

Fig.3.3.7: NOT Gate

The “bubble” (o) at the end of the NOT gate symbol denotes a signal inversion (complementation) of the output signal. The bubble can also be present at a gate's input to indicate an active-LOW input. This inversion of the input signal is not restricted to the NOT gate; it can be used on any digital circuit or gate, and the operation of inversion is exactly the same whether on the input or the output terminal. The easiest way is to think of the bubble as simply an inverter.

Fig.3.3.8: Complementation Gate

 

 

 

 

 

 

3.4. ONE MODULE

Fig.3.4.1: Ones Module

The Ones module is built from adders, each performing the full-adder operation. The output values of the Ty, T2, and T4** modules are the inputs of the Ones module. The Ones module adds them and outputs the number of ones among its inputs. This output is applied to the input of Module C.
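As an illustration of counting ones with full adders, the sketch below accumulates each input bit into a small binary counter whose stages are full adders. It is a behavioural model only and makes no claim about the adder arrangement in the actual Ones module; `ones_count` and the list-of-bits input are illustrative choices.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ones_count(bits):
    """Count the 1s in `bits` using a chain of full adders (LSB first)."""
    width = max(1, len(bits).bit_length())   # enough bits to hold len(bits)
    total = [0] * width
    for bit in bits:
        carry = bit                          # add the 1-bit value to the total
        for i in range(width):
            total[i], carry = full_adder(total[i], 0, carry)
    return sum(b << i for i, b in enumerate(total))

print(ones_count([1, 0, 1, 1, 0, 0, 1]))
```

For a link of width w there are w − 1 detector outputs to count, so the result fits in roughly log2 w bits, in line with the output width stated for the 1s blocks.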

RIPPLE-CARRY ADDER

Fig.3.4.1: Ripple-Carry Adder (4-bit adder with logic gates shown)

It is possible to create a logical circuit using multiple full adders to add N-bit numbers. Each full adder takes as its Cin the Cout of the previous adder. This kind of adder is called a ripple-carry adder, since each carry bit “ripples” to the next full adder. Note that the first (and only the first) full adder may be replaced by a half adder (under the assumption that Cin = 0).

The layout of a ripple-carry adder is simple, which allows for fast design time; however, the ripple-carry adder is relatively slow, since each full adder must wait for the carry bit to be calculated by the previous full adder. The gate delay can easily be calculated by inspection of the full adder circuit. Each full adder requires three levels of logic. In a 32-bit ripple-carry adder there are 32 full adders, so the critical path (worst case) delay is 2 (from input to carry in the first adder) + 31 * 3 (for carry propagation in the later adders) = 95 gate delays. The general equation for the worst-case delay of an n-bit adder is

$$T_{CRA}(n) = T_{HA} + (n-1) \cdot T_c + T_s = T_{FA} + (n-1) \cdot T_c = 6D + (n-1) \cdot 2D = (n+2) \cdot 2D$$

The delay from bit position 0 to the carry-out is a little different:

$$T_{CRA_{[0:c_{out}]}} = T_{HA} + n \cdot T_c = 3D + n \cdot 2D$$

The carry-in must travel through n carry-generator blocks to have an effect on the carry-out.

$$T_{CRA_{[c_0:c_n]}}(n) = n \cdot T_c = n \cdot 2D$$

A design with alternating carry polarities and optimized AND-OR-Invert gates can be about twice as fast.
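The rippling of the carry can be seen in a short behavioural model. The sketch below chains n one-bit full adders, each taking the carry-out of the previous stage as its carry-in; the function names and the integer-based interface are illustrative.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_carry_add(x, y, n, cin=0):
    """Add two n-bit integers stage by stage; returns (n-bit sum, carry_out)."""
    carry = cin
    result = 0
    for i in range(n):                       # stage i waits for stage i-1's carry
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry

print(ripple_carry_add(5, 3, 4))
print(ripple_carry_add(9, 7, 4))
```

The loop is sequential for the same reason the hardware is slow: stage i cannot produce its final outputs before stage i − 1 has produced its carry.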

Fig.3.4.2: Carry Look Ahead (4-bit adder with carry lookahead)

To reduce the computation time, engineers devised faster ways to add two binary numbers by using carry-lookahead adders. They work by creating two signals (P and G) for each bit position, based on whether a carry is propagated through from a less significant bit position (at least one input is a ‘1’), generated in that bit position (both inputs are ‘1’), or killed in that bit position (both inputs are ‘0’). In most cases, P is simply the sum output of a half adder and G is the carry output of the same adder. After P and G are generated, the carries for every bit position are created. Some advanced carry-lookahead architectures are the Manchester carry chain, the Brent–Kung adder, and the Kogge–Stone adder.
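A behavioural sketch of the lookahead idea follows. Here P is taken as the half-adder sum (XOR) and G as the half-adder carry (AND), and every carry is written as a flattened sum of products of P, G, and the carry-in, so no carry depends on another carry. The function names are illustrative, and real designs group the terms into blocks rather than flattening all n bits.

```python
def carry_into(i, p, g, cin):
    """Carry into bit i, expressed only in terms of p, g and cin."""
    carry = cin
    for k in range(i):                 # term cin * p[0] * ... * p[i-1]
        carry &= p[k]
    for j in range(i):                 # terms g[j] * p[j+1] * ... * p[i-1]
        term = g[j]
        for k in range(j + 1, i):
            term &= p[k]
        carry |= term
    return carry

def carry_lookahead_add(x, y, n, cin=0):
    """Add two n-bit integers; returns (n-bit sum, carry_out)."""
    p = [((x >> i) ^ (y >> i)) & 1 for i in range(n)]   # propagate
    g = [((x >> i) & (y >> i)) & 1 for i in range(n)]   # generate
    carries = [carry_into(i, p, g, cin) for i in range(n + 1)]
    total = sum((p[i] ^ carries[i]) << i for i in range(n))
    return total, carries[n]

print(carry_lookahead_add(11, 6, 4))
```

Each call to `carry_into` corresponds to one two-level AND-OR expression in hardware, which is why all carries can be available at the same time.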

Some other multi-bit adder architectures break the adder into blocks. It is possible to vary the length of these blocks based on the propagation delay of the circuits to optimize computation time. These block-based adders include the carry-skip (or carry-bypass) adder, which determines P and G values for each block rather than each bit, and the carry-select adder, which generates the sum and carry values for either possible carry input (0 or 1) to the block, using multiplexers to select the appropriate result when the carry bit is known.

Other adder designs include the carry-select adder, conditional sum adder, carry-skip adder, and carry-complete adder.

FULL ADDER

The full-adder circuit adds three one-bit binary numbers and outputs two one-bit binary numbers, a sum (S) and a carry (C1). The full adder is usually a component in a cascade of adders, which add 8-, 16-, 32-bit, etc. binary numbers. The carry input of the full-adder circuit comes from the carry output of the circuit “above” it in the cascade. The carry output of the full adder is fed to another full adder “below” it in the cascade.

On closer inspection, the full adder is simply two half adders joined by an OR gate.
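The construction from two half adders and an OR gate can be written out directly; the function names below are illustrative.

```python
def half_adder(a, b):
    """Returns (sum, carry) of two one-bit inputs."""
    return a ^ b, a & b

def full_adder(a, b, cin):
    """Full adder built from two half adders joined by an OR gate."""
    s1, c1 = half_adder(a, b)        # first half adder: a + b
    s, c2 = half_adder(s1, cin)      # second half adder: partial sum + carry-in
    return s, c1 | c2                # at most one of c1, c2 can be 1

for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            print(a, b, cin, full_adder(a, b, cin))
```

The OR gate is sufficient because the two half-adder carries are never 1 at the same time.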

3.5. MODULE C

This module determines whether the odd, even, full, or no invert action, corresponding to the outputs “10,” “01,” “11,” or “00,” respectively, should be performed. Based on the number of 1s for each transition type, Module C decides whether an odd, full, or even invert action should be performed for power reduction.

Fig.3.5.1: Module-C

The internal diagram of Module C uses multiplexers (MUX), subtractors, right-shift and left-shift blocks, and an AND gate. The outputs of the Ones modules are the inputs of Module C. Each block performs its own operation: the subtractors perform subtraction, the right-shift and left-shift blocks perform the shifting operations, the AND gate performs the AND operation, and the multiplexers select which of their input lines is passed on.

The outputs “01,” “11,” and “10” show that the even, full, and odd inversion conditions, respectively, are satisfied. In this project, Module C was designed based on the conditions given. Finally, we get the output of the encoder scheme.

  1. MULTIPLEXERS

A multiplexer is a special type of combinational circuit. It selects one of its data inputs and routes it to the output; the selection is made by the select inputs. Its enable input is generally active low, which means that the multiplexer performs the required operation when the enable is low.

Fig.3.5.2: Multiplexer

Multiplexers come in several variations:

2 : 1 multiplexer

4 : 1 multiplexer

8 : 1 multiplexer

16 : 1 multiplexer

32 : 1 multiplexer
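The larger variants can be composed from the smallest one. The sketch below models a 2 : 1 multiplexer with gates and builds a 4 : 1 multiplexer from three of them; the names `mux2` and `mux4` are illustrative, and the enable input is left out for brevity.

```python
def mux2(d0, d1, s):
    """2 : 1 multiplexer: output d0 when s == 0, d1 when s == 1."""
    return (d0 & (1 - s)) | (d1 & s)

def mux4(d, s1, s0):
    """4 : 1 multiplexer from three 2 : 1 multiplexers.

    d is a sequence of four data bits; (s1, s0) is the select code.
    """
    low = mux2(d[0], d[1], s0)       # selects among inputs 0 and 1
    high = mux2(d[2], d[3], s0)      # selects among inputs 2 and 3
    return mux2(low, high, s1)

print(mux4([0, 1, 1, 0], 1, 0))
```

The same tree construction extends to 8 : 1, 16 : 1, and 32 : 1 multiplexers by adding one select bit per level.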

  2. HALF SUBTRACTORS

A half subtractor is a combinational circuit with two inputs and two outputs (difference and borrow). It produces the difference between the two binary bits at the input and also produces an output (borrow) to indicate whether a 1 has been borrowed. In the subtraction (A-B), A is called the minuend bit and B is called the subtrahend bit.
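In gate terms, the difference is the XOR of the two inputs and the borrow is raised only when 0 − 1 is computed. A minimal sketch, with an illustrative function name:

```python
def half_subtractor(a, b):
    """Returns (difference, borrow) for the one-bit subtraction a - b."""
    difference = a ^ b
    borrow = (1 - a) & b             # NOT a AND b: borrow only for 0 - 1
    return difference, borrow

for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_subtractor(a, b))
```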

  3. LEFT SHIFT

These instructions shift the operand word or byte bit by bit to the left and insert zeros in the newly introduced least significant bits. The operand may be a register or a memory location but cannot be immediate data. All flags are affected depending upon the result. Note that the shift operation is through the carry flag.

BIT POSITION     CF 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
OPERAND              1  0  1  0  1  1  0  0  1  0  1  0  0  1  0  1
SHL RESULT 1ST    1  0  1  0  1  1  0  0  1  0  1  0  0  1  0  1  0
SHL RESULT 2ND    0  1  0  1  1  0  0  1  0  1  0  0  1  0  1  0  0

  4. RIGHT SHIFT

This instruction performs a bitwise right shift on the operand word or byte, which may reside in a register or a memory location, by the count specified in the instruction, and inserts zeros in the shifted positions. The result is stored in the destination operand. The instruction shifts the operand through the carry flag.

BIT POSITION     15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0 CF
OPERAND           1  0  1  0  1  1  0  0  1  0  1  0  0  1  0  1
COUNT = 1         0  1  0  1  0  1  1  0  0  1  0  1  0  0  1  0  1
COUNT = 2         0  0  1  0  1  0  1  1  0  0  1  0  1  0  0  1  0
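The two shift tables use the same 16-bit operand, 1010 1100 1010 0101 (0xACA5). The sketch below models both shifts one bit at a time, with the bit that leaves the word going to the carry flag. The function names are illustrative, and returning CF = 0 for a zero count is a simplification of this model.

```python
MASK16 = 0xFFFF

def shl16(value, count):
    """Logical left shift of a 16-bit word; returns (result, carry_flag)."""
    cf = 0
    for _ in range(count):
        cf = (value >> 15) & 1           # MSB moves into the carry flag
        value = (value << 1) & MASK16    # zero enters at the LSB
    return value, cf

def shr16(value, count):
    """Logical right shift of a 16-bit word; returns (result, carry_flag)."""
    cf = 0
    for _ in range(count):
        cf = value & 1                   # LSB moves into the carry flag
        value >>= 1                      # zero enters at the MSB
    return value, cf

for count in (1, 2):
    left, cf_left = shl16(0xACA5, count)
    right, cf_right = shr16(0xACA5, count)
    print(count, format(left, "016b"), cf_left, format(right, "016b"), cf_right)
```

The printed bit patterns can be compared row by row with the SHL and COUNT rows of the two tables.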

3.6. DECODER

The output of the encoder is given as input to the decoder. The encoder and decoder are designed mainly to reduce the normal switching activity and the coupling switching activity in the links of the network on chip. The decoder determines which action was taken in the encoder. If two inversion bits were used, the overhead of the decoder hardware could be substantially reduced. The internal diagram of the Scheme 3 decoder is shown in the figure below.

Fig.3.6.1: Decoder of Scheme 3

The decoder takes the output of the encoder as its input and performs the decoding operation, recovering the original flit from data that was encoded to reduce the normal switching activity and the coupling switching activity in the links of the network on chip.
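The case mentioned above, in which two inversion bits accompany the flit, is the simplest to sketch: the decoder reads the two-bit code and re-applies the same inversion, because each inversion is its own inverse. The code values follow the Module C outputs (“10” odd, “01” even, “11” full, “00” none). Treating bit 0 (the least significant bit) as an even line, and the function names themselves, are assumptions made only for this illustration.

```python
def inversion_mask(code, w):
    """Bit mask of the lines inverted for a two-bit action code."""
    odd = sum(1 << i for i in range(1, w, 2))    # lines 1, 3, 5, ...
    even = sum(1 << i for i in range(0, w, 2))   # lines 0, 2, 4, ...
    return {"00": 0, "10": odd, "01": even, "11": odd | even}[code]

def encode(flit, code, w):
    """Apply the selected inversion to a w-bit flit."""
    return flit ^ inversion_mask(code, w)

def decode(encoded, code, w):
    """Undo the inversion: XOR with the same mask restores the flit."""
    return encoded ^ inversion_mask(code, w)

flit = 0b0110
for code in ("00", "10", "01", "11"):
    sent = encode(flit, code, 4)
    print(code, format(sent, "04b"), format(decode(sent, code, 4), "04b"))
```

With a single inv bit, as in the architecture figures, the decoder would instead have to work out which inversion was applied, which is the extra hardware overhead the text refers to.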

A decoder is a combinational circuit. It has n inputs and at most m = 2^n outputs. A decoder is identical to a demultiplexer without any data input. It performs operations that are exactly the opposite of those of an encoder.

Fig.3.6.2: Decoder

A binary code of n bits can represent up to 2^n distinct elements of information. A decoder is a combinational circuit that converts binary information from n input lines to a maximum of 2^n output lines.

If the n-bit coded information has unused combinations, the decoder may have fewer than 2^n outputs.

Fig.3.6.3: Decoding Of Address Lines

The decoder above can function as a one-to-four-line demultiplexer when E is taken as a data input line and A and B are taken as the selection inputs.

The single input variable E has a path to all four outputs, but the input information is directed to only one of the output lines, as specified by the binary combination of the two selection lines A and B.

This feature can be verified from the truth table of the circuit. For example, if the selection lines A B = 1 0, output D2 will be the same as the input value E, while all other outputs are maintained at 1. Because decoder and demultiplexer operations are obtained from the same circuit, a decoder with an enable input is referred to as a decoder–demultiplexer.
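The decoder–demultiplexer behaviour described above can be modelled in a few lines. The sketch assumes active-low outputs and an active-low enable, which is inferred from the statement that the unselected outputs are maintained at 1; the function name is illustrative.

```python
def decoder_2to4(a, b, e):
    """2-to-4 line decoder with enable, modelled with active-low outputs.

    The output selected by (a, b) follows e; all other outputs stay at 1.
    Used as a demultiplexer, e is the data input and (a, b) the selection.
    """
    selected = 2 * a + b
    return [e if i == selected else 1 for i in range(4)]

print(decoder_2to4(1, 0, 0))   # D2 follows E = 0
print(decoder_2to4(1, 0, 1))   # D2 follows E = 1
```

With A B = 1 0, only D2 changes when E changes, which is the behaviour the truth table of the circuit is said to show.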


CHAPTER 4

INTRODUCTION TO VLSI

4.1. INTRODUCTION TO VLSI

VLSI stands for “Very Large Scale Integration”. It refers to a class of integrated circuits (ICs): a typical VLSI IC includes millions of active devices. Memories, computers, and signal processors are typical functions of VLSI. Using semiconductor back-end process technology, working circuits can be manufactured according to the design specifications and requirements. Many design styles and environments are available for building such systems. In integrated circuit design, a working chip is made by layering conducting and non-conducting materials on top of each other, as laid down in the design specification.

When a chip is designed for only one specific use, it is known as an application-specific integrated circuit (ASIC). Printed-circuit (PC) design, in addition, aggregates the bulk of the electronic activity into standard IC packages, the position and interconnection of which are essential to the final circuit.

Printed circuitry is slower, less compact, and more expensive than integrated circuitry. The design of these electronic circuits can be carried out at many different levels of refinement, from the most detailed layout to the most abstract architectures. Computers are increasingly used to aid this design at each step.

Thus the term computer-aided design (CAD) is an apt description of this modern approach and seems broader in scope than computer-aided engineering (CAE).

4.2. APPLICATION AREAS OF VLSI

PLAs

Combinational circuit elements are an important part of any digital design. Three common methods of implementing them are random logic, read-only memory (ROM), and the programmable logic array (PLA).

A PLA can also be used with part of its outputs fed back to the inputs and clocking on both sides. Normally, for high-speed applications, the PLA is not implemented as two NOR arrays.

GATE-ARRAYS

The gate-array is a popular technique used to design IC chips. Like the PLA, it must be customized to yield the final circuit, but its interconnection options are more flexible. Gate-arrays go by many names, e.g., uncommitted logic arrays and master-slice. The disadvantage of gate-arrays is that they are not optimal for any task.

GATE MATRICES

The gate matrix is the next step in the evolution of automatically generated layout from high-level specification. Like all regular forms of layout, it has fixed aspects and customizable aspects. In gate matrix layout the fixed design consists of vertical columns of polysilicon gating material. The customizable part interconnects and forms gates with the columns.

4.3. APPLICATIONS OF VLSI

Electronic systems now perform a wide variety of tasks in daily life. In some cases electronic systems have replaced mechanisms that operated mechanically, hydraulically, or by other means; electronics are usually smaller, more flexible, and easier to service. In other cases electronic systems have created totally new applications. Personal systems such as portable devices operate with remarkably little energy.

  • Electronic systems in cars operate audio systems and displays, adjust suspensions to varying terrain, and perform the control functions required for anti-lock braking (ABS) systems.
  • Digital electronics process audio and video on the fly in consumer electronics, even at high data rates.
  • Low-cost terminals still require sophisticated electronics, despite their dedicated function.
  • Personal computers provide word-processing, financial analysis, and games. Computers include central processing units (CPUs) and special-purpose hardware for disk access, faster screen display, etc.

 

4.4. ADVANTAGES OF VLSI

While we concentrate on integrated circuits here, the properties of integrated circuits, that is, what can and cannot be put efficiently into an integrated circuit, largely determine the architecture of the whole system. Integrated circuits improve system characteristics in several critical ways. ICs have three key advantages over digital circuits built from discrete components:

  • SIZE: Integrated circuits are much smaller: both transistors and wires are shrunk to micrometer sizes, far below the dimensions of discrete components. Small size leads to advantages in speed and power consumption, since smaller components have smaller parasitic resistances, capacitances, and inductances.
  • SPEED: Signals can be switched between logic 0 and logic 1 much quicker within a chip than they can between chips. Communication within a chip can occur much faster than communication between chips on a printed circuit board. The high speed of on-chip circuits is due to their small size: smaller components and wires have smaller parasitic capacitances to slow down the signal.
  • POWER CONSUMPTION: Logic within a chip also takes much less power. Once again, the lower power consumption is due to the small size of circuits on the chip: smaller parasitic capacitances and resistances require less power to drive them.

4.5. VLSI AND SYSTEMS

These advantages of integrated circuits translate into advantages at the system level:

  • SMALLER PHYSICAL SIZE: Smallness is often an advantage in itself; consider handheld cellular telephones.
  • LOWER POWER CONSUMPTION: Replacing a handful of standard parts with a single chip reduces total power consumption. Reducing power consumption has a ripple effect on the rest of the system: a smaller, cheaper power supply can be used; since less power consumption means less heat, a fan may no longer be necessary; simpler shielding may be feasible, too.
  • REDUCED COST: Reducing the number of components, the power supply requirements, cabinet costs, and so on will inevitably reduce system cost. The ripple effect of integration is such that the cost of a system built from custom ICs can be less, even though the individual ICs cost more than the standard parts they replace. Understanding why integrated circuit technology has such a profound influence on the design of digital systems requires understanding both the technology and the economics of ICs and digital systems.

4.5.1. INTRODUCTION TO ASICS AND PROGRAMMABLE LOGIC

The last 15 years have witnessed the demise in the number of cell-based ASIC designs as a means for developing customized SoCs. Rising NREs, development times, and risk have mostly restricted the use of cell-based ASICs to the highest volume applications: applications that can withstand the multi-million dollar development costs associated with 1-2 design re-spins. Analysts estimate that the number of cell-based ASIC design starts per year is now only between 2000-3000 compared to ~10,000 in the late 1990s.

The FPGA has emerged as a technology that fills some of the gap left by cell-based ASICs. Yet even after 20+ years of existence and 40X more design starts per year than cell-based ASICs, the FPGA market remains only a fraction of that of cell-based ASICs. Many FPGA designs never make it into production, and for the most part the FPGA is still seen by many as a vehicle for prototyping or college education and has perhaps even succeeded in actually stifling industry innovation.

The structured ASIC is a technology that is tipped to re-energize the path to innovation within the electronics industry. It combines advantages of FPGA technology (i.e. fast turnaround, no mask charges, no minimum order quantity) and of cell-based ASIC (i.e. low unit cost and power) to deliver a new platform for SoC design. This document defines requirements for development of Application Specific Integrated Circuits (ASICs). It is intended to be used as an appendix to a Statement of Work.

The document complements the ESA ASIC Design and Assurance Requirements (AD1), which is a precursor to a future ESA PSS document on ASIC design.

 

4.6. SOFTWARE TOOLS

The software tools used in the current design are:

  • Xilinx ISE 12.3
  • ModelSim SE 6.4c

 

4.6.1 XILINX ISE 12.3

The Xilinx Integrated Synthesis Environment (ISE) is a tool from the Xilinx Corporation. It was used in this project for every step from editing the Verilog HDL code to its synthesis. The Xilinx ISE tools allow you to use schematics, hardware description languages (HDLs), and specially designed modules in a number of ways. Xilinx ISE is used for the design of digital circuits implemented on Xilinx Field Programmable Gate Arrays (FPGAs) or Complex Programmable Logic Devices (CPLDs).

Digital designs can be entered in various ways using these CAD tools: with a schematic entry tool, with a hardware description language (HDL) such as Verilog or VHDL, or with a combination of both. In this project only the design flow based on Verilog HDL is used. The CAD tools make it possible to design combinational and sequential circuits starting from Verilog HDL design specifications. The tool synthesizes the high-level description to the RTL level, and its reports are used to determine the device utilization.

DESIGN ENTRY

This is the initial step. Here the source files are created based on the design objectives. The design file can be created using an HDL such as VHDL, Verilog, or ABEL. Multiple formats can be used for the lower-level source files in the design. When working with a synthesized EDIF or NGC/NGO file, you can skip design entry and synthesis and start with the implementation process.

 

SYNTHESIS

Here the high-level hardware descriptions (VHDL, Verilog, or mixed-language designs) are synthesized into netlist files, which are accepted as input to the implementation step.

IMPLEMENTATION

Here the logical design is converted into a physical file format. This process varies with the selected target, i.e., either FPGA or CPLD.

VERIFICATION

A simulator can be used to verify the functionality and timing of the design. Simulation is an easy way to verify complex functions in a short time.

DEVICE CONFIGURATION

After the programming file is generated, the device is configured. During configuration, the configuration files are generated and the host computer downloads the programming file to the Xilinx device.

VERILOG HDL

Verilog was started in the year 1984 by Gateway Design Automation Inc. as a proprietary hardware modeling language. It is rumored that the original language was designed by taking features from the most popular HDL of the time, called HiLo, as well as from traditional computer languages. Some of the characteristics of the Verilog language are listed below.

  • Verilog HDL is a Hardware Description Language (HDL).
  • It is used to describe a digital system.
  • An HDL may describe the layout of the wires, resistors and transistors on an integrated circuit (IC) chip, i.e., the switch level.
  • It may describe the logic gates and flip-flops in a digital system, i.e., the gate level.
  • An even higher level describes the registers and the transfers of vectors of information between registers. This is called the Register Transfer Level (RTL).
  • Verilog supports all of these levels.
  • A powerful feature of Verilog HDL is that the same language can be used for describing, testing and debugging your system.

VERILOG FEATURES

The features of Verilog language are given below.

  • Strong background: Supported by OVI (Open Verilog International) and standardized in 1995 as IEEE Std 1364
  • Industrial support: Fast simulation and effective synthesis (used in 85% of ASIC foundries, according to EE Times)
  • Universal: Allows the entire design process in one design environment (including analysis and verification)
  • Extensibility: The Verilog PLI (Programming Language Interface) allows Verilog's capabilities to be extended

DESIGN STYLES

Verilog permits designers to create a design using either a bottom-up or a top-down methodology.

BOTTOM-UP DESIGN

The traditional method of electronic design is bottom-up. Each design is performed at the gate level using standard gates. With the increasing complexity of new designs, this approach is nearly impossible to maintain. New systems consist of ASICs or microprocessors with a complexity of thousands of transistors. These traditional bottom-up designs have to give way to new structural, hierarchical design methods. Without these new design practices it would be impossible to handle the new complexity.

Fig 4.1:  Bottom-Up design methodology

TOP-DOWN DESIGN

This is the other method for implementing a digital design. A top-down design comprises design specification, high-level design, low-level design, register transfer level (RTL) coding, functional verification, synthesis, gate-level simulation, placement, routing, and finally implementation. The top-down design follows the sequence shown in Fig. 4.2.

   Fig. 4.2 Top-Down design approach

Fig 4.3 Top-Down design methodology

The design style most designers aim for is top-down. A true top-down design allows early testing, easy change between technologies, and a structured system design, and it offers many other advantages. However, it is very difficult to follow a pure top-down design. For this reason most designs are a mix of both methods, implementing some key elements of both design styles.

Fig: 4.4 Creating New Project

Fig. 4.4 shows the creation of a new project in the project window, including the project name and location.

Fig.4.5 Selecting Project Settings

Fig. 4.5 shows the project settings in the project window, which specify the device type and the project properties.

Fig.4.6 Verifying Project Summary

Fig. 4.6 shows the project summary in the Project Navigator, which then creates a new project with the desired specifications.

Fig 4.7: A project Device window showing Xilinx properties

Fig. 4.7 shows the new project device window, which targets the exact Xilinx device properties and lists the other features of the ISE system.

Fig 4.8: Obtaining Verilog Module

The Verilog source file is added to the project hierarchy. The Verilog code for the design is then written and saved, as shown in Fig. 4.8.

 

4.6.2 MODELSIM SE 6.4c

ModelSim is a simulator from Mentor Graphics, used here to simulate the HDL code for Xilinx FPGAs. It can simulate code written in various HDLs such as Verilog and VHDL, and it supports various design flows such as HDL and schematic. A benefit of this tool is that the simulation results closely match the behaviour of the implemented design, which gives confidence in the result even before the design is implemented on the FPGA. HDL simulation is only one part of verification. The figure below illustrates the various flows that can affect your simulation.

 

Fig 4.7 Modelsim Simulation Environment

The HDL code may be inefficient; test bench languages and third-party debug tools may slow the simulation; third-party IP may be un-optimized. Consider that a third-party test bench implementation alone can account for more than 80% of the overall simulation time. If this is the case, you should investigate why 80% of the time is being spent in the test bench and its interface. ModelSim has a profiler that can help you identify which part of your environment is impacting simulation performance the most.

The modelsim.ini file is read whenever the compiler is invoked and has a setting to enable or disable the default performance mode. This option is VoptFlow = 1. When it is set to 0 you use the pre-6.0 ModelSim flow; when it is set to 1, you engage a new out-of-the-box performance flow. With the 6.0 release the default is 0. Currently this non-mandatory performance flow is very useful for pure Verilog designs where you are not interested in debugging, or for mixed Verilog and VHDL designs. All the Verilog in a design will be optimized, regardless of where in the hierarchy it is located. Again, this can improve performance by up to 10x versus non-optimized mode. Another consideration with the new flow is which part of the simulation you are measuring. ModelSim has separate compilation, optimization and simulation steps. Furthermore, vsim will automatically invoke vopt if it has not already been run separately.

Simulation is a two-phase process. During phase one, ModelSim generates native code for your specific OS. During phase two, ModelSim runs the native code. You will gain the most accurate performance statistics by measuring the elaboration phase and the run phase separately. As mentioned below, you can use the -elab switch or the note command to measure these two phases independently.

MODELSIM LIBRARY

A library is a location where data to be used for simulation is stored. Libraries are ModelSim's way of managing the creation of data before it is needed in simulation. They also serve to streamline simulation invocation. Instead of compiling all design data each and every time you simulate, ModelSim uses binary pre-compiled data from these libraries. So, if you make a change to one Verilog module, only that module is recompiled, rather than all modules in the design.

BASIC SIMULATION FLOW

 

Fig 4.8 flow of basic simulation

Figure 4.8 gives the fundamental steps for simulating a design in ModelSim. Simulation is logical verification; ModelSim or the Xilinx software is used to verify the functional logic. In the basic simulation flow, the first step is creating a working library, after which the design files are compiled. After compilation the simulation is run for functional verification, followed by debugging.

CREATING THE WORKING LIBRARY

In ModelSim, all designs, be they VHDL, Verilog, or a combination of the two, are compiled into a library. You typically start a new simulation in ModelSim by creating a working library called “work”. “Work” is the library name used by the compiler as the default destination for compiled design units.

COMPILING YOUR DESIGN

After creating the working library, you compile your design units into it. The library format is compatible across all supported platforms, so you can simulate your design on any other platform without having to recompile it.

RUNNING THE SIMULATION

With the design compiled, you invoke the simulator on a top-level module. Assuming the design loads successfully, the simulation time is set to zero, and you enter a run command to begin simulation. In summary, the basic flow is:

  • Create a working library
  • Compile design files
  • Run simulation
  • Debug results
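
The steps above can be sketched as a short script. The following is a minimal illustration in Python that only assembles the ModelSim command lines (`vlib`, `vlog`, `vsim`) without running them. The file names `encoder.v` and `encoder_tb.v` and the top-level module name `encoder_tb` are hypothetical, and the exact options used in this project are not stated in the text.

```python
import shlex
from typing import List


def modelsim_flow(sources: List[str], top: str, library: str = "work") -> List[List[str]]:
    """Return the command lines for the basic ModelSim flow as argument lists."""
    return [
        # 1. Create the working library (a directory of compiled design units).
        ["vlib", library],
        # 2. Compile the Verilog source files into that library.
        ["vlog", "-work", library, *sources],
        # 3. Load the top-level module in command-line mode and run to completion.
        ["vsim", "-c", f"{library}.{top}", "-do", "run -all; quit -f"],
    ]


if __name__ == "__main__":
    # Hypothetical file and module names, used only for illustration.
    for cmd in modelsim_flow(["encoder.v", "encoder_tb.v"], "encoder_tb"):
        print(shlex.join(cmd))
```

Passing each list to `subprocess.run` would execute the flow on a machine where ModelSim is installed and on the path. Debugging, the fourth step, is normally done interactively in the waveform and source windows, so it is not scripted here.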

ModelSim is a UNIX, Linux, and Windows-based simulation and debug environment, combining high performance with a powerful and intuitive graphical interface.

 

FEATURES

  • Unified Coverage Database (UCDB), which is a hub for analyzing, viewing, merging, managing and reporting all coverage information.
  • Source annotation. The source window can be enabled to display the objects with their values during the simulation.
  • FSM support for both Verilog and VHDL is now available.
  • Code coverage results can now be reviewed post-simulation using the graphical user environment.
  • Simulation messages are now logged in the WLF file.
  • SystemC is now supported on x86 Linux 64-bit platforms.
  • The GUI debug and analysis environment continues to evolve to provide greater user customization and better performance.
  • SystemVerilog design support continues to expand, with several new constructs added in this release.
  • Mixed HDL simulation option
  • JobSpy Regression Monitor

BENEFITS

  • The best mixed-language environment and performance in the industry.
  • The intuitive GUI makes it easy to view and access the many powerful capabilities of ModelSim. There is no learning curve, because the debug environment is common across all languages.
  • All ModelSim products are 100% standards based. This means your investment is protected, risk is lowered, reuse is enabled, and productivity is enhanced.
  • Award-winning technical support.
  • Merging, ranking and reporting of code coverage for tracking verification progress.
  • Intuitive GUI for efficient interactive or post-simulation debug of RTL and gate-level designs.
  • High-performance HDL simulation for FPGA and ASIC design teams.
  • ModelSim SE combines high performance and high capacity with the most advanced code coverage and debugging capabilities in the industry. ModelSim SE offers unmatched flexibility by supporting 32- and 64-bit UNIX and Linux and 32-bit Windows®-based platforms.

CHAPTER 5

SYNTHESIS & SIMULATION RESULTS AND REPORTS

5.1. BEHAVIORAL SIMULATION RESULTS

The data encoding schemes were coded in Verilog HDL. Simulation was performed in either the Xilinx ISE simulator or ModelSim to verify that the Verilog modules of the design work correctly. Inputs are applied and the outputs are checked against the expected values.

A. SCHEME2 RESULTS:

Fig.5.1: Scheme 2

 

B. SCHEME3 RESULTS

Fig.5.2: Scheme 3

5.2. DESIGN SUMMARY

5.2.1. SCHEME 2

Table 5.2.1: Existing Method Design Summary

5.2.2. SCHEME 3

Table 5.2.2: Proposed Method Design Summary

5.3. SYNTHESIS RESULTS

In this stage, after the constraints are applied to the design, a technology-mapped gate-level netlist is obtained as output, and the area, timing and QoR (quality of results) reports are generated.

5.3.1. TIMING REPORT AND ANALYSIS OF EXISTING METHOD

Synthesis report  for encoder:

Timing Summary:-

Speed Grade: -4

Minimum period: 23.108ns (Maximum Frequency: 43.275MHz)

Minimum input arrival time before clock: 26.481ns

Maximum output required time after clock: 7.488ns

Maximum combinational path delay: No path found

Timing Detail:

All values displayed in nanoseconds (ns)

==================================================================

Timing constraint: Default period analysis for Clock ‘clk’

Clock period: 23.108ns (frequency: 43.275MHz)

Total number of paths / destination ports: 3783530 / 18

————————————————————————-

Delay:               23.108ns (Levels of Logic = 16)

Source:            data_out_1_3 (FF)

Destination:       data_out_1_14 (FF)

Source Clock:      clk rising

Destination Clock: clk rising

Data Path: data_out_1_3 to data_out_1_14

Gate     Net

Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)

—————————————-  ————

FDC:C->Q              8   0.720   1.151  data_out_1_3 (data_out_1_3)

LUT4:I2->O           10   0.551   1.160  input_data[2].ty_module_inst/t2 (input_data[2].ty_module_inst/t2)

LUT4:I3->O            1   0.551   0.827  ones_count_inst_2/Madd_ones_reg_addsub0004_Madd_cy<0>11_SW0 (N356)

LUT4_D:I3->O          5   0.551   0.989  ones_count_inst_2/Madd_ones_reg_addsub0004_Madd_cy<0>11 (ones_count_inst_2/Madd_ones_reg_addsub0004_Madd_cy<0>)

LUT4:I2->O            3   0.551   0.933  ones_count_inst_2/Madd_ones_reg_addsub0010_Madd_xor<1>11_SW0 (N407)

LUT4_D:I3->O          2   0.551   0.903  ones_count_inst_2/Madd_ones_reg_addsub0010_Madd_xor<1>11 (ones_count_inst_2/Madd_ones_reg_addsub0012_Madd_lut<1>)

LUT4_D:I3->O          3   0.551   0.975  ones_count_inst_2/Madd_ones_reg_addsub0012_Madd_cy<1>11 (ones_count_inst_2/Madd_ones_reg_addsub0012_Madd_cy<1>)

LUT4_D:I2->O          2   0.551   0.903  ones_count_inst_2/Madd_ones_reg_addsub0015_cy<2>11 (ones_count_inst_2/Madd_ones_reg_addsub0015_cy<2>)

LUT4:I3->O            2   0.551   0.877  ones_count_inst_2/ones_reg<3>1 (ones_t2<3>)

MUXCY:DI->O           1   0.889   0.000  module_a_inst/Msub_eq_1_cy<3> (module_a_inst/Msub_eq_1_cy<3>)

MUXCY:CI->O           0   0.064   0.000  module_a_inst/Msub_eq_1_cy<4> (module_a_inst/Msub_eq_1_cy<4>)

XORCY:CI->O           5   0.904   1.116  module_a_inst/Msub_eq_1_xor<5> (module_a_inst/eq_1<5>)

LUT2:I1->O            1   0.551   0.000  module_a_inst/Mcompar_cond2_cmp_lt0000_lut<5> (module_a_inst/Mcompar_cond2_cmp_lt0000_lut<5>)

MUXCY:S->O            2   0.739   0.903  module_a_inst/Mcompar_cond2_cmp_lt0000_cy<5> (module_a_inst/Mcompar_cond2_cmp_lt0000_cy<5>)

LUT4:I3->O            1   0.551   0.827  module_a_inst/full_invert_140 (module_a_inst/full_invert_140)

LUT4:I3->O           17   0.551   1.413  module_a_inst/full_invert_1226 (full_invert)

LUT3:I2->O            1   0.551   0.000  Mxor_data_out_1_8_xor0000_Result1 (data_out_1_8_xor0000)

FDC:D                     0.203          data_out_1_8

—————————————-

Total                     23.108ns (10.131ns logic, 12.977ns route)

(43.8% logic, 56.2% route)

==================================================================

Timing constraint: Default OFFSET IN BEFORE for Clock ‘clk’

Total number of paths / destination ports: 10206891 / 18

————————————————————————

Offset:              26.481ns (Levels of Logic = 18)

Source:            reset (PAD)

Destination:       data_out_1_14 (FF)

Destination Clock: clk rising

Data Path: reset to data_out_1_14

Gate     Net

Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)

—————————————-  ————

IBUF:I->O           113   0.821   2.656  reset_IBUF (reset_IBUF)

LUT3:I0->O            2   0.551   1.216  input_data[2].t4_2_module_inst/t4_2_out_SW1 (N228)

LUT4:I0->O           10   0.551   1.160  input_data[2].ty_module_inst/t2 (input_data[2].ty_module_inst/t2)

LUT4:I3->O            1   0.551   0.827  ones_count_inst_2/Madd_ones_reg_addsub0004_Madd_cy<0>11_SW0 (N356)

LUT4_D:I3->O          5   0.551   0.989  ones_count_inst_2/Madd_ones_reg_addsub0004_Madd_cy<0>11 (ones_count_inst_2/Madd_ones_reg_addsub0004_Madd_cy<0>)

LUT4:I2->O            3   0.551   0.933  ones_count_inst_2/Madd_ones_reg_addsub0010_Madd_xor<1>11_SW0 (N407)

LUT4_D:I3->O          2   0.551   0.903  ones_count_inst_2/Madd_ones_reg_addsub0010_Madd_xor<1>11 (ones_count_inst_2/Madd_ones_reg_addsub0012_Madd_lut<1>)

LUT4_D:I3->O          3   0.551   0.975  ones_count_inst_2/Madd_ones_reg_addsub0012_Madd_cy<1>11 (ones_count_inst_2/Madd_ones_reg_addsub0012_Madd_cy<1>)

LUT4_D:I2->O          2   0.551   0.903  ones_count_inst_2/Madd_ones_reg_addsub0015_cy<2>11 (ones_count_inst_2/Madd_ones_reg_addsub0015_cy<2>)

LUT4:I3->O            2   0.551   0.877  ones_count_inst_2/ones_reg<3>1 (ones_t2<3>)

MUXCY:DI->O           1   0.889   0.000  module_a_inst/Msub_eq_1_cy<3> (module_a_inst/Msub_eq_1_cy<3>)

MUXCY:CI->O           0   0.064   0.000  module_a_inst/Msub_eq_1_cy<4> (module_a_inst/Msub_eq_1_cy<4>)

XORCY:CI->O           5   0.904   1.116  module_a_inst/Msub_eq_1_xor<5> (module_a_inst/eq_1<5>)

LUT2:I1->O            1   0.551   0.000  module_a_inst/Mcompar_cond2_cmp_lt0000_lut<5> (module_a_inst/Mcompar_cond2_cmp_lt0000_lut<5>)

MUXCY:S->O            2   0.739   0.903  module_a_inst/Mcompar_cond2_cmp_lt0000_cy<5> (module_a_inst/Mcompar_cond2_cmp_lt0000_cy<5>)

LUT4:I3->O            1   0.551   0.827  module_a_inst/full_invert_140 (module_a_inst/full_invert_140)

LUT4:I3->O           17   0.551   1.413  module_a_inst/full_invert_1226 (full_invert)

LUT3:I2->O            1   0.551   0.000  Mxor_data_out_1_8_xor0000_Result1 (data_out_1_8_xor0000)

FDC:D                     0.203          data_out_1_8

—————————————-

Total                     26.481ns (10.783ns logic, 15.698ns route)

(40.7% logic, 59.3% route)

==================================================================

Timing constraint: Default OFFSET OUT AFTER for Clock ‘clk’

Total number of paths / destination ports: 18 / 18

————————————————————————-

Offset:              7.488ns (Levels of Logic = 1)

Source:            data_out_1_10 (FF)

Destination:       data_out<10> (PAD)

Source Clock:      clk rising

Data Path: data_out_1_10 to data_out<10>

Gate     Net

Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)

FDC:C->Q              9   0.720   1.124  data_out_1_10 (data_out_1_10)

OBUF:I->O                 5.644          data_out_10_OBUF (data_out<10>)

—————————————-

Total                      7.488ns (6.364ns logic, 1.124ns route)

(85.0% logic, 15.0% route)

==================================================================

Total REAL time to Xst completion: 22.00 secs

Total CPU time to Xst completion: 22.44 secs

Total memory usage is 202944 kilobytes
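
The summary figures in this report can be cross-checked by hand: the maximum frequency is the reciprocal of the minimum period, and each path total is the sum of its logic delay and its routing delay. The following is a minimal Python sketch of that arithmetic, using only numbers taken from the report above. The function names are illustrative and are not part of any Xilinx tool.

```python
from typing import Tuple


def max_frequency_mhz(min_period_ns: float) -> float:
    """Convert a minimum clock period in ns to a maximum frequency in MHz."""
    return 1000.0 / min_period_ns


def delay_split(logic_ns: float, route_ns: float) -> Tuple[float, float, float]:
    """Return (total delay in ns, logic share in %, routing share in %)."""
    total = logic_ns + route_ns
    return total, 100.0 * logic_ns / total, 100.0 * route_ns / total


if __name__ == "__main__":
    # Clock period analysis of the existing method.
    print(f"{max_frequency_mhz(23.108):.3f} MHz")
    total, logic_pct, route_pct = delay_split(10.131, 12.977)
    print(f"{total:.3f} ns ({logic_pct:.1f}% logic, {route_pct:.1f}% route)")
```

Run on the clock-period path, the sketch reproduces the reported 43.275 MHz and the 43.8% logic / 56.2% route split, so more than half of the critical-path delay is routing rather than logic.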

5.3.2. TIMING REPORT AND ANALYSIS OF PROPOSED METHOD

Timing Summary:

Speed Grade: -4

Minimum period: 23.935ns (Maximum Frequency: 41.780MHz)

Minimum input arrival time before clock: 27.256ns

Maximum output required time after clock: 7.498ns

Maximum combinational path delay: No path found

Timing Detail:

All values displayed in nanoseconds (ns)

==================================================================

Timing constraint: Default period analysis for Clock ‘clk’

Clock period: 23.935ns (frequency: 41.780MHz)

Total number of paths / destination ports: 8530038 / 18

————————————————————————-

Delay:               23.935ns (Levels of Logic = 16)

  Source:            data_out_1_11 (FF)

  Destination:       data_out_1_13 (FF)

  Source Clock:      clk rising

  Destination Clock: clk rising

  Data Path: data_out_1_11 to data_out_1_13

                                Gate     Net

    Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)

—————————————-  ————

FDC:C->Q              6   0.720   1.071  data_out_1_11 (data_out_1_11)

LUT4_D:I2->O          8   0.551   1.278  input_data[11].te_module_inst/t21 (input_data[11].te_module_inst/t2)

LUT4:I1->O            5   0.551   0.921  ones_count_inst_3/Madd_ones_reg_addsub0012_Madd_lut<0>1_SW0 (N224)

MUXF5:S->O            1   0.621   0.869  ones_count_inst_3/Madd_ones_reg_addsub0012_Madd_lut<0>1_SW3 (N417)

LUT4:I2->O            2   0.551   0.945  ones_count_inst_3/Madd_ones_reg_addsub0012_Madd_xor<1>11_SW0_SW1 (N338)

LUT4:I2->O            1   0.551   0.827  ones_count_inst_3/Madd_ones_reg_addsub0012_Madd_xor<1>11_SW0 (N483)

LUT4:I3->O            3   0.551   0.975  ones_count_inst_3/Madd_ones_reg_addsub0015_cy<1>11 (ones_count_inst_3/Madd_ones_reg_addsub0015_cy<1>)

LUT4_D:I2->O          2   0.551   0.903  ones_count_inst_3/Madd_ones_reg_addsub0015_cy<2>11 (ones_count_inst_3/Madd_ones_reg_addsub0015_cy<2>)

LUT4:I3->O            2   0.551   0.877  ones_count_inst_3/ones_reg<3>1 (ones_t2<3>)

MUXCY:DI->O           1   0.889   0.000  module_c_inst/Msub_eq_1_cy<3> (module_c_inst/Msub_eq_1_cy<3>)

MUXCY:CI->O           0   0.064   0.000  module_c_inst/Msub_eq_1_cy<4> (module_c_inst/Msub_eq_1_cy<4>)

XORCY:CI->O          18   0.904   1.612  module_c_inst/Msub_eq_1_xor<5> (module_c_inst/eq_1<5>)

LUT2:I1->O            1   0.551   0.000  module_c_inst/Mcompar_cond4_cmp_gt0000_lut<5> (module_c_inst/Mcompar_cond4_cmp_gt0000_lut<5>)

MUXCY:S->O           12   0.739   1.144  module_c_inst/Mcompar_cond4_cmp_gt0000_cy<5> (module_c_inst/Mcompar_cond4_cmp_gt0000_cy<5>)

LUT4:I3->O            1   0.551   0.827  module_c_inst/even_invert_1_and00024_SW1 (N86)

LUT4:I3->O           18   0.551   1.485  module_c_inst/even_invert_1_and00024 (module_c_inst/N29)

LUT4:I2->O            1   0.551   0.000  Mxor_data_out_1_9_xor0000_Result1 (data_out_1_9_xor0000)

FDC:D                     0.203          data_out_1_9

—————————————-

Total                     23.935ns (10.201ns logic, 13.734ns route)

(42.6% logic, 57.4% route)

==================================================================

Timing constraint: Default OFFSET IN BEFORE for Clock ‘clk’

Total number of paths / destination ports: 25117727 / 18

————————————————————————-

Offset:              27.256ns (Levels of Logic = 18)

Source:            reset (PAD)

Destination:       data_out_1_13 (FF)

Destination Clock: clk rising

Data Path: reset to data_out_1_13

Gate     Net

Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)

—————————————-  ————

IBUF:I->O           115   0.821   2.524  reset_IBUF (reset_IBUF)

LUT3:I1->O            2   0.551   1.216  input_data[11].t4_2_module_inst/t4_2_out1_SW1 (N356)

LUT4_D:I0->O          8   0.551   1.278  input_data[11].te_module_inst/t21 (input_data[11].te_module_inst/t2)

LUT4:I1->O            5   0.551   0.921  ones_count_inst_3/Madd_ones_reg_addsub0012_Madd_lut<0>1_SW0 (N224)

MUXF5:S->O            1   0.621   0.869  ones_count_inst_3/Madd_ones_reg_addsub0012_Madd_lut<0>1_SW3 (N417)

LUT4:I2->O            2   0.551   0.945  ones_count_inst_3/Madd_ones_reg_addsub0012_Madd_xor<1>11_SW0_SW1 (N338)

LUT4:I2->O            1   0.551   0.827  ones_count_inst_3/Madd_ones_reg_addsub0012_Madd_xor<1>11_SW0 (N483)

LUT4:I3->O            3   0.551   0.975  ones_count_inst_3/Madd_ones_reg_addsub0015_cy<1>11 (ones_count_inst_3/Madd_ones_reg_addsub0015_cy<1>)

LUT4_D:I2->O          2   0.551   0.903  ones_count_inst_3/Madd_ones_reg_addsub0015_cy<2>11 (ones_count_inst_3/Madd_ones_reg_addsub0015_cy<2>)

LUT4:I3->O            2   0.551   0.877  ones_count_inst_3/ones_reg<3>1 (ones_t2<3>)

MUXCY:DI->O           1   0.889   0.000  module_c_inst/Msub_eq_1_cy<3> (module_c_inst/Msub_eq_1_cy<3>)

MUXCY:CI->O           0   0.064   0.000  module_c_inst/Msub_eq_1_cy<4> (module_c_inst/Msub_eq_1_cy<4>)

XORCY:CI->O          18   0.904   1.612  module_c_inst/Msub_eq_1_xor<5> (module_c_inst/eq_1<5>)

LUT2:I1->O            1   0.551   0.000  module_c_inst/Mcompar_cond4_cmp_gt0000_lut<5> (module_c_inst/Mcompar_cond4_cmp_gt0000_lut<5>)

MUXCY:S->O           12   0.739   1.144  module_c_inst/Mcompar_cond4_cmp_gt0000_cy<5> (module_c_inst/Mcompar_cond4_cmp_gt0000_cy<5>)

LUT4:I3->O            1   0.551   0.827  module_c_inst/even_invert_1_and00024_SW1 (N86)

LUT4:I3->O           18   0.551   1.485  module_c_inst/even_invert_1_and00024 (module_c_inst/N29)

LUT4:I2->O            1   0.551   0.000  Mxor_data_out_1_9_xor0000_Result1 (data_out_1_9_xor0000)

FDC:D                     0.203          data_out_1_9

—————————————-

Total                     27.256ns (10.853ns logic, 16.403ns route)

(39.8% logic, 60.2% route)

==================================================================

Timing constraint: Default OFFSET OUT AFTER for Clock ‘clk’

  Total number of paths / destination ports: 18 / 18

————————————————————————-

Offset:              7.498ns (Levels of Logic = 1)

  Source:            data_out_1_2 (FF)

  Destination:       data_out<2> (PAD)

  Source Clock:      clk rising

  Data Path: data_out_1_2 to data_out<2>

                                Gate     Net

    Cell:in->out      fanout   Delay   Delay  Logical Name (Net Name)

    —————————————-  ————

     FDC:C->Q             10   0.720   1.134  data_out_1_2 (data_out_1_2)

     OBUF:I->O                 5.644          data_out_2_OBUF (data_out<2>)

    —————————————-

Total                      7.498ns (6.364ns logic, 1.134ns route)

(84.9% logic, 15.1% route)

==================================================================

Total REAL time to Xst completion: 34.00 secs

Total CPU time to Xst completion: 34.28 secs

Total memory usage is 236288 kilobytes
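
To put the two reports side by side, the sketch below computes the relative change of each summary figure from the existing method to the proposed one. The values are copied from the two timing summaries; the dictionary keys and the helper name are illustrative assumptions. Note that the memory figure is the host memory used by the XST synthesis run, not a property of the synthesized circuit.

```python
# Summary figures copied from the two XST reports above.
EXISTING = {"period_ns": 23.108, "input_ns": 26.481, "output_ns": 7.488, "memory_kb": 202944}
PROPOSED = {"period_ns": 23.935, "input_ns": 27.256, "output_ns": 7.498, "memory_kb": 236288}


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (positive means an increase)."""
    return 100.0 * (new - old) / old


# Relative change of every figure, existing method -> proposed method.
changes = {key: percent_change(EXISTING[key], PROPOSED[key]) for key in EXISTING}

if __name__ == "__main__":
    for key, value in changes.items():
        print(f"{key:10s} {value:+.2f} %")
```

With these figures the minimum clock period of the proposed encoder is about 3.6% longer (23.935 ns versus 23.108 ns), which corresponds to the drop in maximum frequency from 43.275 MHz to 41.780 MHz.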

CHAPTER 6

6.1. APPLICATIONS

Routers and link architectures: in the communication subsystem, power is dissipated in the routers and in the links. By using different encoding and decoding schemes, we reduce the power dissipation in the router and link architecture. The same technique can be used wherever power-hungry wires have to be driven.

6.2. ADVANTAGES

  • Reduces the power dissipated in the links of the NoC
  • Reduces both normal switching activity and coupling switching activity

CONCLUSION

We have implemented two schemes proposed in the paper (namely Scheme 2 and Scheme 3) to minimize the coupling switching activity and the normal switching activity in Network-on-Chip links. These schemes ensure, as far as possible, that consecutive bits do not take opposite values, so that coupling switching activity is reduced.

In the same way, the bits routed through a specific link are encoded so that toggling of the bit values on that link (opposite previous and present values) is prevented. This reduces the normal switching activity.

Note that, with reference to Table 1 of the IEEE paper, our aim is to convert Type-1 and Type-2 bit combinations to Type-3 and Type-4 bit combinations as far as possible. This is because Type-3 and Type-4 combinations lead to lower coupling switching and normal switching activity.

Of the two schemes implemented, Scheme 3 has even lower coupling switching and normal switching activity than Scheme 2. This is achieved by including the “even inversion” module (“TE module”) in Scheme 3. With it, Type-1 and Type-2 bit combinations are converted into the Type-3 and Type-4 bit combinations mentioned in Table 1, whereas Scheme 2 converts some Type-1 bit combinations to Type-2 bit combinations. So, as discussed earlier, Scheme 3 is more optimized than Scheme 2.
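
To make the two quantities concrete, the following Python sketch counts normal (self) switching and coupling switching between two consecutive words on a link. It is not the Scheme 2 or Scheme 3 encoder. The coupling measure is an assumption chosen for illustration: each adjacent wire pair contributes the change in the difference between its two bit values, so a pair in which one wire toggles counts 1, a pair toggling in opposite directions counts 2, and a pair that toggles together or not at all counts 0. The encoder shown is plain bus-invert coding (Stan and Burleson, listed in the references), which inverts the whole word when more than half of the wires would toggle.

```python
from typing import Tuple


def bit(word: int, i: int) -> int:
    """Return bit i of word (0 or 1)."""
    return (word >> i) & 1


def self_switching(prev: int, curr: int, width: int) -> int:
    """Number of wires whose value toggles between prev and curr."""
    return bin((prev ^ curr) & ((1 << width) - 1)).count("1")


def coupling_switching(prev: int, curr: int, width: int) -> int:
    """Illustrative coupling measure summed over adjacent wire pairs (assumed weighting)."""
    total = 0
    for i in range(width - 1):
        before = bit(prev, i) - bit(prev, i + 1)
        after = bit(curr, i) - bit(curr, i + 1)
        total += abs(after - before)  # 0: no relative change, 1: one toggle, 2: opposite toggles
    return total


def bus_invert(prev_sent: int, data: int, width: int) -> Tuple[int, int]:
    """Classic bus-invert: send the inverted word if more than half the wires would toggle."""
    mask = (1 << width) - 1
    if self_switching(prev_sent, data, width) > width // 2:
        return (~data) & mask, 1  # inverted word, invert flag set
    return data & mask, 0


def bus_invert_decode(sent: int, invert: int, width: int) -> int:
    """Recover the original word from the transmitted word and the invert flag."""
    mask = (1 << width) - 1
    return (sent ^ mask) if invert else (sent & mask)
```

For a 4-bit link that last carried 0000, sending 1110 directly would toggle three wires; bus-invert sends 0001 with the invert flag set, so only one data wire toggles, and the decoder restores 1110 from the flag.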

REFERENCES

  1. A. A. E. Zarandi, A. S. Molahosseini, M. Hosseinzadeh, S. Sorouri, S. Antao, and L. Sousa, “Reverse converter design via parallel-prefix adders: Novel components, methodology, and implementations,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 23, no. 2, Feb. 2014.
  2. M. S. Rahaman and M. H. Chowdhury, “Crosstalk avoidance and error correction coding for coupled RLC interconnects,” in Proc. IEEE Int. Symp. Circuits Syst., May 2009, pp. 141–144.
  3. W. Wolf, A. A. Jerraya, and G. Martin, “Multiprocessor system-on-chip MPSoC technology,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 27, no. 10, pp. 1701–1713, Oct. 2008.
  4. L. Benini and G. De Micheli, “Networks on chips: A new SoC paradigm,” Computer, vol. 35, no. 1, pp. 70–78, Jan. 2002.
  5. S. E. Lee and N. Bagherzadeh, “A variable frequency link for a power-aware network-on-chip (NoC),” Integr. VLSI J., vol. 42, no. 4, pp. 479–485, Sep. 2009.
  6. D. Yeh, L. S. Peh, S. Borkar, J. Darringer, A. Agarwal, and W. M. Hwu, “Thousand-core chips roundtable,” IEEE Design Test Comput., vol. 25, no. 3, pp. 272–278, May–Jun. 2008.
  7. A. Vittal and M. Marek-Sadowska, “Crosstalk reduction for VLSI,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 16, no. 3, pp. 290–298, Mar. 1997.
  8. M. Ghoneima, Y. I. Ismail, M. M. Khellah, J. W. Tschanz, and V. De, “Formal derivation of optimal active shielding for low-power on-chip buses,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 25, no. 5, pp. 821–836, May 2006.
  9. L. Macchiarulo, E. Macii, and M. Poncino, “Wire placement for crosstalk energy minimization in address buses,” in Proc. Design Autom. Test Eur. Conf. Exhibit., Mar. 2002, pp. 158–162.
  10. R. Ayoub and A. Orailoglu, “A unified transformational approach for reductions in fault vulnerability, power, and crosstalk noise and delay on processor buses,” in Proc. Design Autom. Conf. Asia South Pacific, vol. 2, Jan. 2005, pp. 729–734.
  11. K. Banerjee and A. Mehrotra, “A power-optimal repeater insertion methodology for global interconnects in nanometer designs,” IEEE Trans. Electron Devices, vol. 49, no. 11, pp. 2001–2007, Nov. 2002.
  12. M. R. Stan and W. P. Burleson, “Bus-invert coding for low-power I/O,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 3, no. 1, pp. 49–58, Mar. 1995.
  13. K. W. Ki, B. Kwang Hyun, N. Shanbhag, C. L. Liu, and K. M. Sung, “Coupling-driven signal encoding scheme for low-power interface design,” in Proc. IEEE/ACM Int. Conf. Comput.-Aided Design, Nov. 2000, pp. 318–321.
