
Research Proposal

RESEARCH PROPOSAL

Naseem Abbas naseemabbas18@gmail.com +92-333-2657195

A Framework for Fault Tolerant Real Time Systems Based on Reconfigurable FPGAs

1. Abstract

To increase the amount of logic available to the user in SRAM-based FPGAs, manufacturers are using nanometric technologies to boost logic density and reduce cost, which makes these devices more attractive. However, these technological improvements also make FPGAs particularly vulnerable to configuration memory bit-flips caused by power fluctuations, strong electromagnetic fields, and radiation. The issue is especially serious because of the increasing number of configuration memory cells needed to define device functionality. One possible solution is to use radiation-hardened FPGAs, but since these devices are very expensive, alternative solutions that allow the use of non-radiation-hardened devices are currently being investigated. FPGAs are sensitive to both heavy-ion and proton-induced single event upsets (SEUs). SEUs in an FPGA affect the user design flip-flops, the FPGA configuration bitstream, and any hidden FPGA registers, latches, or internal state. Configuration bitstream upsets are especially important because they affect both the state and the operation of the design: they may perturb the routing resources and logic functions in a way that changes the operation of the circuit. Electronic circuits can be designed to tolerate high levels of radiation through custom manufacturing techniques.

2. Introduction and Motivation

The introduction of Very Large Scale Integration (VLSI) technologies substantially increased the reliability of electronic systems compared with the previous use of discrete components. Hence, the use of fault tolerance techniques was confined to specific applications requiring high levels of reliability or operating in harsh environments. Shrinking transistor sizes lead to greater integration and to a per-unit power reduction, enabling chips to grow in both size and complexity. But the new nanometer scales have also brought negative aspects, such as a high probability of memory bit-flips caused by power fluctuations, electromagnetic interference, or radiation. This issue has a particular impact on the reliability of SRAM-based Field Programmable Gate Arrays (FPGAs). The exponential growth in the number of memory cells needed for configuration purposes makes them especially vulnerable to memory bit-flips, resulting in Single Event Upsets (SEUs) and Multi-Bit Upsets (MBUs). Although faults due to memory bit-flips do not physically damage the chip, their effects persist until the device is reconfigured, since the functionality of the circuits mapped into the device is altered. Although anti-fuse-based FPGAs are less prone to SEUs due to the absence of configuration memory cells, SRAM-based FPGAs have been the preferred choice, for instance, in space missions such as the Mars 2003 Lander and Rover vehicles, where they were exposed to


extremely harsh conditions. That is because their processing performance is 10 to 100 times better than that of anti-fuse-based FPGAs, and also because of their reconfigurable features, which enable resource multiplexing, updating of algorithms during long space missions (avoiding mission obsolescence), and correction of design flaws in orbit. In non-reconfigurable technologies, such as ASICs, protection against SEUs is restricted to flip-flops, because the logic paths between them are typically hardwired. Nevertheless, Single Event Transients (SETs), charge transients induced in a wire by the incidence of a heavy ion, may propagate to flip-flop inputs, where they have a high probability of being registered, causing soft errors in the user data. Moreover, if an SET strikes a clock line, double-clocking may occur, leading to an untimely update that may affect, depending upon the charge value and online attenuation, some or all of the flip-flops driven by that line. Further protection is only achieved through full module redundancy. This is also a preferred choice to improve the reliability of highly critical real-time applications based on FPGAs. Due to their inherent configurability, FPGAs are especially suitable for implementing modular redundancy, since it does not require any new architectural feature and is function independent. However, their dependency on memory cells to define logic paths makes these paths also susceptible to SEUs. Again, in this case, the only effective protection is full module redundancy.

In a discrete implementation of a triple modular redundancy (TMR) system, if a defect affects the functionality of one module, the reliability index of the system decreases, but the system still works correctly. In this method, extra components are used to instantaneously mask the effect of a faulty component, so that the fault does not propagate.
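The masking behavior of TMR described above can be illustrated with a short Python sketch. It is only a software model under stated assumptions, not a hardware design: the names `majority_vote` and `flip_bit` are hypothetical, module outputs are modeled as integers, and an upset is modeled as one inverted bit.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant module outputs.

    Each output bit takes the value on which at least two modules agree.
    """
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, position: int) -> int:
    """Model a single event upset by inverting one bit of a word (assumption)."""
    return word ^ (1 << position)


# Hypothetical 8-bit output produced by a fault-free module.
golden = 0b10110010
upset = flip_bit(golden, 3)

# One faulty module: the voter masks the upset.
single_fault_masked = majority_vote(golden, upset, golden) == golden

# Two modules with the same faulty bit: the voter can no longer mask it.
double_fault_masked = majority_vote(upset, upset, golden) == golden

print(single_fault_masked, double_fault_masked)
```

The second case corresponds to the situation in which a further failure in the remaining modules defeats the voter, which is why the redundancy has to be restored after the first fault.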
However, a second failure in the remaining modules may lead to a system failure. Ideally, when a module fails, it should be replaced to restore the initial system redundancy index, but this may not be possible immediately. In certain cases, as in space applications, it may even be impossible. With FPGAs this drawback may be overcome without a significant rise in cost because, in the event of a module failure, the initial system redundancy index may be restored just by reconfiguring the affected module. No physical replacement is therefore necessary.

The aim of this research is to define a set of rules for a new framework for implementing highly critical real-time integrated systems based on dynamically reconfigurable FPGAs. The goal is to make these systems immune to faults arising from memory bit-flips by confining, detecting, locating, and mitigating them. This approach enables the confinement and detection of faulty modules, and the determination of when reconfiguration must be applied to restore proper system operation before cumulative errors, induced over time, lead to its failure. A short survey of the most recent published data on the impact of radiation-induced faults on FPGAs and on FPGA-based TMR implementations is presented to support the options chosen during the implementation phase.

An important function of the Earth's atmosphere is to filter the ionizing radiation found in space. Without the atmosphere, the Earth would be subject to the high-energy radiation found in space. The radiation found in most Earth orbits is caused by protons and heavy ions emitted by the sun (i.e., solar particles), galactic cosmic rays, and particles trapped in the Earth's magnetic field. Space radiation has both long-term and single-particle effects on electronic components. Long-term effects include total ionizing dose (TID). Single-event effects include single-event latchup (SEL) and single-event upset (SEU).
Each of these effects must be considered before using a device in a space application.

Total Ionizing Dose (TID). Total ionizing dose is the long-term ionizing damage to a semiconductor device caused by high-energy protons and electrons. Exposure to high-energy ionizing radiation generates electron-hole pairs within the oxide of a


MOS device. These generated carriers cause a buildup of charge within the oxide, which changes the threshold voltage, increases the leakage current, and modifies the timing of the MOS transistors. Ultimately, ionizing radiation will cause functional failures within the device.

Single-Event Latchup (SEL). Single-event latchup is a potentially destructive condition in which a single charged particle induces latchup within a CMOS device. With enough energy, a charged particle may trigger the parasitic npn-pnp circuit found within CMOS circuits. Once in latchup, high currents flow through the parasitic bipolar transistors and can destroy the device.

Single-Event Upsets (SEU). A single-event upset is the change in state of a digital memory element caused by an ionizing particle. As the ionizing particle passes through the device, charge can be transferred from one node to another. This charge transfer can lower the voltage of a memory cell and change its internal state. These single-event upsets are soft errors that do not cause any permanent damage within the device.

amount of memory state within a relatively small amount of circuit area. Much like SRAM and DRAM, SRAM-based FPGAs contain large numbers of memory cells and are especially sensitive to radiation-induced SEUs. As suggested in Table 2, the Virtex V1000 FPGA contains almost six million bits of internal state. This known internal state is used for the following important purposes:

User Flip-Flops. An important architectural component of all FPGAs is the user-programmable flip-flop. User designs exploit these flip-flops to implement common sequential logic circuits such as state machines, counters, and registers. User flip-flops in most digital technologies are susceptible to radiation-induced single-event upsets. Many digital circuits operating in a radiation environment exploit redundancy (i.e., multiple flip-flops) to mitigate such single-event effects.

User Memory. Modern FPGAs provide blocks of internal memory larger than the typical look-up table. This block memory is used for traditional random access memory functions such as data storage, buffering, and FIFOs. The Virtex family includes a set of internal dual-ported BlockRAM memories that each provide 4096 bits of randomly accessible memory. Dense static memory such as the BlockRAM is especially susceptible to radiation-induced SEUs. Well-known error-correction coding techniques are often used within a user design to detect and correct such upsets [8].

Configuration Memory. A large number of memory cells is required to define the operation of the user-designed FPGA circuit. These memory cells define the operation of the configurable logic blocks, routing resources, input/output blocks, and other programmable FPGA resources. The use of static memory cells for configuration storage allows the device to be reprogrammed as often as necessary by loading a new configuration. Like other static memory cells, configuration memory is susceptible to single-event upsets.
Upsets within the configuration memory are especially troublesome as they may change the operation of the circuit. Several techniques have been proposed for detecting and mitigating such upsets.

Half-Latches. Another form of internal state found within the Virtex FPGA is the half-latch structure. Half-latch structures are used to generate many of the constant 0 and 1 logic values used throughout a


user FPGA design. For example, the half-latch in Figure 1 generates a constant 1 for the clock-enable signal of a user flip-flop. Unlike other internal state, half-latches are not visible to the circuit or the user. Because of this lack of visibility, upsets within a half-latch cannot be detected. To prevent undetectable half-latch upsets from occurring, half-latch structures must be removed from a design [9].

BACKGROUND

Field programmable gate arrays (FPGAs) have been successfully used for more than a decade. The use of SRAM-based programmable devices is convenient because of their high flexibility in meeting multiple requirements such as cost, performance, and turnaround time. Despite these attractive characteristics, few non-radiation-hardened SRAM-based programmable devices have been used for safety- and mission-critical applications until now, due to their sensitivity to Single Event Upsets (SEUs) induced by radiation. When a flux of highly energized particles hits the surface of a non-radiation-hardened SRAM-based FPGA, the mapped circuit can change its behavior, even drastically. In SRAM-based FPGAs, the mapped circuit is totally controlled by the configuration memory, which is composed of static RAM cells. Notably, the effects induced by an SEU affecting the configuration memory are permanent, since the SEU changes the mapped circuit until the device is programmed again. The result of an SEU that causes the device to stop operating properly is generally defined as a Single Event Functional Interrupt (SEFI). One possible solution to this problem is to use radiation-hardened FPGAs, but since these devices are very expensive, alternative solutions that allow the use of non-radiation-hardened devices are currently being investigated. Triple Module Redundancy (TMR) is often exploited for hardening digital logic against SEUs in safety-critical applications.
For instance, TMR is often exploited to design fault-tolerant memory elements to be employed in sequential digital logic. Unfortunately, non-radiation-hardened FPGAs provide insufficient protection of memory elements in both the mapped circuit and the configuration memory. As a result, particles hitting the configuration memory can dramatically change the logic functionality of the mapped circuit, as well as the circuit's memory elements. Techniques are therefore required to evaluate the impact of SEUs affecting the FPGA's configuration memory, and to avoid undesired changes to the circuit mapped on the FPGA. Besides the investigations into the effects induced by SEUs, several hardening techniques have been proposed in recent years to limit the impact of SEUs on the behavior of the implemented circuits. Some of them aim at correcting the effects of SEUs in the device configuration memory. For example, the technique known as scrubbing consists of periodically reloading the whole content of the configuration memory. A more complex scheme for correcting the information in the configuration memory exploits the readback and partial reconfiguration processes. Through the readback operation, the content of the FPGA's configuration memory is read and compared with the expected value, which is stored in a dedicated memory located outside the FPGA. As soon as a mismatch is found, the correct information is downloaded into the FPGA's memory. During reconfiguration, only the faulty portion of the configuration memory is rewritten. Alternative techniques have also been proposed that do not aim at identifying and correcting the modifications introduced by SEUs, but only at preventing the propagation of SEU effects to the observable outputs, mainly by introducing hardware redundancy in the circuit mapped on the FPGA, as in the case of Triple Module Redundancy (TMR).
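The readback and partial reconfiguration scheme described in this paragraph can be sketched in Python as follows. This is a simplified software model, not vendor tooling: the function name `scrub`, the representation of the configuration memory as a list of byte frames, and the frame contents are all assumptions made for illustration.

```python
from typing import List


def scrub(config_frames: List[bytearray], golden_frames: List[bytes]) -> List[int]:
    """Compare each frame read back from the device with its golden copy
    and rewrite only the frames that differ.

    Returns the indices of the frames that were repaired.
    """
    repaired = []
    for index, (frame, golden) in enumerate(zip(config_frames, golden_frames)):
        # Readback: compare the device frame with the externally stored copy.
        if bytes(frame) != golden:
            # Partial reconfiguration: rewrite only the mismatching frame.
            config_frames[index] = bytearray(golden)
            repaired.append(index)
    return repaired


# Hypothetical golden configuration held in a memory outside the FPGA.
golden_frames = [bytes([0xA5] * 4), bytes([0x3C] * 4), bytes([0xFF] * 4)]

# Device configuration memory, initially identical to the golden copy.
device_frames = [bytearray(frame) for frame in golden_frames]

# Model an SEU as a single inverted bit in frame 1.
device_frames[1][2] ^= 0x10

first_pass = scrub(device_frames, golden_frames)
second_pass = scrub(device_frames, golden_frames)
print(first_pass, second_pass)
```

In this model only the corrupted frame is rewritten on the first pass, and a second pass finds nothing to repair. Plain scrubbing, by contrast, would reload every frame regardless of the comparison.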
The basic idea of the TMR scheme is that a circuit can be hardened against SEUs by designing three copies of the same circuit and building a majority voter on the outputs of the


replicated circuits. Implementing triple redundant circuits to prevent the effects of SEUs in other technologies, such as ASICs, is generally limited to protecting only the memory elements, because the combinational logic is hard-wired and corresponds to non-configurable gates. Conversely, full module redundancy is required in FPGAs, because memory elements, interconnections, and combinational gates are all susceptible to SEUs. This means that three full copies of the user's design have to be implemented to harden the circuit against SEUs. To prevent fault accumulation, TMR is often coupled with techniques such as scrubbing or readback and partial reconfiguration to remove SEUs from the FPGA's configuration memory. Although effective, the overhead that TMR imposes may exceed the available resources (for example, the number of available I/O pads), and thus there are applications where it can hardly be exploited. To solve this problem, a new method was proposed in [13], aiming at reducing the overhead of a full TMR implementation. Even if optimized, these kinds of methods come with very high design penalties: besides the area overhead due to the TMR design, removing SEUs from the configuration memory requires an ad hoc circuit to support the readback and partial reconfiguration procedures, as well as additional energy consumption. The optimal implementation of the TMR circuitry inside SRAM-based FPGAs depends on the type of circuit that is mapped on the FPGA device. As illustrated in [3], the logic may be grouped into four different types of structure: throughput logic, state-machine logic, I/O logic, and special features (embedded RAM modules, DLLs, etc.). The results of several radiation campaigns performed to understand the effects that radiation-induced faults have on the behavior of circuits implemented in SRAM-based FPGAs have been reported in many research papers.
It was observed that, in general, radiation leads to a loss of the correct functionality of the circuits, an effect defined as a Single Event Functional Interrupt (SEFI). Several fault injection approaches, as alternatives to the always expensive radiation campaigns, can also be found in the literature. A comparatively cheaper alternative is the use of electromagnetic interference to conduct contactless fault injection. Such interference is a common disturbance in automotive vehicles, trains, airplanes, and industrial plants, and the technique is widely used to stress digital equipment. Thanks to the availability of commercial burst generators, this technique is easy to implement. A different approach is the use of emulation techniques. Bit-flips are injected by direct manipulation of the FPGA's configuration memory bitstream, either through changes in the original configuration bitstream or at run time through dynamic reconfiguration. The greatest advantage of the emulation method is the higher controllability of the experiments, in contrast to the unpredictability of radiation or electromagnetic interference fault injection, which enables a better diagnosis of the effects of each SEU.

3. Examples of Future Application Areas

Multi-carrier spread spectrum concepts have been developed for a wide variety of applications, including cellular mobile radio, DVB-T return link, MMDS/LMDS (FWA), and aeronautical communications.

4. Research Goals

The aim of this research is to define a set of rules for a new framework for implementing highly critical real-time integrated systems based on dynamically reconfigurable FPGAs. The goal is to make these systems immune to faults arising from memory bit-flips by confining,


detecting, locating, and mitigating them. This work is part of a broader project aimed at the design of FPGA-based self-repairing circuits. The proposed framework is built around a customized Triple Modular Redundancy implementation associated with a fault detection-and-fix controller. This controller would be responsible for:
i. Detecting data incoherencies
ii. Locating the faulty redundant module
iii. Restoring the original module configuration, fixing it without affecting the normal operation of the functional logic
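The three controller responsibilities listed above can be sketched in Python as a minimal behavioral model. It is not the proposed controller itself: the names `locate_faulty_module` and `check_and_repair`, the integer module outputs, and the `reconfigure` callback standing in for partial reconfiguration are all hypothetical.

```python
from typing import Callable, Optional, Sequence


def locate_faulty_module(outputs: Sequence[int]) -> Optional[int]:
    """Detect a data incoherency among three redundant outputs and
    return the index of the dissenting module, or None if all agree.

    Raises ValueError when no two modules agree, since a single
    faulty module can no longer be identified.
    """
    a, b, c = outputs
    if a == b == c:
        return None
    if a == b:
        return 2
    if a == c:
        return 1
    if b == c:
        return 0
    raise ValueError("no majority: more than one module is faulty")


def check_and_repair(outputs: Sequence[int],
                     reconfigure: Callable[[int], None]) -> Optional[int]:
    """Run one detect, locate, and restore step of the controller model."""
    faulty = locate_faulty_module(outputs)
    if faulty is not None:
        # Assumption: the callback rewrites only the faulty module, so the
        # two healthy modules keep operating during the repair.
        reconfigure(faulty)
    return faulty


# Record which modules the model asks to reconfigure.
reconfigured = []
result = check_and_repair([0x5A, 0x5A, 0x1A], reconfigured.append)
print(result, reconfigured)
```

Comparing the outputs pairwise, rather than only voting on them, is what allows the model to name the faulty module so that only that module is restored.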

5. Software

To simulate the results of this research, I will use MATLAB version 6.0 or later, possibly supplemented to some extent by other software. MATLAB, short for "matrix laboratory," is a popular and powerful tool widely used for scientific research problems. It offers both a command-line and a GUI environment, and it includes built-in tools such as the CDMA tool, control tool, and communication tool that make it easier to simulate engineering research problems.

6. Conclusions

The high spectral efficiency and the low receiver complexity of MC-CDMA make it a good candidate for the downlink of a cellular system.

The low peak-to-average power ratio (PAPR) of MC-DS-CDMA makes it more appropriate for the uplink of a multiuser system.

7. References

[1] N. Yee, J. P. Linnartz, and G. Fettweis, "Multicarrier CDMA in indoor wireless radio networks," in Proc. of IEEE PIMRC '93, Sept. 1993, pp. 468-472.
[2] Li Z. and Matti L. A., "MMSE Based Receiver Design for MC-CDMA Systems," in Proc. of 14th Intl. Symp. on PIMRC, pp. 2640-2644.
[3] K. Fazel and S. Kaiser (Eds.), Multi-Carrier Spread Spectrum and Related Topics. Boston: Kluwer Academic Publishers, 2002.

