Carnegie Mellon University

Radiation Hardening of Field Programmable Gate Arrays

thesis
posted on 2022-03-01, 22:07 authored by Ogun Kibar
Field programmable gate arrays (FPGAs) are programmable integrated circuits widely deployed in applications such as aerospace, automotive, medical, data center, and high-performance computing because of their compelling performance, energy and cost efficiency, and flexibility. Most FPGAs are reprogrammable, enabling designers to program them for many different tasks, fix bugs, and update applications post-deployment. Despite these benefits, reprogrammable FPGAs are susceptible to radiation effects like any other integrated circuit, and even more so because of their reprogrammable nature. Reprogrammable FPGAs typically store their configuration in SRAM-based memory cells that are vulnerable to single event upsets (SEUs): bit flips that can occur when high-energy particles strike the semiconductor and that can break the functionality of the application implemented on the FPGA.
Circuit elements have become denser with process scaling, so a single particle strike now encompasses a larger number of elements and multiple bit upsets have become common. As a result, the traditional countermeasures used to tackle SEUs in FPGAs must become more complex and costly or incur high latencies. In light of these trends, we follow an unorthodox approach and take advantage of the larger number of elements affected by a single particle strike. To achieve this, we introduce a new countermeasure: tunable SEU sensors distributed in the FPGA configuration memory that detect SEUs in nearby configuration cells with low latency. We expect the sensors to benefit from scaling, since decreased node spacing in advanced process nodes will likely lead to better correlation between configuration cells and sensors.
We deploy the SEU sensors in a proof-of-concept radiation-hardened FPGA that includes an on-chip error handler, which receives localized error flags from the sensors and works with an on-chip MRAM-based non-volatile memory to correct errors. The error handler also communicates with the fabric through dedicated fabric I/Os, allowing applications to request reconfiguration from the error handler; this path serves as a higher-latency safety net for the sensors. To demonstrate our design on silicon, we taped out a test chip in an industrial 22nm FinFET process with estimated area, power, and delay overheads of 23%, 5.8%, and 4.3%, respectively. Finally, we conducted heavy ion and neutron tests on our test chip to evaluate our radiation hardening features. Results show that the SEU sensors can detect more than 90% of errors in configuration cells, enabling them to prevent failures at the system level with 100x lower latency and 300x less system downtime at high LETs compared to our safety net.

History

Date

2021-07-15

Degree Type

  • Dissertation

Department

  • Electrical and Computer Engineering

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Kenneth Mai
