Countermeasures

Security and trust are becoming increasingly important as microelectronic systems continue to find additional applications in industry, defense, aerospace, data centers, supervisory control and data acquisition (SCADA) environments, and health care devices. The unsupervised nature of many newer devices and applications, in combination with internet connectivity, makes these systems more vulnerable to attacks and subversion.

IoT devices maintain the integrity and authenticity of their data and the privacy of their communication by leveraging cryptographic algorithms as primitives within authentication and encryption protocols. Nearly all cryptographic algorithms are based on secret keys. Algorithms such as the Advanced Encryption Standard (AES) are engineered to make key recovery through cryptanalysis or brute-force attacks nearly impossible in reasonable time frames. The cryptographic strength of an algorithm assumes that an attacker can only control inputs and observe encrypted (ciphertext) outputs. Although it is possible to maintain this assumption in many cryptographic hardware implementations, the proliferation of IoT devices to unsupervised, remote environments has widened the attack surface by enabling adversaries to more easily gain physical access to a device.

All physical devices leak information about the operations they are performing and the data they are processing. The term “side-channel” has been coined to represent sources of information that leak from physical devices through channels other than the digital inputs and outputs specified by the algorithm. In many cases, these side-channel signals provide high-resolution temporal and spatial information that reflects the logic operations occurring within the device as it executes the algorithm. Unfortunately, these leakage channels are not considered in the design of most cryptographic algorithms. In fact, algorithm designers assume that internal data (intermediates) generated during execution are unobservable, much like the cryptographic key. Knowledge of intermediates provides a means for an adversary to reverse-engineer the key in a substantially shorter time than cryptanalytic or brute-force attacks require. Moreover, relatively inexpensive benchtop laboratory equipment can be used to measure side-channel signals.

The most direct source of leakage is the power supply, i.e., the static and dynamic power consumption of the device. However, localized electromagnetic (EM) emanations, acoustic signatures, and temperature profiles can also be measured to provide additional spatial and temporal dimensions to the operational activity occurring within the device. The process of measuring and analyzing side-channel signals represents a passive attack because no changes occur to the operational behavior of the device, and the measurement process is nearly impossible to detect within the device.

IC-Safety’s countermeasure to side-channel attacks is called Side-channel Power analysis Resistance for Encryption Algorithms using Dynamic partial reconfiguration (SPREAD). Here, countermeasures refer to modifications in the cryptographic implementation or changes to the operational behavior of a device that obscure the relationship between side-channel signals and the actual operations carried out on internal device data. SPREAD is implemented on field programmable gate arrays (FPGAs) or eFPGAs that are capable of dynamic partial reconfiguration (DPR). DPR refers to the process of modifying a portion of the FPGA’s implementation in real time, while the remaining components of the device run at full speed.

The DPR operation carried out by SPREAD removes a fundamental assumption on which the success of side-channel attacks (SCA) depends, namely that the underlying hardware configuration of the device remains invariant over time. The periodic reconfiguration of the AES engine increases the difficulty of correlation power analysis (CPA) by making it harder to isolate the power consumption associated with specific operations on specific bytes of the secret key. CPA requires the acquisition and averaging of multiple power transient waveforms as a means of reducing measurement noise that is not related to the target key byte. Collecting multiple power transient waveforms takes a finite amount of time, and the success of CPA in deducing the correct key bytes increases with the number of power trace waveforms collected. We leverage this limitation of CPA (and other SCAs) by rapidly changing the implementation characteristics of the SBOXs in the AES engine data path.
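To make the dependence of CPA on many traces and on an invariant implementation concrete, the following is a minimal simulation sketch rather than a description of any measurement setup used for SPREAD. It assumes a Hamming-weight leakage model for the SBOX output, Gaussian measurement noise, and a single fixed SBOX implementation; the names `simulate_traces` and `cpa_recover_key_byte`, the key byte value, and the trace count are all illustrative assumptions.

```python
import random

def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def build_sbox():
    """Derive the AES SBOX: field inverse followed by the affine transform."""
    sbox = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()

def hamming_weight(x):
    return bin(x).count("1")

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_traces(key_byte, n_traces, noise_sigma, rng):
    """Assumed leakage: one power sample per encryption, equal to
    HW(SBOX[plaintext ^ key]) plus Gaussian noise."""
    plaintexts = [rng.randrange(256) for _ in range(n_traces)]
    traces = [hamming_weight(SBOX[p ^ key_byte]) + rng.gauss(0.0, noise_sigma)
              for p in plaintexts]
    return plaintexts, traces

def cpa_recover_key_byte(plaintexts, traces):
    """Return the key guess whose predicted leakage correlates best."""
    def score(guess):
        model = [hamming_weight(SBOX[p ^ guess]) for p in plaintexts]
        return abs(pearson(model, traces))
    return max(range(256), key=score)

TRUE_KEY_BYTE = 0x3C  # illustrative value only
pts, trs = simulate_traces(TRUE_KEY_BYTE, 300, 1.0, random.Random(1))
recovered = cpa_recover_key_byte(pts, trs)
```

The attack in this toy model works because every trace is generated by the same leakage function, so the correlation for the correct guess accumulates as traces are added. That invariance is the assumption that periodic reconfiguration of the SBOXs is intended to break.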

The SPREAD engine is implemented as a state machine running in the programmable logic with AES. SPREAD self-reconfigures the SBOX components located in the AES data path, while simultaneously enabling encryption or decryption operations to proceed at full speed and in parallel. This is accomplished by introducing two redundant copies of the SBOX into the architectural design. The SPREAD engine applies control signals to a set of MUXs that re-wire the data path of the AES engine to remove one of the redundant SBOXs, and then reprograms it with a partial bitstream that contains an AES SBOX that is functionally equivalent, but contains different logic and routing structures. The SBOX is then re-inserted into the AES data path and another SBOX is randomly selected as the redundant copy for reprogramming. The second redundant copy enables higher frequency shifting of different SBOX implementations into adjacent positions in the data path, i.e., before the start of each plaintext encryption. Self-reconfiguration can be carried out quickly using the proposed MUXing scheme, eliminating the need to stall the encryption engine per mutation.
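The reprogram, re-insert, and reselect cycle described above can be sketched as a small software model. This is only an illustration of the control flow, not the hardware state machine itself: the class name `MutationScheduler`, the initial placement of variants, the library size of 18 variants, and the use of Python's `random` module in place of a hardware random source are all assumptions.

```python
import random

NUM_REGIONS = 18   # 16 SBOXs in the data path + 2 redundant copies (from the text)
NUM_VARIANTS = 18  # assumed size of the diversified SBOX library

class MutationScheduler:
    """Toy model of the reprogram / re-insert / reselect cycle."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        # Hypothetical initial state: variant i is loaded in region i.
        self.variant_in_region = list(range(NUM_REGIONS))
        # Two regions start MUXed out of the data path: one is the DPR
        # target, the other is the spare used for shifting (its movement
        # is not modeled in this sketch).
        self.dpr_target = 0
        self.shift_spare = 1

    def active_regions(self):
        """Regions currently wired into the AES data path."""
        out = {self.dpr_target, self.shift_spare}
        return [r for r in range(NUM_REGIONS) if r not in out]

    def mutate_once(self):
        region = self.dpr_target
        old = self.variant_in_region[region]
        # Step 1: reprogram the MUXed-out region with a different,
        # functionally equivalent SBOX variant.
        new = self.rng.choice([v for v in range(NUM_VARIANTS) if v != old])
        self.variant_in_region[region] = new
        # Steps 2 and 3: re-insert the region and randomly select one of
        # the regions currently in the data path as the next DPR target.
        self.dpr_target = self.rng.choice(self.active_regions())
        return region, old, new

sched = MutationScheduler(seed=7)
for _ in range(5):
    region, old, new = sched.mutate_once()
    print(f"region {region}: variant {old} -> {new}; next target {sched.dpr_target}")
```

In this model exactly 16 regions remain in the data path after every step, which mirrors the claim that encryption never needs to stall while a region is being reprogrammed.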

SPREAD’s strategy to utilize a moving target architecture can be classified as a hybrid of “noise enhancing” and “signal reducing” countermeasures. The noise introduced into the power traces is best characterized as information from uncorrelated signals. In SPREAD, the power signal associated with a key-byte/SBOX combination varies across a set of diversified SBOX implementations. Therefore, for each plaintext encryption operation, both the plaintext and SBOX hardware configuration can change and only the key bytes remain constant. Since the key bytes themselves do not leak information into the power transients on their own, and the leakage occurs only when they propagate along combinatorial logic paths, the time-varying structural changes in the different SBOX logic paths diffuse the power signal contributions from each key byte in the integrated power transient waveform.

SPREAD Technology Overview

SBOX Reconfiguration Regions

A block diagram of the proposed moving target architecture is shown in the following figure, where we show only the SBOX portion of the 128-bit version of the AES algorithm. Any of the 16 SBOX regions can be dynamically reprogrammed on the fly by the DPR engine. To enable DPR and encryption operations to proceed in parallel, we add two additional redundant SBOXs and a MUXing structure to the architecture. The MUXs allow one of the redundant SBOXs to be shifted to the left or right as a means of shuffling data processing to neighboring SBOX locations. The second redundant SBOX is used as the target region of a DPR operation. The location of the redundant SBOXs is determined by the DPR engine which generates the MUX control signals, “MUX_ctrl”. The configuration shown in the figure illustrates a case where SBOX_14 and SBOX_0 have been dynamically MUXed out of the data path.
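One way to picture the MUXing structure is as a mapping from the 16 logical byte lanes of the AES state to the 18 physical SBOX regions. The sketch below assumes that regions are numbered SBOX_0 through SBOX_17 and that routing is order-preserving, so each lane is served by its own region or by one of the next two neighbors. The function name `route_lanes` and the encoding of `MUX_ctrl` as a per-lane offset are hypothetical and are not taken from the SPREAD design.

```python
NUM_REGIONS = 18  # physical SBOX reconfiguration regions (from the text)
NUM_LANES = 16    # byte lanes of the 128-bit AES state

def route_lanes(muxed_out):
    """Map each logical byte lane to a physical SBOX region.

    muxed_out: the two region indices currently removed from the data path.
    Returns (regions, mux_ctrl), where regions[lane] is the physical region
    serving that lane and mux_ctrl[lane] is its offset (0, 1, or 2) from
    the lane's home position.
    """
    out = set(muxed_out)
    if len(out) != NUM_REGIONS - NUM_LANES:
        raise ValueError("exactly two distinct regions must be MUXed out")
    if any(not 0 <= r < NUM_REGIONS for r in out):
        raise ValueError("region index out of range")
    # Order-preserving routing (an assumption): lane i uses the i-th
    # region that is still in the data path.
    regions = [r for r in range(NUM_REGIONS) if r not in out]
    mux_ctrl = [region - lane for lane, region in enumerate(regions)]
    return regions, mux_ctrl

# The configuration described in the text: SBOX_14 and SBOX_0 MUXed out.
regions, mux_ctrl = route_lanes({14, 0})
```

Under these assumptions, removing SBOX_0 shifts lanes 0 through 12 one position to the right, and removing SBOX_14 as well shifts lanes 13 through 15 two positions, so a three-input MUX per lane would be sufficient.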

The remaining functional components of the AES engine, including Mix-Columns, Shift-Rows and Add-Round, are connected combinatorially to the outputs of the SBOXs and are included in the top-level static design. It is possible to create reconfiguration regions and a MUXing structure for these other functional components, i.e., the four Mix-Columns or the Add-Round components. However, the more complex delay diversity introduced along paths within the diversified SBOXs already propagates into these components, effectively decorrelating any side-channel signals that would be leveraged by CPA.

Relocatable Partial Bitstreams

It is important to note that the standard Vivado tool flow does not support relocatable partial bitstreams. Instead, a separate partial bitstream is required for each SBOX reconfiguration region, which increases the storage requirements, e.g., 18 × 18 = 324 partial bitstreams (assuming each of the 18 diversified SBOXs is compiled separately for each of the 18 regions). As an alternative, Vivado Tcl commands are utilized to allow any of the diversified SBOX implementations to be programmed into any of the 18 reconfiguration regions. This reduces the storage requirements to only 18 partial bitstreams. In one possible, highly compact implementation of the SPREAD engine, the set of 18 diversified SBOXs is instantiated into the DPR regions of the bitstream used at boot time, and only a small BRAM buffer is needed to enable the SPREAD engine to perform DPR operations. Here, the BRAM buffer is used as temporary storage to allow partial bitstreams to be shuffled between reconfiguration regions.
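The storage arithmetic above can be restated as a short calculation. The helper name `bitstreams_required` is illustrative; the only inputs taken from the text are the 18 regions and 18 diversified SBOX variants.

```python
def bitstreams_required(num_regions, num_variants, relocatable):
    """Number of partial bitstreams that must be stored.

    Without relocation, every variant must be compiled once per region.
    With relocation, one copy of each variant can target any region.
    """
    if relocatable:
        return num_variants
    return num_regions * num_variants

per_region = bitstreams_required(18, 18, relocatable=False)
relocated = bitstreams_required(18, 18, relocatable=True)
reduction_factor = per_region // relocated
```

With these inputs the relocatable scheme stores a factor of 18 fewer bitstreams, and the saving grows linearly with the number of reconfiguration regions.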

SPREAD Architecture

A block diagram of the SPREAD engine utilized in our experiments is shown below. The system consists of an AES engine with 18 SBOX reconfiguration regions, an Internal Configuration Access Port (ICAP) controller, an on-chip block RAM (BRAM) storing the 18 partial bitstreams, and a master controller state machine. The ICAP controller handles self-reconfiguration of the FPGA by reading partial bitstreams from the BRAM, adjusting the frame address, and then writing the bitstream data into the ICAP. The master controller synchronizes the ICAP operations and configures the SBOX control signals. The partial bitstreams are stored within a BRAM of the SPREAD bitstream using Vivado’s coefficient file (COE) functionality.
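The read, adjust, and write sequence performed by the ICAP controller can be sketched in software as follows. This is a hedged illustration, not the SPREAD controller: it assumes a Xilinx 7-series-style configuration packet format in which a Type 1 write of one word to the frame address register (FAR) has the header `0x30002001`, it ignores any CRC words that a real partial bitstream may contain, and the frame address, the toy bitstream contents, and the `icap_write` callback are placeholders.

```python
FAR_WRITE_HEADER = 0x30002001  # assumed: Type 1 packet, write 1 word to FAR

def retarget_bitstream(words, new_far):
    """Return a copy of a partial bitstream whose FAR writes point at new_far.

    words: the bitstream as a list of 32-bit integers.
    A naive scan is used here; a real implementation would parse packets so
    that frame data matching the header value is not patched by mistake.
    """
    out = list(words)
    patched = 0
    i = 0
    while i < len(out) - 1:
        if out[i] == FAR_WRITE_HEADER:
            out[i + 1] = new_far & 0xFFFFFFFF
            patched += 1
            i += 2  # skip the address word that was just rewritten
        else:
            i += 1
    if patched == 0:
        raise ValueError("no FAR write found in bitstream")
    return out

def reconfigure_region(bram, variant, region_far, icap_write):
    """Read one variant from BRAM, retarget it, and stream it to the ICAP."""
    for word in retarget_bitstream(bram[variant], region_far):
        icap_write(word)

# Toy example: one placeholder bitstream and a list standing in for the ICAP.
toy_bram = {0: [0xFFFFFFFF, 0xAA995566, FAR_WRITE_HEADER, 0x00000000, 0x12345678]}
icap_log = []
HYPOTHETICAL_REGION_FAR = 0x00400100  # placeholder frame address
reconfigure_region(toy_bram, 0, HYPOTHETICAL_REGION_FAR, icap_log.append)
```

Keeping the frame address patch separate from the ICAP write loop mirrors the division of labor in the text: the stored bitstream stays region-independent, and only the address word changes per target region.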