I) Energy-Performance Optimization of the Uncore in Multi-cores

After the multi-core revolution, the continuous evolution of market applications and technological devices has posed new challenges to hardware manufacturers, who must meet ever-increasing low-power and performance requirements. In this scenario, the focus of design architects has shifted from the computational logic to the uncore subsystems, namely the interconnect and the cache hierarchy, to sustain the required data communication. Moreover, the uncore logic strongly affects system-wide performance, besides being the primary source of power consumption in the chip. This branch of my research mainly focuses on Networks-on-Chip (NoCs) as the standard on-chip interconnection fabric in multi-cores, and also addresses the design of novel cache coherence protocols to support the physically distributed, logically shared cache hierarchy. Both power gating and Dynamic Voltage and Frequency Scaling (DVFS) methodologies for NoCs have been explored to optimize the energy-performance trade-off. On the cache side, several optimized cache coherence protocols have been explored to support the Dynamic Non-Uniform Cache Access (DNUCA) architecture, as well as the possibility of power gating selected L2 cache banks during CPU-intensive periods of the application execution. This research builds on the novel low-power design perspective of acting on cache resources, rather than on computational logic, to save energy. This part of my research strongly relies on ad-hoc cycle-accurate simulation tools that integrate the functional and power models of the multi-core components side by side with the modeled DVFS and power-gating actuators. This imposes a balanced effort between the evaluation of new architectural solutions and the need to deeply customize state-of-the-art research simulators to accurately model the additional components.
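As a rough illustration of the energy-performance trade-off that DVFS exposes, the sketch below uses the textbook first-order model of dynamic power, P = a * C * V^2 * f. It is not taken from the simulators mentioned above: the function names and all numeric values are hypothetical, the workload is assumed to take a fixed number of cycles, and static (leakage) power, which is the component that power gating targets, is ignored.

```python
# First-order dynamic power model (textbook approximation, leakage ignored).
# All names and numbers are illustrative assumptions, not measured values.

def dynamic_power(activity, capacitance, voltage, frequency):
    """Dynamic power in watts: P = a * C * V^2 * f."""
    return activity * capacitance * voltage ** 2 * frequency


def run_time(cycles, frequency):
    """Execution time in seconds for a workload of a fixed cycle count."""
    return cycles / frequency


def dynamic_energy(activity, capacitance, voltage, frequency, cycles):
    """Dynamic energy in joules: power multiplied by execution time."""
    power = dynamic_power(activity, capacitance, voltage, frequency)
    return power * run_time(cycles, frequency)


if __name__ == "__main__":
    # Hypothetical operating points: nominal, and both V and f scaled to 80%.
    a, c, cycles = 0.2, 1e-9, 1e9
    nominal = (1.0, 1e9)   # (volts, hertz)
    scaled = (0.8, 0.8e9)
    for label, (v, f) in (("nominal", nominal), ("scaled", scaled)):
        print(label,
              "power [W]:", dynamic_power(a, c, v, f),
              "time [s]:", run_time(cycles, f),
              "energy [J]:", dynamic_energy(a, c, v, f, cycles))
```

Under these assumptions, lowering both voltage and frequency reduces power cubically and energy quadratically, while the execution time grows in inverse proportion to the frequency. This is the trade-off a DVFS policy has to balance.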


II) Designing Secure Computer Architectures in the IoT and Beyond

The security of modern cryptographic schemes relies on the secrecy of the key, rather than on restricted knowledge of the encryption algorithm, which is assumed to be known to the attacker (Kerckhoffs’ principle). A generic encryption algorithm takes a secret key and a plaintext as inputs and outputs the ciphertext. Conversely, the decryption algorithm takes the secret key and the ciphertext as inputs and outputs the plaintext. Traditionally, cryptanalysis is used to mathematically prove that an encryption algorithm contains no weaknesses that would allow the plaintext to be retrieved from the ciphertext without knowing the secret key. The Internet of Things (IoT) revolution highlights a tightly connected digital world and pushes to the limit the development of these cryptographic algorithms on low-cost hardware, e.g., smart cards and microcontrollers, as well as smartphones and tablets. In this scenario, the traditional cryptanalysis techniques that secure the so-called main channel of information, i.e., that prove the hardness of the encryption algorithm against the most computationally powerful attacker, are no longer enough. In fact, the IoT makes it extremely easy for the attacker to gain physical access to these low-cost devices and to take measurements of their physical variables, e.g., power consumption. The values of these physical variables are known as side-channel information: they do not directly reveal the secret key, but they strongly depend on how the target device implements the cryptographic scheme as well as on the processed data, i.e., plaintext and secret key. In particular, Side-Channel Attacks (SCAs) have emerged as a family of cryptographic attacks that exploit the side-channel information extracted from a device running the target cryptographic scheme to retrieve the secret key.
SCAs open a new security dimension that intimately depends on the implementation of the cryptosystem, and for which no cryptanalytic proof is of any help. Although several, mostly software-based, countermeasures to SCAs have been proposed in the open literature, the hardware design side of this research area is mostly unexplored due to the high degree of cross-disciplinary skills involved, i.e., hardware design architects, low-level software engineers and cryptographers have to coexist within the same research team. However, a fresh, security-aware hardware design methodology is of paramount importance: without security-related guarantees at the platform level, even protected software implementations remain vulnerable to SCAs. This branch of my research is supported by the strong collaboration with Prof. Alessandro Barenghi and Prof. Gerardo Pelosi from the cryptanalysis research group at Politecnico di Milano. In particular, the research is mainly focused on the hardware analysis and the design of countermeasures for both embedded CPUs and cryptographic hardware accelerators against Differential Power Analysis (DPA) and template-based side-channel attacks. The critical novelty of the research lies in the analysis of the hardware device at the gate level to extract the information leakage model that is later used to develop the countermeasure. A significant outcome of the research is the possibility, for the first time to the best of my knowledge, to exactly pinpoint the information leakage sources in an embedded CPU, and consequently to revise the hardware design to secure it against DPA-based attacks.
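To make the idea of a power-based side-channel attack concrete, the following minimal sketch runs a correlation-based variant of DPA (often called CPA) on simulated traces. Everything in it is an assumption made for illustration: the device is reduced to a toy 4-bit S-box lookup (the S-box of the PRESENT cipher), the leakage is assumed to be the Hamming weight of the S-box output plus Gaussian noise, and all function names are hypothetical. It does not reproduce the gate-level leakage models described above.

```python
import random

# Toy 4-bit S-box (the PRESENT cipher S-box), chosen only to keep the example small.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(value):
    """Number of set bits: the leakage model assumed in this sketch."""
    return bin(value).count("1")


def simulate_traces(secret_key, n_traces, noise_sigma, rng):
    """Simulate one power sample per encryption of a random 4-bit plaintext."""
    plaintexts = [rng.randrange(16) for _ in range(n_traces)]
    samples = [hamming_weight(SBOX[p ^ secret_key]) + rng.gauss(0.0, noise_sigma)
               for p in plaintexts]
    return plaintexts, samples


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def recover_key(plaintexts, samples):
    """Rank every key guess by how well its predicted leakage matches the samples."""
    scores = {}
    for guess in range(16):
        predicted = [hamming_weight(SBOX[p ^ guess]) for p in plaintexts]
        scores[guess] = pearson(predicted, samples)
    best_guess = max(scores, key=scores.get)
    return best_guess, scores


if __name__ == "__main__":
    rng = random.Random(2017)  # fixed seed so the run is repeatable
    plaintexts, samples = simulate_traces(0xB, 2000, 1.0, rng)
    best_guess, _ = recover_key(plaintexts, samples)
    print("recovered key nibble:", hex(best_guess))
```

The attacker never observes the key directly. The key is recovered only because the simulated power samples depend on the data being processed, which is the kind of dependence a hardware countermeasure aims to remove or hide.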


III) Design and Verification of Power Efficient Embedded Multi-Cores and Gate-Level Tools

The gate-level simulation of computing devices enables accurate power, area and timing estimates that can also be back-annotated in the design space exploration (DSE) flow to increase its accuracy. Moreover, the hardware prototype can be leveraged to support cross-disciplinary research, e.g., the design of security-aware computer architectures. My research in this area is focused on the design of low-power, cache-coherent embedded multi-cores starting from open source embedded CPUs, with particular emphasis on the synergistic interaction between the interconnect and the cache hierarchy. The hardware design and verification of novel NoC architectures represents an important piece of this research, while the critical objective is to deliver an open source, cache-coherent embedded multi-core for education and to enable further research. At this stage, the research outcome is a dual-core architecture that implements a simple Valid-Invalid (VI), snooping-based coherence protocol. The architecture is scheduled to be used in the analysis of side-channel vulnerabilities from the security viewpoint. Starting from the hardware design, a set of companion tools for gate-level verification and analysis has been developed. In particular, a power analysis toolchain for gate-level netlists has been developed for both the ASIC and the FPGA hardware design flows. It accurately estimates the power consumed by any synthesized architecture with a configurable granularity in the order of tenths of nanoseconds. The tool has been used to support the SCA vulnerability analysis and is currently used to develop a performance-counter power model for in-order embedded CPUs within the EU MANGO project.
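The behavior of a Valid-Invalid, snooping-based protocol can be sketched with a small behavioral model. This is only an illustration under stated assumptions, not the RTL of the dual-core design: caches are assumed to be write-through and write-allocate, every write is broadcast on a shared bus, and the `Bus` and `Cache` classes are hypothetical names introduced for this example.

```python
VALID, INVALID = "V", "I"


class Bus:
    """Shared bus with main memory; every attached cache snoops the writes."""

    def __init__(self):
        self.memory = {}   # address -> value, unwritten locations read as 0
        self.caches = []

    def attach(self, cache):
        self.caches.append(cache)

    def read_memory(self, addr):
        return self.memory.get(addr, 0)

    def broadcast_write(self, writer, addr, value):
        # Write-through: memory is always updated (assumption of this sketch).
        self.memory[addr] = value
        # Every other cache snoops the write and invalidates its copy.
        for cache in self.caches:
            if cache is not writer:
                cache.snoop_write(addr)


class Cache:
    """Private cache whose lines are either Valid (present) or Invalid (absent)."""

    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.lines = {}    # address -> value, holds only lines in the Valid state
        bus.attach(self)

    def state(self, addr):
        return VALID if addr in self.lines else INVALID

    def read(self, addr):
        if addr not in self.lines:
            # Miss on an Invalid line: fetch from memory and move to Valid.
            self.lines[addr] = self.bus.read_memory(addr)
        return self.lines[addr]

    def write(self, addr, value):
        self.bus.broadcast_write(self, addr, value)
        self.lines[addr] = value   # the writer keeps a Valid copy (write-allocate)

    def snoop_write(self, addr):
        self.lines.pop(addr, None)  # Valid -> Invalid, no effect if already Invalid


if __name__ == "__main__":
    bus = Bus()
    core0, core1 = Cache("core0", bus), Cache("core1", bus)
    core0.read(0x10)
    core1.read(0x10)
    core0.write(0x10, 42)
    print("core1 line state after core0 write:", core1.state(0x10))
    print("core1 reads back:", core1.read(0x10))
```

With only two states, a remote write always forces the other core to re-fetch the line from memory. This keeps the protocol simple to design and verify, at the cost of extra memory traffic compared with protocols that track ownership.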

Ongoing Master's Degree Theses

This section reports the ongoing theses that I'm directly supervising with Prof. William Fornaciari. Each work addresses a specific issue in the computer architecture research field. In particular, all of the proposed projects tackle their research question through both software simulation and hardware implementation. Thus, a complete exploration considering area, power, performance and timing metrics is possible. For each thesis, a short title and the MS student's name are reported.

  • Adaptive Buffer Management Scheme for NoCs - Andrea Marchese
  • Smart Power Gating at Buffer Level for Power-Performance optimization in NoCs - Andrea Canidio
  • Intelligent End-to-End compression in NoCs - Fabio Pancot
  • Designing a fully synthesizable Flexible Network Interface Controller (NIC) for real NoCs - Luca Borghese

Prospective Students

We are always looking for determined and enthusiastic students who are willing to join Prof. Fornaciari's research group to explore the computer architecture field. We do our best to always have a few high-value MS theses available. The topics range from software simulation of multi-cores to RTL design, focusing on interconnect and memory hierarchy aspects in multi-cores.
Interested students can email me or Prof. Fornaciari directly to get a complete overview of the currently available research topics. Moreover, you are welcome to propose your own research project, which can be evaluated and refined to fit the Department's MS thesis requirements.

© 2017 Davide Zoni