My research is primarily focused on the areas of Computer Architectures and Electronic Design Automation, with particular emphasis on the following research topics:

  • Power-Aware Computing for Embedded Architectures
    Power dissipation has been identified as the main performance limiter for high-performance processors, and it also limits the capabilities of battery-powered embedded mobile devices and wireless sensor nodes. The main goal of this research is to develop design techniques for the analysis and synthesis of digital circuits and systems that reduce power consumption at the higher levels of abstraction. The research activity focuses on the estimation and optimization of power dissipation. The proposed power estimation and optimization techniques mainly target embedded computing architectures based on VLIW (Very Long Instruction Word) pipelined processors. They have been applied to the Lx/ST200 family of VLIW embedded processor cores (developed as a partnership between HP Labs and STMicroelectronics). The ST200 family (including the ST210, ST220 and ST231 processor cores) is used today for embedded media processing in a variety of audio, video and imaging consumer products.

  • Design Space Exploration of Embedded Architectures
    Given the complexity of multi/many-core architectures, system optimization and exploration are challenging research tasks. A wide range of design parameters must be tuned from a multi-objective perspective, mainly in terms of performance and energy consumption, to find the most suitable system configuration for the target application. Multi-objective exploration of the huge design space of multi/many-core architectures requires automatic Design Space Exploration (DSE) techniques that systematically explore the design choices and compare them in terms of multiple competing objectives (trade-off analysis). The aim of my research was to investigate power/performance trade-offs in application-specific embedded architectures. The exploration techniques are based on multi-objective optimization algorithms and energy/delay estimation metrics. The main goal is to provide an automatic DSE methodology and tool for the analysis of system characteristics and the selection of the most appropriate architectural solution to satisfy power/performance system requirements. A set of heuristic optimization algorithms has been defined to prune the design space to be explored, while a set of response surface modeling techniques has been defined to further reduce the exploration time. The basic idea was to define an analytical response model of the system behavior, built from a subset of system simulations, to predict the unknown system response. The proposed DSE framework (namely MULTICUBE Explorer) leverages a set of open-source tools for exploration, modeling and simulation to guarantee a wide exploitation of the project results in the embedded system design community. Research on DSE techniques has mainly been carried out in the context of the FP7-MULTICUBE European project under my scientific coordination.
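
    The trade-off analysis described above can be illustrated with a minimal sketch of Pareto filtering over two objectives, energy and delay, both to be minimized. This is not the MULTICUBE Explorer implementation: the DesignPoint type, the configuration names and the numeric values are hypothetical and serve only to show how dominated configurations are discarded.

```python
from typing import List, NamedTuple


class DesignPoint(NamedTuple):
    """A hypothetical architectural configuration with two objectives."""
    name: str      # illustrative configuration label (assumption)
    energy: float  # estimated energy, arbitrary units, lower is better
    delay: float   # estimated delay, arbitrary units, lower is better


def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.energy <= b.energy and a.delay <= b.delay
    strictly_better = a.energy < b.energy or a.delay < b.delay
    return no_worse and strictly_better


def pareto_front(points: List[DesignPoint]) -> List[DesignPoint]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up evaluation results standing in for simulator output.
points = [
    DesignPoint("1core_small_cache", 1.0, 9.0),
    DesignPoint("2core_small_cache", 2.0, 5.0),
    DesignPoint("2core_large_cache", 3.0, 6.0),  # dominated by 2core_small_cache
    DesignPoint("4core_large_cache", 5.0, 2.0),
]

front = pareto_front(points)
print([p.name for p in front])
```

    The exhaustive pairwise comparison is quadratic in the number of points, which is acceptable for a sketch; heuristic pruning and response surface models, as discussed above, address the cost of evaluating the points in the first place.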

  • Adaptive Design and Monitoring of Applications for Many-core Architectures
    Building on the knowledge gained in design-time multi-objective exploration, my research path evolved towards the definition of a run-time methodology to optimize the allocation and scheduling of different application tasks on the underlying resources of the target many-core architecture. In 2007, to address this problem, a two-step approach was devised in collaboration with IMEC and ALaRI Research Institute. First, the design exploration flow generates a Pareto-optimal set of design alternatives in terms of power/performance trade-offs. Second, this set of operating points can be used at run-time to decide how system resources should be allocated to the different tasks running on the many-core system. Research on run-time resource management techniques has mainly been carried out in the context of the FP7-2PARMA European project under my scientific coordination. During the 2PARMA project, the proposed run-time methodology was applied to several application domains to demonstrate its applicability and benefits in industrial contexts, such as the P2012/STHORM many-core platform provided by STMicroelectronics.
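
    The second step of the two-step approach can be sketched as follows, under simplifying assumptions: a single task, a single power budget, and a policy that picks the fastest design-time operating point that fits the budget. The OperatingPoint type, the selection policy and the numbers are hypothetical illustrations, not the resource manager developed in the project.

```python
from typing import NamedTuple, Optional, Sequence


class OperatingPoint(NamedTuple):
    """A hypothetical design-time operating point for one task."""
    cores: int        # number of cores allocated to the task
    power: float      # estimated power, arbitrary units
    exec_time: float  # estimated execution time, arbitrary units


def select_operating_point(
    points: Sequence[OperatingPoint], power_budget: float
) -> Optional[OperatingPoint]:
    """Return the fastest point whose power fits the budget, or None if none fits."""
    feasible = [p for p in points if p.power <= power_budget]
    if not feasible:
        return None
    return min(feasible, key=lambda p: p.exec_time)


# Made-up Pareto-optimal set assumed to come from design-time exploration.
pareto_set = [
    OperatingPoint(cores=1, power=0.5, exec_time=8.0),
    OperatingPoint(cores=2, power=1.1, exec_time=4.5),
    OperatingPoint(cores=4, power=2.4, exec_time=2.5),
]

chosen = select_operating_point(pareto_set, power_budget=1.5)
print(chosen)
```

    Because the expensive exploration is done at design time, the run-time decision reduces to a lookup over a small set of points, which keeps the overhead of the resource manager low.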

    Dynamic runtime systems still provide many opportunities for energy savings, and some new research paths have recently been opened. Research is ongoing on approximate and adaptive computing for power-aware embedded systems. Since optimizing computing systems for all worst-case scenarios is not a feasible approach, we have to start thinking of systems that are capable of adapting dynamically to changing operating conditions and environments. The lack of self-tuning and run-time adaptation capabilities at the application level tends to lead to sub-optimal power/performance trade-offs at the system level, caused by the underutilisation of the resources in a many-core architecture. The proposed approach borrows some concepts from the approximate and adaptive computing area to give the application some self-tuning capabilities. The system-wide adaptive approach is based on global monitoring, adaptation and optimization. Appropriate self-adaptive techniques are provided to dynamically support the migration of code and data among cores. In the context of heterogeneous many-core architectures, we can devise support for code and data migration between different types of processing cores, making the system capable of adapting to several processor architectures and hardware accelerators. Ongoing joint research with the Technical University of Delft (NL) is focused on run-time optimization techniques for dynamically reconfigurable systems. I believe there are still significant research paths dealing with heterogeneity in the era of many-cores based on accelerator-centric architectures.

    The research experience gained on adaptive computing for heterogeneous many-cores led me towards one of my recently opened research paths: application autotuning combined with the runtime power management of High Performance Computing systems. This research path represents the core concept I proposed as Scientific Coordinator of the H2020 ANTAREX research project, accepted for funding under the FET-HPC European initiative in 2015. The main goal of the project is to provide a breakthrough approach to express application self-adaptivity by means of a Domain Specific Language, and to manage and autotune applications at runtime for green and heterogeneous HPC systems. I believe the ANTAREX project represents a very promising and challenging research path up to the Exascale era, expected to be reached in 2023.

  • Many-core Architectures based on Networks-on-Chip
    Given the increasing complexity of many-core architectures, the current trends in on-chip communication are converging towards the Network-on-Chip (NoC) approach, which represents a high-bandwidth and low-energy solution. The NoC-based design approach has several other advantages, such as scalability, reliability, IP reusability and the separation of IP design from on-chip communication design and interfacing. NoC design represents a new paradigm for designing multi/many-cores, shifting the design methodologies from computation-based to communication-based. To address these NoC research challenges, a new path of my research started in 2003, focusing on low-power NoC for embedded architectures and covering energy-aware design techniques from several perspectives and abstraction levels. My research on NoC started with the development of PIRATE, a modular and flexible framework for power/performance exploration of Network-on-Chip architectures. The PIRATE framework was then applied to explore distributed shared memory architectures based on NoC. For this class of systems, I also started a new research path to investigate synchronization mechanisms and memory management techniques. Most of my research on low-power NoC for embedded architectures has been carried out in the context of two industrial research projects funded by STMicroelectronics. I was Principal Investigator in both research contracts, namely "Low Power Network on Chip and Embedded Architectures" (2003-2005) and "Low Power Network on Chip and Multiprocessor Platforms" (2006-2008). My research then addressed the problem of application mapping, optimization and topology customization for the Spidergon NoC architecture provided by STMicroelectronics (Grenoble, France). My research on the ST Spidergon NoC has been carried out in the context of the MEDEA+ LoMoSA+ project.

    In 2007, I started investigating security aspects in NoCs. Our work, which appeared at CODES-ISSS 2007 [C41], remains to this day one of the first approaches to investigate an important but previously unaddressed aspect of NoCs, namely security. The work was then extended in IEEE Trans. on Computers 2008 [J9]. Afterwards, we opened a new research path by adding high-level services on top of the standard communication services usually provided by an interconnection network. Research on run-time monitoring through NoC is still ongoing under my scientific coordination in collaboration with ALaRI Research Institute based at the University of Lugano (CH). Moreover, ongoing joint research with the Universidade Federal do Rio Grande do Sul in Porto Alegre (Brazil) is focused on floorplanning-aware exploration for application-specific NoC and on adaptive buffers for virtual channel routers in NoCs.

  • Technology-aware Many-core Architectures Design
    A basic tenet of my research is that power-aware application-specific many-core architectures must be tuned with awareness of microarchitectural and technology problems, with emphasis on the early phase of design space exploration. Many challenging and relevant research topics related to many-core architectures are still open, and architectural and technology challenges will increase in importance as we scale down the fabrication process in the nanoscale era. I believe that technology-driven considerations (such as ultra-low-power design, resiliency and process variability) will further increase their importance for architecture and system-level design in the coming years. In the many-core era, system design optimization and exploration still represent challenging tasks. Networks-on-Chip, as an architectural solution for scalable high-speed interconnect, and power-aware design will continue to be crucial topics, since power and energy issues still represent one of the limiting factors in integrating multi/many cores on a single chip. The power-wall problem and its dual, the utilization-wall problem, are considered among the main barriers to efficient performance scaling, leading to the dark silicon problem (where dark silicon denotes the fraction of a many-core chip that cannot be used because of the power budget). Near Threshold Computing (NTC) has recently been proposed as a promising solution to mitigate the dark silicon effects by operating at lower frequency while exploiting a larger number of cores. However, NTC suffers from an increased sensitivity to technology problems (such as process variability). In this context, I have recently opened a new research path in collaboration with the National Technical University of Athens. The main focus of our research is on variability-aware NTC architectures based on the formation of voltage islands, with emphasis on voltage and workload allocation across the many-core architecture. We also introduced the use of approximate computing concepts to sustain performance despite variability effects.

RESEARCH PROJECTS AND INTERNATIONAL COLLABORATIONS

My research activities have been carried out in collaboration with several international universities, research centers and industries (about 100 of my 160 scientific publications include co-authors with different affiliations, 40 of them with industrial co-authors, 120 co-authors overall). My research has been funded by several national and EU projects selected through a competitive process. Since 2003 I have been co-applicant and active participant in 8 European and 2 industrially funded projects (attracting around 4 M€ funding for POLIMI). Among them, I was Project Coordinator of three European projects:

  • Project Coordinator of the European project H2020-FET-HPC-ANTAREX-671623 on "AutoTuning and Adaptivity appRoach for Energy efficient eXascale HPC systems" (Sep. 2015 - Aug. 2018). The project won a 3 million euro grant in the H2020 Future and Emerging Technologies programme on High Performance Computing. The project involves leading academic and industrial partners as well as CINECA, the Italian Tier-0 supercomputing centre, and IT4i, the Czech Tier-1 supercomputing centre. As one of the nineteen research projects in FET-HPC-2014, ANTAREX places its partners at the forefront of European research in High Performance Computing. The main goal of the ANTAREX project is to provide a breakthrough approach to express application self-adaptivity by means of a Domain Specific Language, and to manage and autotune applications at runtime for green and heterogeneous High Performance Computing systems up to the Exascale level. The Consortium also includes three top-ranked academic partners (ETH Zurich, University of Porto, INRIA). Industrial partners include one of the leading biopharmaceutical companies in Europe (Dompé) and the top European navigation software company (Sygic).

  • Project Coordinator of the European project FP7-2PARMA-248716 on "PARallel PAradigms and Run-time MAnagement techniques for Many-core Architectures" (Jan. 2010 - Mar. 2013). EC Contribution to the project: 2.74 Mio Euro. The 2PARMA Consortium was composed of seven partners: Politecnico di Milano (Italy), STMicroelectronics (Italy and France), Heinrich Hertz Institute - Fraunhofer Institute for Telecommunications (Germany), IMEC (Belgium), ICCS - Institute of Communication and Computer Systems (Greece), RWTH Aachen University (Germany), Synopsys (Belgium). The 2PARMA project focused on the definition of parallel programming models, run-time resource management policies and design space exploration methodologies for many-core architectures. The applicability and benefits of the proposed techniques and tools have been validated and assessed on a set of industrial applications and hardware platforms. The 2PARMA project ended in March 2013 with an excellent evaluation by the European Commission, confirming the fulfillment of its objectives and scientific/technical goals. In the opinion of the European expert reviewers: "the project represents a success story and has made significant contributions to the state-of-the-art in the field". The expected impact was also rated excellent.

  • Project Coordinator of the European project FP7-MULTICUBE-216693 on "Multi-objective design space exploration of multi-processor SoC architectures for embedded multimedia applications" (Jan. 2008 - June 2010). EC Contribution to the project: 2.098 Mio Euro. The MULTICUBE Consortium was composed of nine partners: Politecnico di Milano (Italy), Design of Systems on Silicon – DS2 (Spain), STMicroelectronics (Italy), IMEC (Belgium), ESTECO (Italy), University of Lugano - ALaRI (Switzerland), University of Cantabria (Spain), STMicroelectronics Beijing (China), Institute of Computing Technology – Chinese Academy of Sciences (China). The MULTICUBE project finished in 2010 with an excellent evaluation by the EC, confirming that it fully achieved its objectives and scientific/technical goals. The project demonstrated: the benefits of using automated design exploration techniques (based on enhanced multi-objective optimization algorithms) by implementing a number of industrial use cases; practical ways to trade off accuracy for speed through the use of multi-abstraction-level simulation, thus enabling the exploration of larger design spaces; and the feasibility of automated parameter tuning at run-time using exploration data collected at design-time. In the context of the MULTICUBE project, I also led a research group at Politecnico di Milano focused on design space exploration for multi-processor architectures. The group worked on an open-source tool (MULTICUBE Explorer) that enables automatic and fast optimization of configurable system architectures with respect to a set of objective functions such as energy and delay. MULTICUBE Explorer provides a set of innovative sampling and optimization techniques to help find the multi-objective Pareto points. It also provides an open XML interface to support the exploration of new platforms/architectures by interacting with a system-level simulator.

Since 1996 I have maintained a continuous research collaboration with STMicroelectronics, and I was Principal Investigator of two industrial research projects funded by STMicroelectronics (2003-2008):

  • Principal Investigator in the two-year research project: "Low Power Network on Chip and Multiprocessor Platforms" (2006-2008) between DEI, Politecnico di Milano and Advanced System Technology Division of STMicroelectronics Agrate B.

  • Principal Investigator in the two-year research project: "Low Power Network on Chip and Embedded Architectures" (2003-2005) between DEI, Politecnico di Milano and Advanced System Technology Division of STMicroelectronics Agrate B.

RESEARCH COLLABORATORS/CO-AUTHORS/CO-EDITORS (with other affiliations) (about 70):

Gerd Ascheid (Aachen University, Germany), Prabhat Avasare (IMEC, Belgium), Andrea Bartolini (ETHZ, Switzerland), Sanzio Bassini (CINECA, Italy), Alex Bartzas (EXUS, Greece), Andrea Beccari (Dompé, Italy), Luca Benini (ETHZ, Switzerland), Mladen Berekovic (TU Braunschweig, Germany), Koen Bertels (Delft Technical University, The Netherlands), Sara Bocchio (STMicroelectronics, Italy), Umberto Bondi (University of Lugano, Switzerland), Jens Brandenburg (Fraunhofer HHI, Germany), João Cardoso (University of Porto, Portugal), Luigi Carro (UFRGS, Brazil), Jeronimo Castrillon (Dresden Technical University, Germany), Franky Catthoor (IMEC, Belgium), John Cavazos (University of Delaware, USA), Carlo Cavazzoni (CINECA, Italy), Peter Cheung (Imperial College London, UK), Radim Cmar (Sygic, Slovakia), Caroline Concatto (UERGS, Brazil), Marcello Coppola (STMicroelectronics, France), Giovanni De Micheli (EPFL, Switzerland), Giuseppe Desoli (STMicroelectronics, Italy), Giovanni Erbacci (CINECA, Italy), Dongrui Fan (ICT - Chinese Academy of Sciences, China), Leandro Fiorin (IBM-ASTRON, The Netherlands), Franco Fummi (University of Verona, Italy), Arpad Gellert (Sibiu University, Romania), Carlo Guardiani (STMicroelectronics, Italy), Paolo Gubian (University of Brescia, Italy), Zhang Hao (ICT - Chinese Academy of Sciences, China), Michael Huebner (Ruhr-Universität Bochum, Germany), Carlos Kavka (ESTECO, Italy), Torsten Kempf (Cognex Corporation, Germany), Fadi Kurdahi (University of California at Irvine, USA), Marcello Lajolo (NEC Labs, USA), Rainer Leupers (Aachen University, Germany), Wayne Luk (Imperial College London, UK), Enrico Macii (Politecnico di Torino, Italy), Giovanni Mariani (IBM-ASTRON, The Netherlands), Marco Martinez (DS2, Spain), Jan Martinovic (IT4Innovations, Czech Republic), Debora Matos (UERGS, Brazil), Diego Melpignano (STMicroelectronics, Italy), Smail Niar (University of Valenciennes, France), Luca Onesti (ESTECO, Italy), Martin Palkovic (IT4Innovations, Czech Republic), Pierre Paulin (Synopsys, Canada), Hector Posadas (University of Cantabria, Spain), Erven Rohou (INRIA, France), Ingo Sander (KTH, Sweden), Nico Sanna (CINECA, Italy), Michael Schulte (AMD, USA), Vlad-Mihai Sima (Bluebee, The Netherlands), Katerina Slaninova (IT4Innovations, Czech Republic), Dimitrios Soudris (National Technical University of Athens, Greece), Benno Stabernack (Fraunhofer HHI, Germany), Giulio Urlini (STMicroelectronics, Italy), Geert Vanmeerbeeck (IMEC, Belgium), Eugenio Villar (University of Cantabria, Spain), Lucian Vintan (Sibiu University, Romania), Vit Vondrak (IT4Innovations, Czech Republic), Marise Wouters (IMEC, Belgium), Sotirios Xydis (National Technical University of Athens, Greece), Chantal Ykman-Couvreur (IMEC, Belgium), Roberto Zafalon (STMicroelectronics, Italy).