Dependable Systems - A.Y. 2018-19

Computer Science and Engineering

learning goals

System dependability is the ability of the system to deliver the expected functionality, fulfilling the functional and performance requirements, during its operational lifetime.
This course provides a methodological approach to system dependability. It introduces the basic concepts (dependability attributes and fault/failure models), presents methods to design and analyze dependable systems, and discusses practical solutions for their realization.

At the end of the course the student shall be able to take into account dependability aspects in the design and implementation of a system, by investigating dependability requirements in relation to the applicable fault model, and to identify suitable solutions to achieve the desired ability to manage the occurrence of faults.

general information

Cristiana Bolchini
Phone: (02 2399) 3619
Email: cristiana . bolchini @ polimi . it

 

Marco Gribaudo
Email: marco . gribaudo @ polimi . it

 

Manuel Roveri
Email: manuel . roveri @ polimi . it

class hours

Tuesday 08:15 - 10:15 | D2.5
Thursday 08:15 - 10:15 | D2.5

course content

The following topics will be presented and discussed:

  1. Dependability basics
    • Fault/Error/Failure models
    • Dependability attributes: Reliability, Availability
  2. Dependability Analysis (quantitative and qualitative)
    • Models (Failure rate, Probability distributions)
    • Series/Parallel Systems
    • Markov Models
    • Fault Trees
    • FMEA/FMECA
    • Fault Injection
    • Diagnosis
  3. Design for dependability
    • Fault Detection, Tolerance and Recovery
    • Model and data analysis

references

  • Dhiraj K. Pradhan, Fault-tolerant computer system design, Prentice-Hall, 1996, ISBN 0-13-057887-8
  • M.L. Shooman, Reliability of Computer Systems and Networks: Fault Tolerance, Analysis, and Design, Wiley, 2002, ISBN 0-471-29342-3
  • Israel Koren and C. Mani Krishna, Fault-tolerant Systems, Morgan Kaufmann, 2007
  • M. Ottavi, D. Gizopoulos, S. Pontarelli (Eds.), Dependable Multicore Architectures at Nanoscale, Springer International Publishing, 2017

further references on specific topics

  • First 3 chapters of the upcoming book (web page)
  • Rolf Isermann. Fault-diagnosis systems: an introduction from fault detection to fault tolerance. Springer, 2006.
  • Janos J. Gertler. Fault-Detection and Diagnosis in Engineering Systems. Marcel Dekker, 1998.
  • Jie Chen, Ron J. Patton. Robust model-based fault diagnosis for dynamic systems. Kluwer Academic Publishers, 1999.

course evaluation

Student evaluation consists of an oral exam on the topics presented during the semester. As an alternative, students may decide to develop a project.

tentative class plan

legend: cristiana bolchini | marco gribaudo | manuel roveri | break | other

date | topic | reference
Feb 26, 2019 | Course introduction and overview | pdf
Feb 28, 2019 | Dependability definition - attributes - indices - fault / error / failure | pdf
Mar 5, 2019 | Graduation day |
Mar 7, 2019 | Fault/error model - Reliability / Availability / Safety |
Mar 12, 2019 | Fault types / abstraction levels – Part I | pdf
Mar 14, 2019 | Fault types / abstraction levels – Part II |
Mar 19, 2019 | Probability models and distributions | pdf
Mar 21, 2019 | Markov Chains | pdf
Mar 26, 2019 | Transient analysis with TCMCs – Part I | pdf
Mar 28, 2019 | Transient analysis with TCMCs – Part II | pdf
Apr 2, 2019 | Transient analysis with TCMCs – Part III |
Apr 4, 2019 | Dependability Analysis: Reliability Block Diagrams, FMEA, Fault Trees | pdf
Apr 10, 2019 | Dependability Analysis: Reliability Block Diagrams, FMEA, Fault Trees – Part II |
Apr 11, 2019 | Dependability Analysis: Fault Injection | pdf
Apr 16, 2019 | Graduation day, Easter break, Apr. 25 |
... | ... |
Apr 25, 2019 | Graduation day, Easter break, Apr. 25 |
Apr 30, 2019 | Design for dependability: model / data analysis – Part I | pdf
May 2, 2019 | Design for dependability: model / data analysis – Part II |
May 7, 2019 | Design for dependability: model / data analysis – Part III | pdf
May 9, 2019 | Design for dependability: model / data analysis – Part IV |
May 14, 2019 | Design for dependability: fault management – Part I | pdf
May 16, 2019 | Design for dependability: fault management – Part II |
May 21, 2019 | Aging and lifetime extension | pdf
May 23, 2019 | Design for dependability: fault management – Part III |
May 28, 2019 | Design for dependability: hw/sw hardening | pdf
May 30, 2019 | Course closing |