Where academic tradition
meets the exciting future

Formal Development of Resilient Distributed Systems

Inna Pereverzeva, Formal Development of Resilient Distributed Systems. TUCS Dissertations 203. Åbo Akademi University, 2015.

Abstract:

Resilience is the property of a system to remain trustworthy despite changes. Changes of a different nature, whether due to failures of system components or varying operational conditions, significantly increase the complexity of system development. Therefore, advanced development technologies are required to build robust and flexible system architectures capable of adapting to such changes. Moreover, powerful quantitative techniques are needed to assess the impact of these changes on various system characteristics.

Architectural flexibility is achieved by embedding into the system design the mechanisms for identifying changes and reacting on them. Hence a resilient system should have both advanced monitoring and error detection capabilities to recognise changes as well as sophisticated recon figuration mechanisms to adapt to them. The aim of such reconfiguration is to ensure that the system stays operational, i.e., remains capable of achieving its goals.

Design, verification and assessment of the system reconfiguration mechanisms is a challenging and error prone engineering task. In this thesis, we propose and validate a formal framework for development and assessment of resilient systems. Such a framework provides us with the means to specify and verify complex component interactions, model their cooperative behaviour in achieving system goals, and analyse the chosen reconfiguration strategies. Due to the variety of properties to be analysed, such a framework should have an integrated nature. To ensure the system functional correctness, it should rely on formal modelling and verification, while, to assess the impact of changes on such properties as performance and reliability, it should be combined with quantitative analysis.

To ensure scalability of the proposed framework, we choose Event-B as the basis for reasoning about functional correctness. Event-B is a state based formal approach that promotes the correct-by-construction development paradigm and formal verification by theorem proving. Event-B has a mature industrial-strength tool support -- the Rodin platform. Proof-based verification as well as the reliance on abstraction and decomposition adopted in Event-B provides the designers with a powerful support for the development of complex systems. Moreover, the top-down system development by refinement allows the developers to explicitly express and verify critical system-level properties.

Besides ensuring functional correctness, to achieve resilience we also need to analyse a number of non-functional characteristics, such as reliability and performance. Therefore, in this thesis we also demonstrate how formal development in Event-B can be combined with quantitative analysis. Namely, we experiment with integration of such techniques as probabilistic model checking in PRISM and discrete-event simulation in SimPy with formal development in Event-B. Such an integration allows us to assess how changes and different reconfiguration strategies affect the overall system resilience. The approach proposed in this thesis is validated by a number of case studies from such areas as robotics, space, healthcare and cloud domain.

Files:

Full publication in PDF-format

BibTeX entry:

@PHDTHESIS{phdPereverzeva_Inna15a,
  title = {Formal Development of Resilient Distributed Systems},
  author = {Pereverzeva, Inna},
  number = {203},
  series = {TUCS Dissertations},
  school = {Åbo Akademi University},
  year = {2015},
  ISBN = {978-952-12-3253-4},
  ISSN = {1239-1883},
}

Belongs to TUCS Research Unit(s): Embedded Systems Laboratory (ESLAB)

Edit publication