INTRODUCTION
parameters (which is only possible by means of sensitivity analyses performed on a detailed
global model). The impossibility of correctly sizing reliability parameters according to structural
dependencies is very likely to lead to over-dimensioning some components and under-dimensioning
others, with a negative cost impact. A combination of techniques is therefore needed to
perform system level availability modeling of complex heterogeneous control systems,
considering both structural (i.e. hardware) and behavioral (i.e. performability) studies. While
a single highly expressive formalism (e.g. high-level Petri Nets) could be adopted to model
the entire system from an availability point of view, there are at least two issues related
to such a choice: the first is efficiency, as more expressive formalisms usually
feature more complex solving algorithms (e.g. belonging to the NP-hard class); the second is
ease of use, as more expressive formalisms usually require a skilled modeler. The former
issue is enough to impede the application of an entire class of formalisms to large systems (e.g.
because of the state-space explosion problem). The explicit use of multiple formalisms (i.e.
multiformalism) allows modelers to fine-tune the choice of formalism to the required
modeling power. Furthermore, implicit multiformalism allows modelers to specify new
formalisms out of existing ones, thus combining enhanced power, better efficiency
and increased ease of use (the complexity of the multiformalism solving process is hidden from
the modeler). Such a use of different formalisms must be supported by purpose-built
multiformalism support frameworks, which allow heterogeneous models to be specified,
connected and solved. In this thesis we show how multiformalism approaches apply to critical control
systems for system availability evaluation purposes. We also dedicate a section to
multiformalism composition operators and their applications to dependability modeling,
showing why they are needed for any kind of interaction between heterogeneous submodels.
While multiformalism approaches have been studied by several research groups in the
scientific community for a number of years, they have usually been applied to case studies of
limited complexity in order to perform traditional analyses. In this thesis we show how to
employ multiformalism techniques in real-world industrial applications, in order to model and
evaluate dependability attributes with respect to system level failure modes.
The advantages of the newly proposed methodologies are shown by applying them to real-world
systems. In particular, in order to achieve a coherent and cohesive view of a real industrial
application, the case studies are all related to a single railway control system specification: the
European Railway Traffic Management System / European Train Control System
(ERTMS/ETCS). ERTMS/ETCS is a standard specification of an innovative computer-based
train control system aimed at improving the performance, reliability, safety and interoperability of
European railways (in Italy, it is used on all the recently developed high-speed railway lines).
An implementation of ERTMS/ETCS is a significant example of what a complex
heterogeneous control system can be. To the best of our knowledge, no consistent study of
the dependability evaluation of ERTMS/ETCS is available in the research literature;
therefore, this thesis also serves as an innovative case study description of ERTMS/ETCS.
Together with the theoretical studies, the results obtained in an industrial experience are
presented, which clearly underline the advantages of adopting the described techniques in the
V&V activities of a real-world system.
This thesis is organized as follows. Chapter I provides the thesis motivation and the state of the art of
simulative and (multi)formal dependability evaluation techniques. Existing dependability
(safety and availability) prediction techniques are briefly introduced and discussed, showing
their advantages and limitations and justifying the proposed extensions needed to improve
their effectiveness and efficiency. Existing dependability studies of ERTMS/ETCS are also
referenced.
Chapter II presents model-based approaches for the functional validation of complex critical
systems; the aim of such approaches is to find systematic software errors which impact
system behavior; in particular, it is shown how it is possible to combine static and dynamic
analyses in order to enhance both effectiveness and efficiency of functional testing, by means
of innovative hybrid and grey-box testing techniques.
Chapter III deals with multiformalism modeling techniques for dependability evaluation,
presenting advanced implicit and explicit multiformalism methodologies and their
applications, based on the OsMoSys¹ approach; in particular, Repairable Fault Trees and
Bayesian Networks are introduced in order to enhance the evaluation of the system level impact
of reliability and maintainability choices. Moreover, in this chapter we present the theoretical
and applicative aspects related to composition operators, which are used to make
heterogeneous models interact in a multiformalism support environment.
In Chapter IV a short architectural and behavioral description of the reference case study (i.e.
ERTMS/ETCS) is provided. Then, several applications of dependability evaluation techniques
to different subsystems of ERTMS/ETCS are described, highlighting the results obtained in a
real industrial experience, namely the functional testing of ERTMS/ETCS for the new
Italian high-speed railway lines. Several studies on the multiformalism evaluation of
ERTMS/ETCS availability are presented, showing their advantages with respect to traditional
approaches.
¹ OsMoSys is a recently developed multiformalism/multisolution graphical and object-based framework.
Chapter I
Model-Based Dependability Analysis of Critical Systems
1. Context
Computer based control systems are nowadays employed in a variety of mission and/or safety
critical applications. The attributes of interest for such systems are known by the acronym
RAMS, which stands for Reliability, Availability, Maintainability, Safety. The demonstration of
the RAMS attributes of critical systems is performed by means of a set of thorough
Verification & Validation (V&V) activities, regulated by international standards (see for
instance [31] and [133]). In short, validating a system consists in answering the question “Are
we developing the right system?”, while verifying a system answers the question “Are we
developing the system right?”. V&V activities are critical both in budget and in results; therefore,
it is essential to develop methodologies and tools that minimize the time required to
make the system operational, while respecting the RAMS requirements stated in its
specification. In other words, such methodologies and tools must provide the means to
improve both the effectiveness and the efficiency of the system development and validation
process.
Control systems are meant to interact with a real environment, reacting within bounded times to
external stimuli (see Figure 1); therefore, they are also known as reactive systems. This is the
reason why such systems are required to be real-time², that is, a possibly catastrophic failure
can occur not only if the system gives incorrect results, but also if correct results arrive late
[152].
Figure 1. General scheme of a control system.
Another feature of most control systems is that they are embedded (as opposed to “general
purpose”), that is, their hardware and operating systems are application specific. The main
motivation for such a choice is enhanced reliability and survivability together with
shorter verification times, as system behavior is more easily predictable, its structure being
kept as simple as possible. In fact, dependability evaluation requires non-trivial
structural as well as behavioral analyses.
Dependability is an attribute related to the trustworthiness of a computer system. According
to the integrated definition provided in [2], it consists of threats (faults, errors, failures),
attributes (reliability, availability, maintainability, safety, integrity, confidentiality) and means
(fault prevention, tolerance, removal, forecasting). It has been introduced in recent times in
² Critical control systems are often classified as “hard” real-time to distinguish them from “soft” real-time
systems, in which there is no strict deadline but the usefulness of results decreases with time (timeliness
requirements are less restrictive). In this thesis we generally refer to hard real-time systems.
order to provide a useful taxonomy for any structured design and evaluation approach for
critical systems. A system is dependable if it can be justifiably trusted. Dependability threats
are related to each other: the symptoms of a failure are observable at the system's interface (a service
degradation or interruption occurs); errors are alterations of the system's internal state which can
remain latent until activated, possibly leading to failures; faults are the causes of
errors, and they are also usually subject to a latency period before generating an error. Errors
can also be classified as random, when they are caused by faults external to the system
development process, or systematic, when faults are (unwillingly) injected during the system
development process.
It is very useful to evaluate system dependability from the early stages of system development,
i.e. when just a high-level specification or a rough prototype is available, in order to avoid
design reviews and thus reduce development costs. Such early evaluation of system
dependability is often referred to as dependability prediction. Generally speaking, most
dependability prediction techniques can be used at different stages of system development,
becoming more accurate as more data about the system under analysis becomes available. In fact, the
same techniques are also used to evaluate the final system, in order to demonstrate (e.g. to an
external assessor) the compliance of the system implementation with its requirements. When
used in the early stages, dependability prediction techniques are very effective in helping
system engineers establish design choices and fine-tune performance and reliability
parameters; in fact, design reviews should be performed as early as possible in the system life
cycle, as errors discovered late are far more costly and dangerous than those discovered
early.
Two main approaches are employed to predict/evaluate the dependability of critical
control systems: the first is based on simulation-based techniques, e.g. fault injection [105]
at the hardware level (either physical or simulated) or software testing [23] at the various
abstraction and integration levels; the second is based on formal methods [49], which can be
used at any abstraction level (both hardware and software) and at any stage of system
development and validation. Simulation-based techniques aim at approximating
reality by means of simulative models, e.g. small-scale hardware prototypes and/or general-purpose
computers running software simulators, i.e. programs implementing external stimuli
as well as possibly simplified target systems. Formal methods follow an alternative
strategy, consisting in creating and solving a sound mathematical model of the system under
analysis, or of just a part of it, using proper languages, well defined in both their syntax and
semantics. Formal methods produce a different approximation of reality, which gives more
precise and reliable results, but is often less efficient and more difficult to manage. Graph-based
formal languages feature a more intuitive notation, improving the method from the
ease-of-use point of view. Both simulative and formal approaches are used in real-world
applications, for different or for the same purposes, and can be classified as model-based
techniques, as they require designers to generate an accurate model both of the system under
analysis and of the external environment (i.e. the interacting entities); moreover, they can be (and
often are) used in combination, with formal models possibly interacting with simulative ones
(an example of this is the class of approaches known as model-based testing [115]). A model
is an abstraction of real entities; therefore, it is important to ensure that it adequately reflects
the behavior of the concrete system with respect to the properties of interest (model validation
and “sanity checks” are performed to this aim).
Many standards and methodologies have been developed in order to guide engineers in
performing safety and reliability analyses of computer systems; however, they generally
suffer from one or more of the following limitations:
• They are either application specific or too general;
• The several proposed techniques are poorly integrated;
• They are either not effective or not efficient enough (i.e. they are not compatible with the
complexity of real industrial applications);
• Models are difficult to manage and/or the required tools are not user-friendly.
Such limitations often constitute an obstacle to their adoption in industry, where
traditional best-practice techniques continue to represent the most widespread approaches.
However, as the requirements of modern control systems grow in number and criticality,
traditional techniques are proving poorly effective and efficient. The increase in
requirements is also reflected in highly distributed architectures, featuring enormously enlarged
classes of failure modes; such systems prove very difficult to verify.
This is especially true when performing system level dependability prediction of complex
critical systems in a verification and validation context (which means that external
constraints are given, according to the requirements and guidelines of international standards). A
system level analysis does not replace the component-based approach, in which constituent
level requirements have to be fulfilled; on the contrary, it complements and often builds on the results
of lower level analyses. The challenge lies in the fact that system level dependability
prediction is far more difficult to perform, because of the obvious growth in size and heterogeneity;
however, it is the only way to fulfill system level dependability requirements (this is
mandatory for safety requirements, while it may be not mandatory but convenient for
reliability requirements). A system level study requires the evaluation of two main aspects of
interest (of which we give here only intuitive definitions):
• System safety, related above all to functional aspects: the system must behave correctly
(with no dangerous consequences for human beings or the environment) in any
operating condition, including degraded ones;
• System availability, related above all to structural aspects: the system should be
operational for as much time as possible, even in the presence of faults.
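As a back-of-the-envelope intuition for the availability attribute, the long-run fraction of operational time is commonly estimated from the mean time to failure (MTTF) and mean time to repair (MTTR); the following sketch uses purely illustrative figures, not data from any real system:

```python
def steady_state_availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is operational in the long run."""
    return mttf_hours / (mttf_hours + mttr_hours)

# Hypothetical figures: a unit failing on average every 10,000 h,
# repaired on average within 8 h.
a = steady_state_availability(10_000.0, 8.0)
print(f"availability = {a:.6f}")                      # 0.999201
print(f"downtime per year = {(1 - a) * 8760:.1f} h")  # 7.0 h
```

Even such a simple computation already illustrates the safety/availability trade-off discussed below: any safety mechanism that lengthens repair or forces shutdowns increases MTTR and directly lowers this figure.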
Availability and safety are often correlated in two ways: first, increasing the system
safety level can decrease its availability, and vice versa; secondly, in many cases a poorly
available system is also an unsafe system, e.g. when availability has a direct impact on safety,
as in aerospace applications. The evaluation of both aspects requires structural as well as
behavioral analyses. An integration of techniques is needed to manage the complexity and
heterogeneity of modern computer-based control systems when evaluating their system level
dependability (in terms of safety and availability, the latter strictly related to reliability and
maintainability aspects); such integration is aimed at improving:
• Effectiveness: more (automated) analyses are possible, in order to answer as many
“what if?” (i.e. qualitative) and “how long?” (i.e. quantitative) questions as possible;
• Efficiency: already feasible analyses can be performed more quickly or can be applied
to larger systems (i.e. they scale up better);
• Ease of use: analyses can be performed by building and evaluating models in a more
intuitive and straightforward way, with a positive impact on readability, reusability and
maintainability, which obviously reduces the time to market of the final product.
Performing a system level analysis requires combining a set of existing model-based
techniques or developing entirely new ones. In order to achieve this aim, modular,
divide-et-impera approaches to modeling and analysis are necessary, possibly using bottom-up
(i.e. inductive) or top-down (i.e. deductive) composition/decomposition techniques. A
complex embedded computing system can be represented in its distributed and multi-layered
hardware and software structure as in Figure 2, in which rows represent complete subsystems
or simpler but independent devices. It usually features a single distributed hardware layer,
constituted by a set of devices interacting with each other by means of, for example,
communication channels. Software layers also interact and can be assigned different
semantics, but in general they can be roughly associated with the levels of the ISO/OSI protocol
stack. Higher application layers usually implement system functional requirements and can be
further decomposed into specific sublevels for convenience of analysis. Of course, not all
devices of the system have to implement all levels, with simpler devices featuring only a few of
them (this is one aspect of heterogeneity). Finally, each layer of each subsystem is constituted
by a varying set of interconnected components (e.g. CPU, communication interfaces, power
supply, bus, etc. for the hardware levels).
Figure 2. The distributed multi-layered architecture of a complex real-time system.
As aforementioned, system level dependability attributes of critical systems are more difficult
to predict than those of single components, for complexity reasons. Verification and
validation techniques can usually be distinguished into two main subgroups: those aimed at
studying system availability (i.e. the RAM part of the RAMS acronym) and those aimed at
analysing system safety. In both subgroups, simulative as well as formal means of analysis
can be employed: e.g. fault injection techniques and fault tree analyses are both employed to
evaluate hardware level availability; functional testing and model checking approaches are
used to evaluate software safety. However, concentrating on system level analyses, it
usually happens that for complex critical systems:
• formal techniques are suited and widespread for system-level availability prediction
(while simulative techniques based on fault injection are used at the component level
in order to evaluate component reliability and the coverage of fault tolerance
mechanisms);
• simulation-based techniques are suited and widespread for system-level safety
prediction (while formal techniques based on model checking or theorem proving
almost always prove inadequate to deal with complex systems and are only used for
small subsystems).
Of course, it would be highly advantageous to perform system level safety analyses of
complex systems by means of formal methods, but this is far from industry common practice.
Notwithstanding this, many of the formal methods introduced to predict system availability
can also be employed, at least partially, for qualitative or quantitative safety analyses.
The dynamic verification (by simulation) that a system as a whole behaves as expected, that
is, that it respects its functional requirements (including safety related ones), is known as system
testing, black-box testing or functional testing [70].
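As a toy illustration of the black-box style of functional testing just defined (the supervised function and its inputs are entirely hypothetical, not taken from ERTMS/ETCS or any real standard):

```python
def brake_command(speed_kmh: float, limit_kmh: float) -> bool:
    """Hypothetical system under test: commands braking when the
    measured speed exceeds the permitted limit."""
    return speed_kmh > limit_kmh

# Black-box test cases: only inputs and expected outputs, derived from
# the (informal) requirement, with no access to the internal structure.
test_cases = [
    ((120.0, 160.0), False),   # well below the limit: no braking
    ((160.0, 160.0), False),   # exactly at the limit: no braking
    ((160.1, 160.0), True),    # just above the limit: brake
]
for (speed, limit), expected in test_cases:
    assert brake_command(speed, limit) == expected
print("all functional test cases passed")
```

Note that the test cases exercise the boundary of the requirement without ever inspecting the implementation; the grey-box approaches discussed next relax exactly this restriction.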
Grey-box testing approaches, in their various interpretations, support functional testing by
allowing test engineers to fine-tune the test set, aiming at effective coverage of
functionalities with minimum time effort. The result is a significant reduction in test set
complexity, while maintaining or improving test effectiveness. In fact, while the main
advantage of functional testing techniques is that they are relatively easy to implement, their
main disadvantage consists in the difficulty of balancing test effectiveness and efficiency. As
effectiveness is difficult to predict, a thorough and extensive (thus costly and time consuming)
test specification and execution process is usually performed on critical systems. Given the
high number of variables involved, the required simulations (or test runs) are prohibitive, thus
the process is necessarily either unfeasible or incomplete, with possible risks to system
safety.
The need for multiformalism modelling approaches [67] is a consequence of the limitations of
a single formalism when dealing with system level evaluations. Such limitations consist in the
following:
• inadequacy of the expressive power of the formalism in modelling complex structures
or behaviors;
• inadequacy of the solving efficiency in dealing with complex systems (e.g. the well-known
problem of “state space explosion”).
“Applied to computer systems development, formal methods provide mathematically based
techniques that describe system properties. As such, they present a framework for
systematically specifying, developing, and verifying systems.” (citation from [86]). Formal
methods are employed in a variety of industrial applications, from microprocessor design to
software engineering and verification (see [49] for a survey of the most widespread methods and
their successful applications). Despite such a variety of methods and applications, no formal
method seems suitable to represent an entire system from both a structural and a functional
point of view. While it is possible to find methods with which hardware complexity can be
managed, at least when modelling dependability related aspects, extensive formal modelling
of system level functional aspects still seems unfeasible using currently available tools. This
is the reason why well-established formal methods are largely employed in industrial contexts
only for reliability analyses. For instance, Fault Trees (FT) and Reliability Block Diagrams
(RBD) are quite widespread for structural system modeling [146], while some kinds of
maintainability and behavioral analyses are possible by using Continuous Time Markov
Chains (CTMC) [74] and Stochastic Petri Nets (SPN) [102]. The latter two formalisms feature
reduced efficiency due to their state-based solving algorithms and are therefore unable to model
very complex systems. The Bayesian Network (BN) formalism [52] has been successfully
applied to dependability modeling in recent times (see [51], [96] and [5]); even though its
solving algorithms are demonstrated to be NP-hard, they feature better efficiency with respect
to Generalized SPNs (GSPN), being non-state-based [6]. A comparison of the mentioned formalisms is
provided in Figure 3.
Figure 3. Comparison of different reliability modelling formalisms.
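To make the comparison concrete, the sketch below combines two of the formalisms just mentioned: each component's availability is obtained from a two-state (working/failed) CTMC in closed form, and the results are composed through RBD series/parallel structures. All failure and repair rates are hypothetical, and real tools would of course handle far richer models:

```python
def ctmc_availability(failure_rate: float, repair_rate: float) -> float:
    """Steady-state availability of one component from a two-state
    (working <-> failed) CTMC: the balance equation
    lambda * pi_up = mu * pi_down, with pi_up + pi_down = 1,
    gives pi_up = mu / (lambda + mu)."""
    return repair_rate / (failure_rate + repair_rate)

def series(*avail: float) -> float:
    """RBD series composition: every block must be up."""
    out = 1.0
    for a in avail:
        out *= a
    return out

def parallel(*avail: float) -> float:
    """RBD parallel composition: at least one block must be up."""
    down = 1.0
    for a in avail:
        down *= 1.0 - a
    return 1.0 - down

# Hypothetical per-hour rates for a CPU and two redundant power supplies.
a_cpu = ctmc_availability(1e-4, 1e-1)
a_psu = ctmc_availability(5e-4, 2e-1)
system = series(a_cpu, parallel(a_psu, a_psu))
print(f"system availability ≈ {system:.6f}")
```

The example also hints at the limitation driving the multiformalism proposal: the RBD combinators above assume independent components, so dependencies such as a shared repair crew or common failure modes require a more expressive (and more costly) formalism.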
Multiformalism approaches are very interesting for their ability to balance the expressive
power of formal modeling languages against the computational complexity of solving algorithms.
Unfortunately, despite their huge potential, multiformalism techniques are still not
widespread in industrial practice. The main obstacles are the difficulty of use and the limited
efficiency. Moreover, the research community is far from developing a user-friendly,
comprehensive and efficient framework for managing multiformalism models. Some
approaches that move toward such an objective show the numerous
advantages of adopting multi-formal techniques for many kinds of system analyses (see e.g.
the OsMoSys [150] and Möbius [44] multiformalism frameworks). This also highlights the
necessity of developing flexible composition operators for tighter interactions between
submodels, allowing for more comprehensive and detailed analyses. A theoretical
multiformalism framework based on composition operators would be advantageous in
integrating heterogeneous models (both simulative and formal) within a single cohesive view,
which could be employed in order to obtain results that are currently almost impossible to
achieve. Composition operators are therefore orthogonal to both simulative and formal safety
and availability evaluation techniques.
Figure 4 provides a synthesis of the context under analysis, integrating different views. In
particular:
• blue links refer to multiformalism evaluation of structural availability;
• orange links refer to model-based static functional verification techniques;
• red links refer to model-based dynamic functional verification techniques;
• green links refer to performability evaluations;
• shaded rectangles represent threats, attributes and means of interest.
Black links represent other approaches, generally not suitable or advantageous for system
level analyses (e.g. fault injection and model checking). While it seems possible to apply
multiformalism techniques to system structural safety evaluation, this is not necessary in
most cases (structural safety can be evaluated using more straightforward techniques
based on “safe” simplifying assumptions). Analogously, while functional testing can help
detect anomalies and non-conformities having an impact on system availability, it cannot
prove the absence of deadlocks. The dashed red line between “specification” and “validation”
refers to the possibility for the model-based dynamic analysis techniques to also detect the
frequent natural-language specification flaws, which are obviously impossible to detect using
traditional techniques based only on the informal requirements.
Figure 4. Different views on the context under analysis.
2. Motivation
This thesis provides an integration of existing techniques and proposes new approaches to
cope with the system level dependability evaluation issues of complex and critical computer-based
control systems. An application of the new approaches is also shown for a modern
railway control system. In the remainder of this section we explain why traditional safety and
availability evaluation techniques prove inadequate and which means we propose in
this thesis to overcome such limitations.
Computer-based control systems have grown in complexity and criticality. The growth in
complexity is an effect of larger, more distributed (even over large territories) and heterogeneous
architectures. The traditional techniques which have been used in the past to predict and evaluate
their dependability have proved inadequate to manage such increased complexity while
retaining power of analysis and accuracy of results.
A critical computer system must above all be available and safe. Availability and
safety are conditioned by both software and hardware contributions. In particular, let us refer
to an extract of a “V” development model taken from one of the RAMS standards for critical
systems [33], which is shown in Figure 5.
Figure 5. An extract of the “V” development cycle for critical control systems.
As we can see from the diagram in Figure 5, hardware and software validation are performed
in two distinct phases, after which integration and system testing are performed. At each
validation stage, system hardware or software must be validated against safety and availability
related failure modes, considering both random and systematic errors and using simulative or
formal means of analysis. In order to manage complexity, incremental and differential
approaches are suggested by V&V standards, which allow hardware and software to be tested
independently, at different verification stages. According to such approaches, when evaluating
the dependability attributes of system software, engineers can neglect random errors, as
hardware-induced software errors are demonstrated to be tolerated by fault tolerance mechanisms with
a very low probability of undetected failures (typically lower than 10⁻⁹ F/h; see [13]).
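To give a feeling for the magnitude of such a failure rate, a constant-rate (exponential) model yields the probability of at least one undetected failure over a given mission time; the sketch below uses the 10⁻⁹ F/h bound cited above, while the mission durations are merely illustrative:

```python
import math

def p_failure(rate_per_hour: float, mission_hours: float) -> float:
    """P(at least one failure in [0, t]) under a constant failure rate:
    1 - exp(-lambda * t)."""
    return 1.0 - math.exp(-rate_per_hour * mission_hours)

rate = 1e-9  # upper bound on the undetected-failure rate (F/h)
for years in (1, 10, 30):
    hours = years * 8760.0
    print(f"{years:>2} y: P ≈ {p_failure(rate, hours):.2e}")
```

Over a 30-year service life this probability remains on the order of 10⁻⁴, which is why such hardware-induced errors can reasonably be neglected in the software-level analyses that follow.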
From the above considerations we can infer that, assuming validated hardware, software
safety and availability are only related to functional correctness (e.g. the absence of deadlocks).
Functional correctness at the software level is obviously necessary to assure system safety. It
can be evaluated using functional testing (a simulative approach) or model checking (a formal
approach); however, model checking techniques are very hard to apply effectively to
complex control systems, as they suffer from the state space explosion problem. Even when
they are applicable (at least to subsystems or protocols), they give only a complementary
contribution to system verification, as witnessed by past experiences [77]. This is the reason
why in this thesis we stress a model-based functional analysis approach to cope with software
functional validation: the proposed semi-formal approach balances the advantages of a formal
analysis (which helps to ensure correctness) with the flexibility of informal ones. The
functional analysis can be both static and dynamic: static analysis does not need software
execution, while dynamic analysis requires simulation. The model-based static and dynamic
analysis techniques proposed in this thesis help find more errors more quickly than
traditional techniques, e.g. those based only on code inspection and black-box testing. Both
types of functional analysis share the goal of finding systematic software errors (also known
as “defects”). The proposed model-based functional analysis process involves several
activities traditionally related to functional (or system) testing and features the following
original contributions:
• the construction of reference models, which guide test engineers throughout the
functional analysis process;
• the static verification and refactoring techniques, based on reverse engineering and on
the Unified Modeling Language;
• the definition and reduction techniques for the input and state variables influencing
system behavior in each operating scenario;
• the test-case reduction rules, some of which are based on the functional decomposition
of the system under test;
• the abstract testing approach, which allows for a configuration-independent test
specification and automatic configuration coverage by means of the proposed
instantiation algorithm.
These features make the proposed methodology:
• model-based, as models of the system under verification are needed to support the
functional analysis process;
• hybrid, as besides verifying the system implementation, its specification is also partly
validated by means of model-based analyses (missing and ambiguous requirements
can be easily detected);
• grey-box, as it needs to access the system's internal status for the software architecture
analysis and for logging the variables of interest;
• abstract, as the test specification results are not linked to a particular installation of the
control system.
The functional analysis methodology introduced above is described in Chapter II.
As mentioned above, the hardware validation against safety is based on well-established
techniques, meant to reduce the probability of occurrence of unsafe hardware failures to a
broadly acceptable level. As unsafe failure modes are relatively easy to define and combine,
system-level safety analyses are feasible with the usual techniques (e.g. Fault Tree Analysis)
based on “safe” simplifying assumptions. The hardware validation against availability is
instead problematic at the system level when non-trivial failure modes are defined, each one
requiring specific analyses. The choice of using more reliable components, which is
sometimes possible by respecting constituent-level reliability requirements, cannot be
considered satisfactory, as cost would increase uncontrollably and the overall system
availability with respect to a given failure mode would still be undefined. Moreover, even
reliable hardware is not guaranteed to perform adequately, hence timing
failures can still occur. Therefore, the risk is to either oversize components or overestimate
system availability. When measurement data from real system installations becomes
available, it may be too late or too costly to modify design choices. Predicting system
hardware availability from the early design stages is therefore very important. However,
traditional modeling techniques are either too weak in expressive power or too inefficient to
be employed for a system-level analysis, especially when taking into account component
dependencies (e.g. common failure modes) and articulated maintenance policies. A
multiformalism approach, which allows effectiveness and efficiency to be fine-tuned by
modeling different parts of the overall system with different formalisms, is the solution we
propose in this thesis to cope with the problem of hardware availability prediction.
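The baseline against which such techniques are measured is the classical combinatorial prediction, in which the steady-state availability of a repairable component is A = MTBF / (MTBF + MTTR) and independent components combine multiplicatively in series and as 1 − Π(1 − Aᵢ) in parallel. A minimal sketch, with purely illustrative component data:

```python
# Combinatorial availability prediction (component data are hypothetical).
# This flat model assumes independence, which is exactly the assumption that
# component dependencies (common failure modes, shared repair resources) break.

from math import prod

def availability(mtbf_h, mttr_h):
    """Steady-state availability of a repairable component, A = MTBF/(MTBF+MTTR)."""
    return mtbf_h / (mtbf_h + mttr_h)

def series(avails):
    """All components must be up: availabilities multiply."""
    return prod(avails)

def parallel(avails):
    """Redundant components: unavailable only if all are down."""
    return 1.0 - prod(1.0 - a for a in avails)

cpu = availability(mtbf_h=50_000, mttr_h=4)   # hypothetical board data
link = availability(mtbf_h=20_000, mttr_h=2)  # hypothetical link data

# A duplicated (redundant) link in series with a single CPU board.
system = series([cpu, parallel([link, link])])
print(f"A_system = {system:.6f}")
```

Precisely because this independence assumption fails for common failure modes and articulated maintenance policies, a system-level study needs the richer (and costlier) formalisms that multiformalism lets the modeler confine to the submodels that actually require them.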
Composition operators are needed for a tighter interaction between submodels in explicit
multiformalism applications. A highly cohesive system-level model, which is only possible
using composition operators, allows for any kind of analysis (e.g. “what if?”), both functional
and structural (e.g. to evaluate the system-level impact of component-related parameters or
design choices); such operators are therefore orthogonal to both simulation-based and formal
model-based techniques, for safety as well as availability evaluation. This is why we perform in
this thesis a study of multiformalism composition operators, referring in particular to a
possible implementation in the OsMoSys framework. While all multiformalism frameworks
provide some means of model connection, no study has exhaustively presented the issues
related to composition operators (e.g. required features, attributes, preservation of properties,
etc.); furthermore, composition operators have not yet been extensively addressed in the
OsMoSys framework. An integration of such
operators into the existing OsMoSys multiformalism multi-solution framework is then
proposed in this thesis, exploiting the specific features of its graph-based, object-oriented
architecture and user interface. However, we do not claim to overcome all the difficulties of
checking properties on large composite models, which remains one of the most challenging
open problems in the formal-methods research community. The solution of such
models is itself a non-trivial problem, which we do address. Multiformalism
dependability evaluation is discussed in Chapter III of this thesis.
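As a rough illustration of the multiformalism idea (a toy sketch, not the OsMoSys machinery), a two-state Markov submodel can be solved for its steady-state unavailability, and the result passed through a composition step to a combinatorial fault-tree submodel as a basic-event probability; the rates and the failure logic below are hypothetical:

```python
# Toy multiformalism composition: a Markov-chain formalism captures repair
# behavior; a fault-tree formalism captures system-level failure logic; the
# composition passes the solved unavailability between them. All rates and
# the tree structure are illustrative assumptions.

def markov_unavailability(lam, mu):
    """Steady-state unavailability of a 2-state repairable component:
    U = lambda / (lambda + mu)."""
    return lam / (lam + mu)

def ft_or(*probs):
    """OR gate: the output fails if any input fails (inputs assumed independent)."""
    q = 1.0
    for p in probs:
        q *= (1.0 - p)
    return 1.0 - q

def ft_and(*probs):
    """AND gate: the output fails only if all inputs fail."""
    out = 1.0
    for p in probs:
        out *= p
    return out

# Hypothetical failure/repair rates (per hour), solved in the Markov formalism.
u_sensor = markov_unavailability(lam=1e-4, mu=0.25)
u_channel = markov_unavailability(lam=5e-5, mu=0.5)

# Composition: the solved unavailabilities become fault-tree basic events.
# Top event: both redundant channels down OR the shared sensor down.
top = ft_or(ft_and(u_channel, u_channel), u_sensor)
print(f"P(top event) = {top:.3e}")
```

The efficiency argument is visible even at this scale: the expensive formalism (state-based) is used only where repair dynamics matter, while the cheap combinatorial formalism aggregates the results.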
The methodological contributions presented in this thesis have all been validated by applying
them to the same case-study, which is complex enough to provide plenty of possible analyses,
both from the structural and behavioral points of view. In particular, the case study is the
European Railway Traffic Management System / European Train Control System
(ERTMS/ETCS), a European standard aimed at improving the performance, reliability, safety
and interoperability of trans-European railways [147], which is used in Italy on high-speed
railway lines (the “Alta Velocità” system). ERTMS/ETCS specifies a complex distributed railway
control system featuring strict safety and availability requirements. One of the innovations of
this thesis is the presentation of useful results about how to perform system testing and how to
size the reliability and maintainability parameters of an ERTMS/ETCS implementation. Without
the methodologies presented in this thesis, such a complex system would have been almost
impossible to validate, owing to the difficulty of managing its complexity when assessing its
system-level safety and availability with traditional approaches. These results, which are all original
contributions, are provided in Chapter IV.
3. State of the art in dependability evaluation of critical systems
As stated in the previous chapter, the use of differential and incremental approaches appears to be
the only way to manage complexity and criticality of modern control systems. We recall that
the approach followed in this thesis is based on the separation of concerns between:
a. System-level functional analysis, for the detection of software systematic errors
possibly impacting system safety;
b. Evaluation of system-level RAM attributes, with respect to failure modes due to random
hardware faults or performance degradations.
Point (a) is addressed in this thesis by using model-based static and dynamic functional
analysis approaches, while point (b) is addressed by means of multiformalism techniques; in
this section we therefore divide our state-of-the-art discussion between these two aspects,
discussed in Subsections §3.1 and §3.2, respectively.
3.1. Functional analysis of critical systems
Functional analysis of critical systems comprises both static and dynamic activities aimed at
detecting systematic software errors. Besides module testing, which is usually performed in
the downstream phase of the “V” life-cycle diagram, two main activities are widespread in
industrial V&V practice:
• Code inspection (a static analysis technique)
• Functional testing (a dynamic analysis technique)
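As a toy illustration of the first, static activity, a single checklist rule can be mechanized so that it flags defects without ever executing the code; the rule chosen here (comparing to None with `==`) and all names are illustrative assumptions, not items from the checklists the thesis refers to:

```python
# Toy static-analysis check: walk the abstract syntax tree of a source fragment
# and flag comparisons of the form "x == None" (a common checklist item in
# Python-style code reviews). Purely illustrative; a real inspection tool
# implements many such rules.

import ast

SOURCE = """
def check_speed(v):
    if v == None:
        return 0
    return v
"""

def find_eq_none(tree):
    """Return the line numbers where '== None' comparisons occur."""
    findings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Compare) and any(
            isinstance(op, ast.Eq) for op in node.ops
        ):
            if any(
                isinstance(c, ast.Constant) and c.value is None
                for c in node.comparators
            ):
                findings.append(node.lineno)
    return findings

violations = find_eq_none(ast.parse(SOURCE))
print("'== None' found at lines:", violations)
```

Note that the defect is found without running `check_speed` at all, which is exactly what distinguishes this kind of analysis from functional testing.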
The first activity is also known as code review or walkthrough and is usually based on
checklists [16]. Its aim is to reveal misuse of code statements and detect errors which are
evident from a static analysis of the code structure. Such an activity is important not only for
coarse-grained error detection, but also for finding code defects which often remain undetected
during the subsequent dynamic analysis stage. In fact, the limited execution time of system
tests is usually unable to reveal latent errors due to wrong assignment of values to variables or
pointers. However, static analysis of software could be much more effective: errors that could
have been detected by a more accurate static analysis are often revealed only in the subsequent
dynamic analysis stage. Of course, errors discovered late are more costly and
difficult to correct. A way to improve code inspection both in effectiveness and efficiency to