.. _protocol-design: Protocol design with Alloy ************************** .. index:: distributed protocol; leader election In this chapter we will explain how to specify and analyse a distributed protocol using Alloy. We will use a very simple example of a leader election protocol. The aim of a leader election protocol is to designate one node in a network as the *organizer* or *leader* of some distributed task. Nodes are initially unaware of which node will play that role, but after running the protocol all nodes should recognise the same node as leader. Many different leader election protocols exist: here we will use a well-known version that can be used to elect a leader in a network of nodes organised in a ring, and where nodes also have unique identifiers. This is an example commonly used when presenting idioms for behaviour specification in Alloy [MIT12]_. This protocol was proposed by Chang and Roberts [CACM79]_ and roughly operates as follows: the goal is to elect as leader the node with the highest identifier; each node starts by sending its own identifier clockwise; upon receiving an identifier, a node only forwards it if it is higher than its own; a node receiving back its own identifier designates itself as leader; finally the leader broadcasts its newly found role to the network. In our specification we will not model this last (trivial) step, and will only be concerned with the fundamental property of a leader election protocol: eventually exactly one (exactly one) node will designate itself as the leader. Specifying the network configuration ==================================== When designing a distributed protocol, we should start by specifying the network configurations in which it is supposed to operate. In this case, the protocol operates in a ring network of nodes with unique identifiers. Notice that many different network configurations satisfy this requirement: we could have rings of different size, and for the same size, different assignments of the unique identifiers to the nodes (the relevant fact here is the relative order in which the identifiers appear in the network). The verification of the expected properties of the protocol should take into account all possible different configurations (up to the specified bound on the ring size). The network configuration, although arbitrary, does not change during the execution of the protocol. As such, it will be specified with immutable sets and relations. Unlike some formal specification languages, Alloy has no pre-defined notion of *process* or *node*. These have to be explicitly specified, by declaring a signature. Since the network of nodes forms a ring, a binary relation that associates each node with its immediate successor in the ring should also be declared. .. code-block:: sig Node { succ : one Node } .. index:: command; run Recall that binary relations should be declared as fields inside the domain signature. In the declaration of field :code:`succ` the multiplicity :code:`one` is used to ensure that each node has exactly one successor in the ring. As usual in Alloy, we can immediately start validating our specification by asking for example instances using a :code:`run` command. .. code-block:: run example {} By iterating over the returned instances with :guilabel:`New` we could get the following instance, which is not a valid ring. .. image:: instance1.png :width: 400 :align: center In addition to requiring each node to have exactly one successor, a simple way to ensure that a network forms a ring is to require that every node is reachable from any other node. Given a node :code:`n`, to determine the set of nodes reachable from :code:`n` via relation :code:`succ` we can use expression :code:`n.^succ`. This expression makes use of the composition (:code:`.`) and transitive closure (:code:`^`) operators already presented in chapter :ref:`structural-design`. The constraint that :code:`succ` forms a ring can thus be specified in a fact as follows. .. code-block:: fact { // succ forms a ring all n,m : Node | m in n.^succ } An alternative specification is to require that the set of all nodes (denoted by :code:`Node`) is reachable from every node. .. code-block:: fact { // succ forms a ring all n : Node | Node in n.^succ } If we now iterate over all instances returned by command :code:`example` we will get the 3 possible rings with up to 3 nodes and a configuration without any node, which is also allowed by our specification. The limit of 3 is due to the implicit command scope of 3 atoms per top-level signature. As an example we show the ring with 3 nodes. .. image:: instance2.png :width: 400 :align: center To enforce that we have at least one node we can add the following fact. .. code-block:: fact { // at least one node some Node } .. index:: total order Besides being organised in a ring, nodes must also have unique identifiers. Since we are going to need to compare identifiers, one possibility would be to just use integers to represent them. However, besides being totally ordered, integers carry additional semantics that is not necessary here. In particular, integers support arithmetic operations that will not be required in this example. When developing a formal specification it is good practice to be as abstract as possible, in particular avoid adding unnecessary restrictions. Also, the analysis of Alloy specifications with integers has some subtleties which makes it better to avoid, unless strictly necessary (see :ref:`integers` for a proper discussion). As such, we will introduce a signature :code:`Id`, to denote identifiers, and a binary relation :code:`next` to capture the total order between them. .. code-block:: sig Id { next : lone Id } Given an identifier, relation :code:`next` shall yield the next identifier in the total order. All identifiers except the last one have a :code:`next`, so the multiplicity of this relation should be :code:`lone`. .. index:: relational operator; reflexive transitive closure To constrain :code:`next` to entail a total order over :code:`Id` we could, for example, declare singleton signatures to denote the :code:`first` and :code:`last` identifiers, require the :code:`last` to have no :code:`next`, and all identifiers to be reachable from the :code:`first` via the (reflexive) transitive closure of :code:`next`. .. code-block:: one sig first, last in Id {} fact { // Id is totally ordered with next no last.next Id in first.*next } .. index:: signature; subset signature signature; sub-signature Here we used :code:`in` instead of :code:`extends` to declare the singleton sets :code:`first` and :code:`last`. Recall that extension signatures are disjoint, while inclusion signatures are arbitrary subsets of the parent signature. In this case we what the latter because when we have just a single :code:`Id`, :code:`first` and :code:`last` should be the same identifier. In the last constraint we used operator :code:`*` to compute the reflexive transitive closure of :code:`next`, to get all identifiers that are reachable in zero or more steps via :code:`next` (the transitive closure would yield all identifiers reachable in one or more steps). The reflexive transitive closure is defined as :code:`*next = ^next + iden`, the union of the transitive closure and the predefined identify binary relation, that relates each element of our universe of discourse to itself. .. index:: ! constant ! constant; iden ! constant; univ ! constant; none relational operator; Cartesian product relational operator; domain restriction relational operator; range restriction type system; arity error .. warning:: Understanding the predefined Alloy constants. There are three predefined (relational) constants in Alloy: * :code:`none`, that denotes the empty set. * :code:`univ`, the set of all atoms in all signatures. * :code:`iden`, the binary relation that associates each atom in :code:`univ` to itself. It is important to know that when there are mutable top-level signatures (declared with :code:`var`), both :code:`univ` and :code:`iden` are mutable also, with the former containing, in each state, all atoms contained in all signatures at that state. It is also important to remember that :code:`iden` relates every element of :code:`univ` to itself. This means that, for example, to check that a binary relation :code:`R` (defined on a signature :code:`A`) is reflexive it is not correct to write :code:`iden in R`, as :code:`iden` will not only contain elements in :code:`A` but also elements in other declared top-level signatures. To obtain the identity relation on signature :code:`A` we could, for example, intersect it with the binary relation containing all pairs of elements drawn from :code:`A`, computed with the Cartesian product operator :code:`->`, as follows: :code:`iden & (A -> A)`. Alloy also has an operator to restrict the domain of a relation that can be used for the same effect: :code:`A <: iden` filters relation :code:`iden` to obtain only the pairs whose first element belongs to :code:`A`. Dually, we could have restricted the range: :code:`iden :> A` filters relation :code:`iden` to obtain only the pairs whose last element belongs to :code:`A`. Thus, to check that the relation :code:`R` is reflexive we could write, for example, :code:`(A <: iden) in R`. Another thing to recall is that :code:`none` denotes the empty set and is a unary relation. So, if we write :code:`R = none`, to check if :code:`R` is empty, we will get a type error since the relations have different arities (:code:`none` is unary and :code:`R` is binary). The empty binary relation is denoted by the Cartesian product :code:`none -> none`: this expression denotes the empty set like :code:`none`, but for the type system it is a binary relation, so writing :code:`R = (none -> none)` does not raise a type error. Of course, a simpler way to check that :code:`R` is empty is to write :code:`no R`: the cardinality check keywords like :code:`no` can be used with relational expressions of arbitrary arity. Having our identifiers totally ordered, we can now declare a field :code:`id` inside signature :code:`Node`, to associate each node with its identifier. .. code-block:: sig Node { succ : one Node, id : one Id } To ensure that identifiers are unique it suffices to require :code:`id` to be injective, which can be done as follows. .. code-block:: fact { // ids are unique all i : Id | lone id.i } If we change the :code:`example` command to ask for all configurations with exactly 3 nodes and 3 identifiers .. code-block:: run example {} for exactly 3 Node, exactly 3 Id .. index:: symmetry breaking and iterate over all instances with :guilabel:`New`, we will get the two only (truly) different configurations with this size: one where the identifiers increase as we move clockwise in the ring, and another where they decrease. The former is depicted below. This illustrates the power of Alloy's symmetry breaking, which in this case prevents the return of any other instance (that necessarily would be isomorphic to one of these two). .. image:: instance3.png :width: 400 :align: center Using modules to structure a specification ========================================== .. index:: module Requiring a signature to be totally ordered is quite common. To simplify the specification of such common patterns, and also to enable breaking a large specification into more manageable pieces, Alloy supports the concept of *module*, which behaves similarly to modules in other specification and programming languages. It is possible to create a module with some definitions and then import it in another module. For example we could define a module for totally ordered identifiers as follows. .. code-block:: module orderedids sig Id { next : lone Id } one sig first, last in Id {} fact { // Id is totally ordered with next no last.next Id in first.*next } The first line in a module consists of the keyword :code:`module` followed by the module name. Module names should match the path of the respective file, so we should save it the with name :code:`orderedids.als` and be in the same directory as the module that will import it. Having factored out these declarations into module :code:`orderedids`, we can now delete them from our main module, and instead import this new module by adding the following :code:`open` statement at the beginning of the specification. .. code-block:: open orderedids .. index:: module; parameterised Module :code:`orderedids` is not ideal, since it commits to a specific signature name to be totally ordered. To make it more reusable, we could instead parametrise it by the signature one wishes to be totally ordered. This poses a problem, because our formalisation of a total order requires declaring field :code:`next` inside the signature to be totally ordered, and modules cannot change the definition of the parameter signature. There are some tricks to solve this problem. One possibility is to declare an auxiliary inclusion signature inside the parameter signature, force it to be exactly the same as the parent signature using a :code:`fact`, and declare relation :code:`next` inside that auxiliary signature instead. The definition of our parametrised module (now called just :code:`ordered`) would look as follows. .. code-block:: module ordered[A] sig Aux in A { next : lone Aux } fact { Aux = A } one sig first, last in A {} fact { // A is totally ordered with next no last.next A in first.*next } To use this parametrised module, we should declare back signature :code:`Id` in our main specification, and pass it as a parameter in the :code:`open` statement. The main specification should now look as follows. .. code-block:: open ordered[Id] sig Node { succ : one Node, id : one Id } sig Id {} fact { // at least one node some Node } fact { // succ forms a ring all n : Node | Node in n.^succ } fact { // ids are unique all i : Id | lone id.i } run example {} for exactly 3 Node, exactly 3 Id .. index:: module; predefined module; ordering Alloy has some predefined modules in a special :code:`util` namespace. One of them is :code:`util/ordering`, a parametrised module that behaves similarly to our module :code:`ordered`, imposing a total order on a signature passed as parameter. This means that instead of defining our :code:`ordered` module we could simply import this module with the statement .. code-block:: open util/ordering[Id] This predefined module is specified in a slightly different way but, as :code:`ordered`, it exposes singleton sets :code:`first` and :code:`last` and a binary relation :code:`next` that defines the total order. However, if we run the :code:`example` command we get the following result. .. image:: instance4.png :width: 400 :align: center Due to the way they are defined in :code:`util/ordering`, relations :code:`first`, :code:`last`, and :code:`next` are not shown in the visualizer. In principle, this would make it very difficult to understand how the identifiers are ordered, but module :code:`util/ordering` is handled in special way be the Alloy Analyzer, which will attempt to name the inhabitants of the ordered signature sequentially according to the order defined by :code:`next`. In our instance this means that :code:`Id0` is the :code:`first`, which is followed by :code:`Id1`, and then :code:`Id2`, which is the :code:`last`. If we really want to confirm the value of the :code:`next` relation we could go to the evaluator and ask for its value. However, we will get an ambiguity error since :code:`next` is also defined in an integer module that is imported by default. To disambiguate we could type the qualified name :code:`ordering/next` instead. .. image:: evaluator1.png :width: 500 :align: center To see what other functions and derived relations are declared in :code:`util/ordering` you can open it in the menu :menuselection:`File --> Open Sample Models...`. For example, we have functions :code:`gt` and :code:`lt` to check if one element is greater than or less than another, respectively, and relation :code:`prev` that returns the predecessor of an element, if it exists. .. warning:: Be aware of exact scopes with :code:`util/ordering`! Besides naming the atoms according to the total order, module :code:`util/ordering` has another peculiarity concerning the analysis. The scope of the signature that is passed as parameter to :code:`util/ordering` is always exact even if not declared with :code:`exactly`. This makes analysis faster but one should always be aware of this when running commands, as it may cause :code:`run` commands to be trivially unsatisfiable and :code:`check` commands to trivially have no counter-examples. Consider the following specification, where we have model a finite set of natural numbers, distinguish between even and odd numbers, and require the first to be even and the last to be odd. .. code-block:: open util/ordering[Nat] abstract sig Nat {} sig Even, Odd extends Nat {} fact { all n : Even | n.next in Odd all n : Odd | n.next in Even first in Even last in Odd } run example {} If we execute the command :code:`example` we get no instance! This is due to the fact that the default scope for :code:`Nat` is 3, and since it is being totally ordered with :code:`util/ordering` this scope is required to be exact. Unfortunately, with exactly 3 atoms in :code:`Nat` it is impossible to satisfy the given constraints. If we change the scope of :code:`example` to 4 then we will obtain one instance. .. image:: instance5.png :width: 400 :align: center This is an example where the naming of the atoms does not help in understanding what is the total order, since the ordered signature :code:`Nat` is abstract and extended by :code:`Even` and :code:`Odd`. By using the evaluator we can see that the total order is :code:`Even0`, :code:`Odd1`, :code:`Even1`, and lastly :code:`Odd0`. .. image:: evaluator2.png :width: 500 :align: center As a last note on scopes, for efficiency reasons the :code:`util/ordering` module also requires the parameter signature to be non-empty (since it forces :code:`first` and :code:`last` to always be defined), meaning that setting its scope to 0 results in an inconsistent specification. .. _specifying-protocol-dynamics: Specifying the protocol dynamics ================================ .. index:: fields; field declaration fields; mutable fields var keyword; mutable fields The coordination between the different nodes in this distributed protocol, like in many others, is achieved by means of message passing. Like nodes, Alloy has no special support for message passing, so all the relevant concepts have to be specified from scratch. In this case messages are just node identifiers, so there is no need to add a signature to model those. To capture the incoming and outgoing messages we can declare mutable binary fields inside signature :code:`Node`, to associate each node with the set of identifiers it has received or that need to be sent, respectively. To declare a mutable field the keyword :code:`var` should be used. .. code-block:: sig Node { succ : one Node, id : one Id, var inbox : set Id, var outbox : set Id } We also need to capture the nodes that have been elected as leaders. To do so we can declare a mutable signature :code:`Elected` that is a subset of :code:`Node`. .. code-block:: var sig Elected in Node {} If we now run the :code:`example` command we might get the following instance. .. image:: instance6.png :width: 700 :align: center .. index:: Alloy Analyzer; visualizer trace Once a specification has mutable elements, its instances take the shape of traces. The first thing to be noticed is that the interface of the visualizer changed to adapt to this fact: it now shows two instance snapshots side-by-side, there are several new instance exploration buttons in the toolbar, and a depiction of a path just below those. This new interface is due to the addition of mutable relations in our specification. As expected, the value of a mutable relation may change over time, so an instance of our specification can no longer be captured by a single static valuation of the declared signatures and fields. Instead, instances are now infinite sequences of states (aka paths or traces), being each state a (possibly different) valuation for the declared signatures and fields. As such, the visualizer now shows a (finite) depiction of the (infinite) trace, numbering the different states, and including a loop back that marks a segment of the trace that will repeat indefinitely. In this case we have a trace where the same state 0 repeats itself indefinitely, something that is possible since so far we added no restrictions that constrain the behaviour of mutable relations. Below the depiction of the trace, two of its consecutive states are detailed side-by-side. The concrete states that are being depicted are shown in white in the trace depiction above. In this case the two states being depicted are the first two states, which are equal. A thing to note in the depiction of the states is that, by default, immutable sets and fields (the configuration of the protocol) are shown with solid lines, while mutable ones are shown with dashed lines. The instance exploration buttons in the toolbar of the visualizer will be explained as needed. To simplify the visualisation, we can configure the theme so that node identifiers, the inbox, and the outbox are shown as node attributes, and change the color of elected nodes to green (see :ref:`theme-customisation` for an explanation of how the theme can be configured to do so). Since the elected nodes can now be easily distinguished by colour we can also hide the respective textual label by turning off the setting :guilabel:`Show as labels` in the respective signature. The result of this customisation will be as follows. .. image:: instance7.png :width: 600 :align: center .. index:: initial state Something that is clearly wrong in our instance is that in the initial state nodes already have identifiers in their inboxes and outboxes, and some of them are already elected. To restrict the initial value of these mutable fields we can add the following fact. .. code-block:: fact { // initially inbox and outbox are empty no inbox and no outbox // initially there are no elected nodes no Elected } Since this constraint has no temporal operators (more on those later) it applies to the initial state of the system. Rerunning the :code:`example` commands yields the following instance, where the systems remains indefinitely in the initial state. .. image:: instance8.png :width: 600 :align: center We can explore additional instances by using the instance exploration buttons in the toolbar. For example, the :guilabel:`New Fork` button asks the Analyzer for a different trace with the same behaviour up to the left-hand side state, but a different outcome of the current transition (a different right-hand side state). By pressing this button we get the following. .. image:: instance9.png :width: 600 :align: center This trace displays a behaviour that should not be allowed in a correct execution of the protocol: in the second state all of a sudden several identifiers appeared in the inbox and outbox of :code:`Node1`. Moreover this node was also randomly designated as leader. So far this is allowed because we added no constraints to our specification that restrict how the system can transition from one state to another. If there are no constraints then any transition is possible. .. index:: event event; guard event; effect The simplest way to constraint the valid transitions is to consider all the possible events that can occur in the system, and for each event specify when it can occur (its *guard*) and what is its *effect* on the mutable relations. In our protocol we can distinguish three possible events: * a node *initiates* the execution by sending its own identifier to the next node. * a node reads and *processes* an identifier in its inbox, decides whether it should be propagated or discarded and, if it is its own identifier, elects itself as leader. * the network *sends* one message in the outbox of a node to the next one. Again, Alloy has no special keywords to declare events. A common approach to do so is to declare a predicate for each event, that specifies its guard and effect, and add a constraint enforcing that at each possible state only one of those predicates can hold. Event predicates can be also parametrised, in particular by the node where the event occurs. .. index:: event; frame condition Let's begin by specifying the initiate event. For each event we should always start by specifying what is its guard, a constraint on the current (or pre-) state that determines when can the event occur. To simplify our specification, for the moment, we will assume that a node is free to initiate (or reinitiate) the protocol whenever it wants, so in this case there will be no guard, meaning the event is always possible. Then we should specify the effect of the event on all mutable relations of the specification: note that if nothing is said about a particular mutable relation then its value can change freely when the event occurs. A special effect is forcing the value of a mutable relation to not change at all, something known as a *frame condition*. For example in this event we will have two frame conditions, since it does not change the value of :code:`inbox` nor :code:`Elected`. To specify that a relation does not change with a logic formula we need to refer to its value in the next (or post-) state. In Alloy the value of a relation (or relational expression) in the next state is accessed by appending a prime. So, to state that, for example, the relation :code:`inbox` does not change we could add the constraint :code:`inbox' = inbox`. Likewise for :code:`Elected`. So far, the specification of event initiate looks as follows. .. code-block:: pred initiate[n : Node] { // node n initiates the protocol inbox' = inbox // frame condition on inbox Elected' = Elected // frame condition on Elected } We now need to specify the effect on relation :code:`outbox`. We can start by specifying its effect on the outbox of the initiating node. Expression :code:`n.outbox` denotes the set of identifiers in the :code:`outbox` of node :code:`n` in the current state. The effect of this event on this set is to add :code:`n`'s own identifier, which can be specified by :code:`n.outbox' = n.outbox + n.id`. Here we used the set union operator :code:`+` to add the singleton set :code:`n.id` to :code:`n.outbox`. Note that requiring :code:`n.id` to be present in the outbox in the next state, which could be specified by :code:`n.id in n.outbox'`, is not sufficient, as it would allow identifiers other than :code:`n.id` to be freely removed or added from :code:`n.outbox`. Having specified the effect on the :code:`outbox` of `n` we now need to specify the effect on the :code:`outbox` of all other nodes. Again, if nothing is said about those, they will be allowed to vary freely. Of course this event should not modify the outboxes of other nodes, something we could specify with constraint :code:`all m : Node - n | m.outbox' = m.outbox`, that quantifies over all nodes except the initiating one. The final specification of our event looks as follows. .. code-block:: pred initiate[n : Node] { // node n initiates the protocol n.outbox' = n.outbox + n.id // effect on n.outbox all m : Node - n | m.outbox' = m.outbox // effect on the outboxes of other nodes inbox' = inbox // frame condition on inbox Elected' = Elected // frame condition on Elected } Other, shorter or clearer ways to specify effects and frame conditions are presented in Section :ref:`specifying frame conditions`. In the remainder of this chapter, we will continue to use the more verbose style, but you are highly encouraged to use a terser style once it is familiar. .. index:: temporal operator; always To validate our first event we can impose a restriction stating that, for the moment, this is the only event that can occur, and inspect the possible instances to see if the system behaves as expected. To specify such a restriction we need the temporal logic operator :code:`always` to enforce a formula to be true at all possible states. In particular, to enforce that, at every possible state, one of the nodes initiates the protocol, the following fact can be added to the specification. .. code-block:: fact { // possible events always (some n : Node | initiate[n]) } Running command :code:`example` yields the following trace, where :code:`Node2` initiates the protocol and then nothing else happens. .. image:: instance10.png :width: 600 :align: center .. index:: stuttering On first glance, this may seem like an invalid trace, given that we required that at every state one node initiates the protocol. However, if we look closely at our specification of event :code:`initiate` we can see that it also holds true in a state where a node already has its own identifier in the outbox, which will remain in the outbox. Our specification did not had a guard forbidding this occurrence of the event, and the effect :code:`n.outbox' = n.outbox + n.id` is also true if :code:`n.id` is already in :code:`n.outbox` and in :code:`n.outbox'`. To address this issue we could add a guard like :code:`n.id not in n.outbox` to the specification of :code:`initiate`. However, this only forbids a node from initiating the protocol if it's own identifier is at the moment in its outbox. This means that, once we include an event to send messages and this identifier leaves the outbox, a node will be able to reinitiate the protocol. This does not pose any problem concerning the correctness of this particular protocol, but if we really wanted each node to initiate the protocol only once we would need to strengthen this guard. One possibility would be to add a field to record at which phase of the protocol execution each node is in, and allow this event only when a node is in an "uninitiated" phase. This would be the common approach in most formal specification languages. However, with Alloy we have a more direct alternative. Since there is no special language to specify events and they are specified by arbitrary logic formulas, we are free to use temporal operators to specify guards or effects. By contrast, in most formal specification languages we can only specify the relation between the pre- and post-state. .. index:: temporal operator; historically In this case we could, for example, use the temporal operator :code:`historically`, that checks if something was always true in the past up to the current state, to only allow :code:`initiate` to occur if the node's identifier was never in its outbox. The specification of our event would look as follows. .. code-block:: pred initiate[n : Node] { // node n initiates the protocol historically n.id not in n.outbox // guard n.outbox' = n.outbox + n.id // effect on n.outbox all m : Node - n | m.outbox' = m.outbox // effect on the outboxes of other nodes inbox' = inbox // frame condition on inbox Elected' = Elected // frame condition on Elected } Unfortunately, after adding this guard our command will return no trace, meaning there are no behaviours that satisfy our constraints. This happens because we are currently not allowing messages to be sent or read, so after all nodes have initiated the protocol nothing else is allowed to happen by our specification, and it is impossible to obtain a valid trace. Recall that traces are infinite sequences of states, where the system is always evolving. A simple way to solve this issue is to add an event allowing the system to *stutter*, that is to do nothing (keeping the value of all relations unchanged). Stuttering can also be understood as an event specifying something else that is occurring on the environment outside the scope of our specification, and in general it is a good idea to add such stuttering events. In this case, a predicate to specify a stuttering event can be specified as follows. .. code-block:: pred stutter { outbox' = outbox inbox' = inbox Elected' = Elected } To allow this event we need to change the above fact that restricts the possible behaviours. .. code-block:: fact { // possible events always (stutter or some n : Node | initiate[n]) } .. index:: Alloy Analyzer; enumeration Running command :code:`example` returns a trace where nothing happens in our system, something that is now allowed. If we press :guilabel:`New Fork` we might get a trace such as the previously depicted, where :code:`Node2` initiates the protocol and then nothing else happens. If we move to the next state by pressing :guilabel:`→`, and fork again with :guilabel:`New Fork`, we might get the following trace where :code:`Node0` initiates the protocol after :code:`Node2`. .. image:: instance11.png :width: 600 :align: center By exploring a bit more the possible traces we can get some confidence that event :code:`initiate` is well specified. Let us now specify the :code:`send` event. In this event we could have as parameters the node :code:`n` from where the message will be sent and the message :code:`i` that will be sent (an identifier). The guard of this event should require :code:`i` to be in the :code:`outbox` of :code:`n`. The effect will be to remove :code:`i` from the :code:`outbox` of :code:`n`, and add it to the :code:`inbox` of the next node :code:`n.succ`. All other inboxes and outboxes should keep their value, as well as the :code:`Elected` set. The specification of this event would be the following. .. code-block:: pred send[n : Node, i : Id] { // i is sent from node n to its successor i in n.outbox // guard n.outbox' = n.outbox - i // effect on n.outbox all m : Node - n | m.outbox' = m.outbox // effect on the outboxes of other nodes n.succ.inbox' = n.succ.inbox + i // effect on n.succ.inbox all m : Node - n.succ | m.inbox' = m.inbox // effect on the inboxes of other nodes Elected' = Elected // frame condition on Elected } .. index:: logical operator; else For event :code:`process` again we could have as parameters the node :code:`n` and the message :code:`i` to be read and processed. The guard should require that :code:`i` is in the :code:`inbox` of :code:`n`. The effect on :code:`inbox` is obvious: :code:`i` should be removed from the :code:`inbox` of :code:`n` and all other inboxes should keep their value. The effect on the :code:`outbox` of :code:`n` depends on the whether :code:`i` is greater than :code:`n.id`: if so, :code:`i` should be added to the :code:`outbox`, to be later propagated along the ring; if not, it should not be propagated, meaning the :code:`outbox` of :code:`n` will keep its value. To write such conditional outcome we could use the logical operator :code:`implies` together with an :code:`else`: :code:`C implies A else B` is the same as :code:`(C and A) or (not C and B)`, but the former is easier to understand. The event has no effect on the outboxes of other nodes, which should all keep their value. The same conditional outcome applies to :code:`Elected`: if the received :code:`i` is equal to the :code:`n`'s own identifier, then :code:`n` should be included in :code:`Elected`; else :code:`Elected` keeps its value. The full specification of :code:`process` is as follows. .. code-block:: pred process[n : Node, i : Id] { // i is read and processed by node n i in n.inbox // guard n.inbox' = n.inbox - i // effect on n.inbox all m : Node - n | m.inbox' = m.inbox // effect on the inboxes of other nodes gt[i,n.id] implies n.outbox' = n.outbox + i // effect on n.outbox else n.outbox' = n.outbox all m : Node - n | m.outbox' = m.outbox // effect on the outboxes of other nodes i = n.id implies Elected' = Elected + n // effect on Elected else Elected' = Elected } We should now add these two events to the fact that constraints the valid behaviours of our system. .. code-block:: fact { // possible events always ( stutter or some n : Node | initiate[n] or some n : Node, i : Id | send[n,i] or some n : Node, i : Id | read[n,i] ) } .. index:: instance; exploration Before proceeding to the verification of the expected properties of the protocol, we should play around with the different instance exploration buttons to explore different execution scenarios and validate the specification of our events. An alternative would be to ask for specific scenarios directly in a :code:`run` command. For example, we could change command :code:`example` to ask directly for a scenario where some node will be elected. To do so we need to use the temporal operator :code:`eventually`, which checks if a formula is valid at some point in the future (including the present state). Our command would look as follows. .. code-block:: run example { eventually some Elected } for exactly 3 Node, exactly 3 Id Running this command will return a trace with 8 states where the highest identifier is passed around until the respective node gets elected. This is the shortest trace where a leader can be elected in a ring with 3 nodes, corresponding to one initiate event followed by 3 interleaved send and read events. The Analyzer guarantees that the shortest traces that satisfy (or refute) a property are returned first. The first transition of this trace is the following, where the node with the highest identifier initiates the protocol. .. image:: instance12.png :width: 600 :align: center In the last transition, that can be focused by pressing :guilabel:`→` 6 times or by directly clicking the state numbered 6 in the trace depiction, the node with the highest identifier reads its own identifier that was passed around in the ring and elects himself as leader. .. image:: instance13.png :width: 600 :align: center Verifying the expected properties of the protocol ================================================= The key property of our protocol is that eventually exactly one node will become leader. We can break this property into to simpler ones: - There will never be more than one leader - Eventually there will be at least one leader .. index:: safety liveness fairness These properties are of very different nature: the former is a *safety* property, forbidding some (undesired) behaviour of the system, while the latter is a *liveness* property, forcing some (desirable) behaviour of the system. The analysis of safety properties is usually simpler than the analysis of liveness properties. To find a counter-example for a safety property it suffices to search for a finite sequence of states that leads to a (bad) state (in this case one where there are two or more leaders), and it is irrelevant what happens afterwards, as any continuation of this finite sequence will still be a counter-example. On the other hand, to find a counter-example for a liveness property it is necessary to search for a complete infinite trace where the expected behaviour definitely never happened (in this case a leader is never elected). Moreover, it will be necessary to impose additional *fairness* conditions when verifying liveness properties, in particular to forbid bogus counter-examples where at some point the system stutters forever and thus the expected behaviour never happens. .. index:: invariant assert command; check Let us start by verifying the first (safety) property. Recall that properties that are expected to hold should be written in named assertions (declared with keyword :code:`assert`) and then verified with :code:`check` commands. This safety property is a very simple example of an *invariant*, a property that requires something to be be true in all states of all possible traces. In Alloy, invariants can be specified using the temporal operator :code:`always`. In each state, to check that there is at most one leader in set :code:`Elected` we could use the keyword :code:`lone`. This invariant could thus be specified as follows. .. code-block:: assert AtMostOneLeader { always (lone Elected) } To check this assertion we could define the following command. .. code-block:: check AtMostOneLeader .. index:: model checking; bounded model checking scopes; steps This protocol is known to be correct, so, as expected, running this command returns no counter-example. Recall that the default scope on top-level signatures is 3, so this command verifies in a single execution that the property holds for all rings with up to three nodes. Moreover, by default, verification of temporal properties is done with a technique known as *bounded model checking*, meaning that the search for counter-examples will only consider a given maximum number of different transitions before the system starts exhibiting a repeated behaviour. By default this maximum number is 10. To increase our confidence in the result of the analysis we could, for example, check this property for all rings with up to 4 nodes and consider up to 20 different transitions (or steps). To do so we could change the scope of the command as follows. .. code-block:: check AtMostOneLeader 4 but 20 steps This commands sets the default scope on signatures to 4 but also changes the scope on transitions to 20 using keyword :code:`steps`. Again, as expected, no counter-example is returned, but the analysis now takes considerably longer, as there are many more network configurations and states to explore. .. index:: model checking; complete model checking It is also possible to check a temporal property with *unbounded model checking*, taking into consideration an arbitrary number of transitions, but still with the signatures bounded by a maximum scope. To do so, a special scope on :code:`steps` is needed to trigger unbounded temporal analysis. .. code-block:: check AtMostOneLeader 4 but 1.. steps .. index:: Alloy Analyzer; options solver However, for such analysis to be performed it is necessary to select a solver in the :menuselection:`Options --> Solver` menu that supports it. By default, the Alloy Analyzer is not pre-packaged with solvers that support unbounded model checking. You first need to install either the `NuSMV `_ or the `nuXmv `_ model checker, and then select :code:`Electrod/NuSMV` or :code:`Electrod/nuXmv` as solver, respectively. The Analyzer shows that the safety property holds even with an arbitrary number of transitions. The second property is also a very simple example of a liveness requirement, namely one that requires something to hold at least in one state of every trace. Such properties can be specified directly with the temporal operator :code:`eventually`. Inside this temporal operator, again a very simple cardinality check on :code:`Elected` suffices, this time with keyword :code:`some`. .. code-block:: assert AtLeastOneLeader { eventually (some Elected) } .. index:: command; check The following command can be used to check this property with bounded model checking and the default scopes. .. code-block:: check AtLeastOneLeader Both :code:`Electrod/NuSMV` and :code:`Electrod/nuXmv` can be used to check this command, since they also support bounded model checking. Alternatively, we can also choose one of the SAT solvers pre-packaged with the Analyzer in the :menuselection:`Options --> Solver` menu, for example the `SAT4J `_ solver. Feel free to experiment with different solvers when analysing a property, as the performance can sometimes vary significantly just by choosing a different solver. We recommend that you start by verifying specifications with bounded model checking, because it is usually much faster than the unbounded counterpart, and most counter-examples tend to require only a handful of transitions. Once you have a design where no counter-example is returned with bounded model checking, you can increase your confidence on the analysis by using unbounded model checking. As hinted above, running this command immediately returns a counter-example where the system stutters forever, and obviously no leader is elected. .. image:: instance14.png :width: 600 :align: center .. index:: event; enabled This behaviour is allowed because we added a stutter event that captures events external to our system. At the moment this event can occur indefinitely, which is a bit unrealistic or *unfair* to the system under analysis: for example, if at some point the system is continuously ready to execute an event (the event is continuously *enabled*), then it should eventually occur. For example, in our protocol, a trace where at some point a message is in the inbox of a node but is never read is unfair to that node. To specify such fairness properties we first need to understand two well-known combinations of the :code:`always` and :code:`eventually` temporal operators: - :code:`eventually always p` holds in a system if all traces reach a state where :code:`p` becomes valid indefinitely. - :code:`always eventually p` holds in a system if in every state of every trace :code:`p` is later valid, which means that :code:`p` is true infinitely often. To specify that an event is at some point continuously enabled we can use the first combination. But first we need to specify when is a particular event enabled (ready to occur). Most of the times this condition is exactly the same as the guard of the event, but in some events there can be additional conditions that derive from its effect. In our protocol, the enabled condition for all events is the same as the guard. For example, given a node :code:`n` and identifier :code:`i`, event :code:`process[n,i]` is enabled when :code:`i in n.inbox`. So, to specify that event :code:`process[n,i]` becomes continuously enabled at some point we could write :code:`eventually always i in n.inbox`. However, if, for example, the outbox of a node was limited to contain at most one identifier, then we would have additional enabling conditions, as the effect :code:`n.outbox' = n.outbox + i` that occurs when :code:`gt[i,n.id]` would also require that :code:`n.outbox` is empty. To specify that a continuously enabled event eventually occurs (after the point it becomes enabled) the second combination of operators presented above can be used. Note that a continuously enabled event would still be (continuously) enabled after the required occurrence of the event, which means that the event should occur again and again, or infinitely often. So, the desired fairness condition for event :code:`process[n,i]` can be specified as follows. .. code-block:: (eventually always i in n.inbox) implies (always eventually process[n,i]) .. index:: fairness; weak fairness; strong Technically, this fairness condition is known as *weak fairness*, as the event is only required to happen if continuously enabled. Often this suffices to verify most liveness properties. However, sometimes we may need *strong fairness* constraints, requiring that the event occurs when it becomes recurrently enabled (infinitely often, but not necessarily continuously). For event :code:`process[n,i]` such strong fairness constraint would be specified as follows. .. code-block:: (always eventually i in n.inbox) implies (always eventually process[n,i]) In our example property, the liveness property holds if we require weak fairness for all instances of three events of the system, as specified in the following predicate. .. code-block:: pred fairness { all n : Node, i : Id { (eventually always historically n.id not in n.outbox) implies (always eventually initiate[n]) (eventually always i in n.inbox) implies (always eventually process[n,i]) (eventually always i in n.outbox) implies (always eventually send[n,i]) } } Finally, if we change the :code:`AtLeastOneLeader` assertion to assume fairness, no counter-example will be returned even with increased scopes, as expected. .. code-block:: assert AtLeastOneLeader { fairness implies eventually (some Elected) } To better understand fairness constraints, suppose we commented out the fairness restriction on event :code:`process`. .. code-block:: pred fairness { all n : Node, i : Id { (eventually always historically n.id not in n.outbox) implies (always eventually initiate[n]) -- (eventually always i in n.inbox) implies (always eventually process[n,i]) (eventually always i in n.outbox) implies (always eventually send[n,i]) } } The :code:`check AtLeastOneLeader` command would immediately return a counter-example in a configuration with a single node, that initiates the protocol, sends its own identifier to himself, and then never processes that identifier (something that would be allowed at this point, since neither :code:`initiate` nor :code:`read` are enabled). The last two states before the trace starts stuttering are shown below. .. image:: instance15.png :width: 600 :align: center This trace is fair to :code:`initiate` and :code:`send`, but not to :code:`read`, which is always enabled in the stuttering part of the trace but never occurs. .. warning:: Avoid fairness conditions in facts! Instead of a predicate we could have specified the fairness conditions inside a :code:`fact`, to force them to be true in all analyses. However, in general that is not a good idea, since :code:`run` commands would then only return fair traces, which hinders the interactive exploration of scenarios and requires much larger bounds on the number of steps (making the analysis slower). For example, command .. code-block:: run {} for exactly 3 Node, exactly 3 Id that asks for any trace in a ring with 3 nodes will not even return a trace with the default scope of 10 :code:`steps`, because no fair trace with this size exists for this configuration. The minimum scope would be 14 :code:`steps`! This also means that a command like :code:`check AtMostOneLeader`, to verify the safety property of the protocol, would be mostly pointless, as with the default scope of 10 :code:`steps` only configurations with up to 2 nodes would be checked for correctness. Without fairness imposed in a fact, it would at least check that the safety property is not violated in the first 10 steps of any trace of the protocol, including unfair and fair ones. Recall that in a counter-example of a safety property it is irrelevant what happens after we reach a faulty state: any continuation of the "bad" trace prefix would still be a counter-example, including any fair continuation. Making the specification more abstract ====================================== .. caution:: This section is still under construction .. code-block:: open util/ordering[Node] sig Node { succ : one Node, var inbox : set Node } fact { // at least one node some Node } fact { // succ forms a ring all n : Node | Node in n.^succ } fact { // initially inbox is empty no inbox } fun Elected : set Node { { n : Node | n not in n.inbox and once (n in n.inbox) } } pred initiate[n : Node] { // node n initiates the protocol historically n not in n.succ.inbox // guard n.succ.inbox' = n.succ.inbox + n // effect on n.succ.inbox all m : Node - n.succ | m.inbox' = m.inbox // effect on the inboxes of other nodes } pred process[n : Node, i : Node] { // i is read and processed by node n i in n.inbox // guard n.inbox' = n.inbox - i // effect on n.inbox gt[i,n] implies n.succ.inbox' = n.succ.inbox + i // effect on n.succ.inbox else n.succ.inbox' = n.succ.inbox all m : Node - n.succ - n | m.inbox' = m.inbox // effect on the inboxes of other nodes } pred stutter { inbox' = inbox } fact { // possible events always ( stutter or some n : Node | initiate[n] or some n : Node, i : Node | process[n,i] ) } run example { eventually some Elected } for 3 Node assert AtMostOneLeader { always (lone Elected) } check AtMostOneLeader for 5 but 1.. steps pred fairness { all n : Node, i : Node { (eventually always historically n not in n.succ.inbox) implies (always eventually initiate[n]) (eventually always i in n.inbox) implies (always eventually process[n,i]) } } assert AtLeastOneLeader { fairness implies eventually (some Elected) } check AtLeastOneLeader for 3 but 1.. steps Exercises ========= #. To practice temporal logic, access the following Alloy4Fun model and specify all the listed constraints. Each constraint is specified in a separate predicate, and for each there is a command that can be used to check its correctness. If the constraint is not correctly specified a counter-example will be shown. The counter-example will include an atom that clarifies if it is an example that the incorrect constraint is currently rejecting but should be accepted or one that is currently being accepted but should be rejected. Unlike in the Alloy Analyzer, in Alloy4Fun only one state of the trace is shown at a time. Use the green arrows to navigate back and forward in the trace. In the last state of the trace before looping back the forward arrow is circle shaped. a. `Messaging system `_ b. `Table reservations `_ #. Specify `Dijkstra's self-stabilizing mutex algorithm `_, that works on arbitrary rings of :code:`N+1` machines, with a distinguished *bottom* machine. In particular, specify the solution with :code:`K` state machines, both with and without a central daemon. The central daemon ensures that only one machine with the *privilege* executes at each time, otherwise an arbitrary number of machines can execute. The rules for determining if a machine has the privilege to execute are as follows: the bottom machine has the privilege if its state is equal to that of its neighbor; a non-bottom machine has the privilege if its state is different from that of its neighbor. When the bottom machine executes it increments its state modulo :code:`K`. When a non-bottom machine executes it just copies the state of its neighbor. The goal is to verify the expected *convergence* and *closure* properties. Convergence is a liveness property that ensures that whatever the initial state, the system will converge to a *legitimate* global state. In this case, a legitimate global state is one where exactly one machine has the privilege to execute and where every machine will eventually obtain that privilege. Closure is a safety property that ensures that after reaching a legitimate global state, the system will forever stay in a legitimate global state. According to the paper, without a central daemon both convergence and closure hold when :code:`K > N`, while with a central daemon it suffices that :code:`K >= N`. Use the Alloy Analyzer to verify if that is indeed the case. Notice that instead of using integers to model the local states of the machine, a ring of :code:`K` values can be used. #. Specify `Chang's echo distributed protocol for forming a spanning tree `_ in an arbitrary connected network with a distinguished *initiator*. In particular, specify *Algorithm 0* described in the paper, that works roughly as follows. The initiator starts by sending an *explorer* message to all its neighbors. Upon receiving a message, each node acts as follows: - If the message is an explorer and it is the first to arrive at the node, mark the sender as the *first* neighbor and send an explorer to all neighbors except the first. Note that it is assumed that the initiator is considered to have been explored *a priori*. - If the message is an explorer but is not the first to arrive or there are no neighbors except the first, send an *echo* message back to the sender. - If the message is an echo, register that an echo as been received from the sender, and if echos from all neighbors (to which explorers were sent) have arrived send an echo to the first neighbor, unless the node is the initiator, in which case the protocol is finished. Verify the expected properties of the protocol, namely that it will eventually finish and that when it finishes the *first* relationship defines a spanning tree of the network rooted at the initiator.