.. _protocol-design:

Protocol design with Alloy
**************************

.. index::
   distributed protocol; leader election


In this chapter we will explain how to specify and analyse a
distributed protocol using Alloy. We will use a very simple example of
a leader election protocol. The aim of a leader election protocol is
to designate one node in a network as the *organizer* or *leader* of
some distributed task. Nodes are initially unaware of which node will
play that role, but after running the protocol all nodes should
recognise the same node as leader. Many different leader election
protocols exist: here we will use a well-known version that can be
used to elect a leader in a network of nodes organised in a ring, and
where nodes also have unique identifiers. This is an example commonly
used when presenting idioms for behaviour specification in Alloy
[MIT12]_. This protocol was proposed by Chang and Roberts [CACM79]_
and roughly operates as follows: the goal is to elect as leader the
node with the highest identifier; each node starts by sending its own
identifier clockwise; upon receiving an identifier, a node only
forwards it if it is higher than its own; a node receiving back its
own identifier designates itself as leader; finally the leader
broadcasts its newly found role to the network. In our specification
we will not model this last (trivial) step, and will only be concerned
with the fundamental property of a leader election protocol:
eventually exactly one (exactly one) node will designate itself as the
leader.

Specifying the network configuration
====================================

When designing a distributed protocol, we should start by specifying
the network configurations in which it is supposed to operate. In this
case, the protocol operates in a ring network of nodes with unique
identifiers. Notice that many different network configurations satisfy
this requirement: we could have rings of different size, and for the
same size, different assignments of the unique identifiers to the
nodes (the relevant fact here is the relative order in which the
identifiers appear in the network). The verification of the expected
properties of the protocol should take into account all possible
different configurations (up to the specified bound on the ring
size). The network configuration, although arbitrary, does not change
during the execution of the protocol. As such, it will be
specified with immutable sets and relations.

Unlike some formal specification languages, Alloy has no
pre-defined notion of *process* or *node*. These have to be explicitly
specified, by declaring a signature. Since the network of nodes forms
a ring, a binary relation that associates each node with its
immediate successor in the ring should also be declared.

.. code-block::

   sig Node {
     succ : one Node
   }

.. index::
   command; run

Recall that binary relations should be declared as fields inside the
domain signature. In the declaration of field :code:`succ` the
multiplicity :code:`one` is used to ensure that each node has exactly
one successor in the ring.  As usual in Alloy, we can immediately start
validating our specification by asking for example instances using a
:code:`run` command.
   
.. code-block::

   run example {}

By iterating over the returned instances with :guilabel:`New` we could
get the following instance, which is not a valid ring.
   
.. image:: instance1.png
   :width: 400
   :align: center

In addition to requiring each node to have exactly one successor, a simple way to ensure that a network forms a ring is to require that
every node is reachable from any other node. Given a node :code:`n`,
to determine the set of nodes reachable from :code:`n` via relation
:code:`succ` we can use expression
:code:`n.^succ`. This expression makes use of the composition (:code:`.`) and transitive closure (:code:`^`) operators already presented in chapter :ref:`structural-design`. The constraint that :code:`succ` forms a ring can thus be specified in a fact as follows.

.. code-block::

   fact {
     // succ forms a ring
     all n,m : Node | m in n.^succ
   }
      
An alternative specification is to require that the set of all nodes
(denoted by :code:`Node`) is reachable from every node.

.. code-block::

   fact {
     // succ forms a ring
     all n : Node | Node in n.^succ
   }

If we now iterate over all instances returned by command :code:`example` we will get the 3 possible rings with up to 3 nodes and a configuration without any node, which is also allowed by our specification. The limit of 3 is due to the implicit command scope of 3 atoms per top-level signature. As an example we show the ring with 3 nodes.

.. image:: instance2.png
   :width: 400
   :align: center

To enforce that we have at least one node we can add the following fact.

.. code-block::

   fact {
     // at least one node
     some Node
   }

.. index::
   total order
   
Besides being organised in a ring, nodes must also have unique identifiers. Since we are going to need to compare identifiers, one possibility would be to just use integers to represent them. However, besides being totally ordered, integers carry additional semantics that is not necessary here. In particular, integers support arithmetic operations that will not be required in this example. When developing a formal specification it is good practice to be as abstract as possible, in particular avoid adding unnecessary restrictions. Also, the analysis of Alloy specifications with integers has some subtleties which makes it better to avoid, unless strictly necessary (see :ref:`integers` for a proper discussion). As such, we will introduce a signature :code:`Id`, to denote identifiers, and a binary relation :code:`next` to capture the total order between them.

.. code-block::

   sig Id {
     next : lone Id
   }

Given an identifier, relation :code:`next` shall yield the next identifier in the total order. All identifiers except the last one have a :code:`next`, so the multiplicity of this relation should be :code:`lone`.

.. index::
   relational operator; reflexive transitive closure

To constrain :code:`next` to entail a total order over :code:`Id` we could, for example, declare singleton signatures to denote the :code:`first` and :code:`last` identifiers, require the :code:`last` to have no :code:`next`, and all identifiers to be reachable from the :code:`first` via the (reflexive) transitive closure of :code:`next`.

.. code-block::

   one sig first, last in Id {}

   fact {
     // Id is totally ordered with next
     no last.next
     Id in first.*next
   }

.. index::
   signature; subset signature
   signature; sub-signature
   
Here we used :code:`in` instead of :code:`extends` to declare the singleton sets :code:`first` and :code:`last`. Recall that extension signatures are disjoint, while inclusion signatures are arbitrary subsets of the parent signature. In this case we what the latter because when we have just a single :code:`Id`, :code:`first` and :code:`last` should be the same identifier. In the last constraint we used operator :code:`*` to compute the reflexive transitive closure of :code:`next`, to get all identifiers that are reachable in zero or more steps via :code:`next` (the transitive closure would yield all identifiers reachable in one or more steps). The reflexive transitive closure is defined as :code:`*next = ^next + iden`, the union of the transitive closure and the predefined identify binary relation, that relates each element of our universe of discourse to itself. 

.. index::
   ! constant
   ! constant; iden
   ! constant; univ
   ! constant; none
   relational operator; Cartesian product
   relational operator; domain restriction
   relational operator; range restriction
   type system; arity error

.. warning:: Understanding the predefined Alloy constants.

   There are three predefined (relational) constants in Alloy:

   * :code:`none`, that denotes the empty set.
   * :code:`univ`, the set of all atoms in all signatures.
   * :code:`iden`, the binary relation that associates each atom in :code:`univ` to itself.

   It is important to know that when there are mutable top-level
   signatures (declared with :code:`var`), both :code:`univ` and
   :code:`iden` are mutable also, with the former containing, in each
   state, all atoms contained in all signatures at that state.
     
   It is also important to remember that :code:`iden` relates every element of :code:`univ` to itself. This means that, for example, to check that a binary relation :code:`R` (defined on a signature :code:`A`) is reflexive it is not correct to write :code:`iden in R`, as :code:`iden` will not only contain elements in :code:`A` but also elements in other declared top-level signatures. To obtain the identity relation on signature :code:`A` we could, for example, intersect it with the binary relation containing all pairs of elements drawn from :code:`A`, computed with the Cartesian product operator :code:`->`, as follows: :code:`iden & (A -> A)`. Alloy also has an operator to restrict the domain of a relation that can be used for the same effect: :code:`A <: iden` filters relation :code:`iden` to obtain only the pairs whose first element belongs to :code:`A`. Dually, we could have restricted the range: :code:`iden :> A` filters relation :code:`iden` to obtain only the pairs whose last element belongs to :code:`A`. Thus, to check that the relation :code:`R` is reflexive we could write, for example, :code:`(A <: iden) in R`.

   Another thing to recall is that :code:`none` denotes the empty set and is a unary relation. So, if we write :code:`R = none`, to check if :code:`R` is empty, we will get a type error since the relations have different arities (:code:`none` is unary and :code:`R` is binary). The empty binary relation is denoted by the Cartesian product :code:`none -> none`: this expression denotes the empty set like :code:`none`, but for the type system it is a binary relation, so writing :code:`R = (none -> none)` does not raise a type error. Of course, a simpler way to check that :code:`R` is empty is to write :code:`no R`: the cardinality check keywords like :code:`no` can be used with relational expressions of arbitrary arity.

   
Having our identifiers totally ordered, we can now declare a field :code:`id` inside signature :code:`Node`, to associate each node with its identifier.

.. code-block::

   sig Node {
     succ : one Node,
     id : one Id
   }


To ensure that identifiers are unique it suffices to require :code:`id` to be injective, which can be done as follows.

.. code-block::

   fact {
     // ids are unique
     all i : Id | lone id.i
   }
   
If we change the :code:`example` command to ask for all configurations with exactly 3 nodes and 3 identifiers


.. code-block::

   run example {} for exactly 3 Node, exactly 3 Id

.. index::
   symmetry breaking
   
and iterate over all instances with :guilabel:`New`, we will get the two only (truly) different configurations with this size: one where the identifiers increase as we move clockwise in the ring, and another where they decrease. The former is depicted below. This illustrates the power of Alloy's symmetry breaking, which in this case prevents the return of any other instance (that necessarily would be isomorphic to one of these two).

.. image:: instance3.png
   :width: 400
   :align: center


Using modules to structure a specification
==========================================


.. index::
   module 

Requiring a signature to be totally ordered is quite common.
To simplify the specification of such common patterns, and also to enable breaking a large specification into more manageable pieces, Alloy supports the concept of *module*, which behaves similarly to modules in other specification and programming languages. It is possible to create a module with some definitions and then import it in another module. For example we could define a module for totally ordered identifiers as follows.

.. code-block::

   module orderedids

   sig Id {
     next : lone Id
   }
   one sig first, last in Id {}

   fact {
     // Id is totally ordered with next
     no last.next
     Id in first.*next
   }

The first line in a module consists of the keyword :code:`module`
followed by the module name. Module names should match the path of the
respective file, so we should save it the with name :code:`orderedids.als` and be in the same directory as the module that will import it. Having factored out these declarations into module :code:`orderedids`,
we can now delete them from our main module, and instead import this new module by adding the following :code:`open` statement at the beginning of the specification.

.. code-block::

   open orderedids

.. index::
   module; parameterised
   
Module :code:`orderedids` is not ideal, since it commits to a specific signature name to be totally ordered. To make it more reusable, we could instead parametrise it by the signature one wishes to be totally ordered. This poses a problem, because our formalisation of a total order requires declaring field :code:`next` inside the signature to be totally ordered, and modules cannot change the definition of the parameter signature. There are some tricks to solve this problem. One possibility is to declare an auxiliary inclusion signature inside the parameter signature, force it to be exactly the same as the parent signature using a :code:`fact`, and declare relation :code:`next` inside that auxiliary signature instead. The definition of our parametrised module (now called just :code:`ordered`) would look as follows.

.. code-block::

   module ordered[A]

   sig Aux in A {
     next : lone Aux
   }
   fact {
     Aux = A
   }
   one sig first, last in A {}

   fact {
     // A is totally ordered with next
     no last.next
     A in first.*next
   }

To use this parametrised module, we should declare back signature :code:`Id` in our main specification, and pass it as a parameter in the :code:`open` statement. The main specification should now look as follows.

.. code-block::

   open ordered[Id]

   sig Node {
     succ : one Node,
     id : one Id
   }
   
   sig Id {}

   fact {
     // at least one node
     some Node
   }

   fact {
     // succ forms a ring
     all n : Node | Node in n.^succ
   }

   fact {
     // ids are unique
     all i : Id | lone id.i
   }

   run example {} for exactly 3 Node, exactly 3 Id

.. index::
   module; predefined
   module; ordering

Alloy has some predefined modules in a special :code:`util` namespace. One of them is :code:`util/ordering`, a parametrised module that behaves similarly to our module :code:`ordered`, imposing a total order on a signature passed as parameter. This means that instead of defining our :code:`ordered` module we could simply import this module with the statement

.. code-block::

   open util/ordering[Id]

This predefined module is specified in a slightly different way but, as :code:`ordered`, it exposes singleton sets :code:`first` and :code:`last` and a binary relation :code:`next` that defines the total order. However, if we run the :code:`example` command we get the following result.

.. image:: instance4.png
   :width: 400
   :align: center

Due to the way they are defined in :code:`util/ordering`, relations :code:`first`, :code:`last`, and :code:`next` are not shown in the visualizer. In principle, this would make it very difficult to understand how the identifiers are ordered, but module :code:`util/ordering` is handled in special way be the Alloy Analyzer, which will attempt to name the inhabitants of the ordered signature sequentially according to the order defined by :code:`next`. In our instance this means that :code:`Id0` is the :code:`first`, which is followed by :code:`Id1`, and then :code:`Id2`, which is the :code:`last`. If we really want to confirm the value of the :code:`next` relation we could go to the evaluator and ask for its value. However, we will get an ambiguity error since :code:`next` is also defined in an integer module that is imported by default. To disambiguate we could type the qualified name :code:`ordering/next` instead.


.. image:: evaluator1.png
   :width: 500
   :align: center

To see what other functions and derived relations are declared in :code:`util/ordering` you can open it in the menu :menuselection:`File --> Open Sample Models...`. For example, we have functions :code:`gt` and :code:`lt` to check if one element is greater than or less than another, respectively, and relation :code:`prev` that returns the predecessor of an element, if it exists. 


.. warning:: Be aware of exact scopes with :code:`util/ordering`!

   Besides naming the atoms according to the total order, module :code:`util/ordering` has another peculiarity concerning the analysis. The scope of the signature that is passed as parameter to :code:`util/ordering` is always exact even if not declared with :code:`exactly`. This makes analysis faster but one should always be aware of this when running commands, as it may cause :code:`run` commands to be trivially unsatisfiable and :code:`check` commands to trivially have no counter-examples.

   Consider the following specification, where we have model a finite set of natural numbers, distinguish between even and odd numbers, and require the first to be even and the last to be odd.

   .. code-block::

      open util/ordering[Nat]

      abstract sig Nat {}
      sig Even, Odd extends Nat {}

      fact {
        all n : Even | n.next in Odd
	all n : Odd | n.next in Even
	first in Even
	last in Odd
      }

      run example {}

   If we execute the command :code:`example` we get no instance! This is due to the fact that the default scope for :code:`Nat` is 3, and since it is being totally ordered with :code:`util/ordering` this scope is required to be exact. Unfortunately, with exactly 3 atoms in :code:`Nat` it is impossible to satisfy the given constraints. If we change the scope of :code:`example` to 4 then we will obtain one instance.

   .. image:: instance5.png
      :width: 400
      :align: center

   This is an example where the naming of the atoms does not help in understanding what is the total order, since the ordered signature :code:`Nat` is abstract and extended by :code:`Even` and :code:`Odd`. By using the evaluator we can see that the total order is :code:`Even0`, :code:`Odd1`, :code:`Even1`, and lastly :code:`Odd0`.

   .. image:: evaluator2.png
      :width: 500
      :align: center

   As a last note on scopes, for efficiency reasons the :code:`util/ordering` module also requires the parameter signature to be non-empty (since it forces :code:`first` and :code:`last` to always be defined), meaning that setting its scope to 0 results in an inconsistent specification.


.. _specifying-protocol-dynamics: 

Specifying the protocol dynamics
================================

.. index::
   fields; field declaration
   fields; mutable fields
   var keyword; mutable fields

The coordination between the different nodes in this distributed protocol, like in many others, is achieved by means of message passing. Like nodes, Alloy has no special support for message passing, so all the relevant concepts have to be specified from scratch. In this case messages are just node identifiers, so there is no need to add a signature to model those. To capture the incoming and outgoing messages we can declare mutable binary fields inside signature :code:`Node`, to associate each node with the set of identifiers it has received or that need to be sent, respectively. To declare a mutable field the keyword :code:`var` should be used.

.. code-block::

   sig Node {
     succ : one Node,
     id : one Id,
     var inbox : set Id,
     var outbox : set Id
   }

We also need to capture the nodes that have been elected as leaders. To do so we can declare a mutable signature :code:`Elected` that is a subset of :code:`Node`.

.. code-block::

   var sig Elected in Node {}

   
If we now run the :code:`example` command we might get the following instance.
   
   .. image:: instance6.png
      :width: 700
      :align: center

.. index::
   Alloy Analyzer; visualizer
   trace

Once a specification has mutable elements, its instances take the shape of traces. The first thing to be noticed is that the interface of the visualizer changed to adapt to this fact: it now shows two instance snapshots side-by-side, there are several new instance exploration buttons in the toolbar, and a depiction of a path just below those. This new interface is due to the addition of mutable relations in our specification. As expected, the value of a mutable relation may change over time, so an instance of our specification can no longer be captured by a single static valuation of the declared signatures and fields. Instead, instances are now infinite sequences of states (aka paths or traces), being each state a (possibly different) valuation for the declared signatures and fields. As such, the visualizer now shows a (finite) depiction of the (infinite) trace, numbering the different states, and including a loop back that marks a segment of the trace that will repeat indefinitely. In this case we have a trace where the same state 0 repeats itself indefinitely, something that is possible since so far we added no restrictions that constrain the behaviour of mutable relations. Below the depiction of the trace, two of its consecutive states are detailed side-by-side. The concrete states that are being depicted are shown in white in the trace depiction above. In this case the two states being depicted are the first two states, which are equal. A thing to note in the depiction of the states is that, by default, immutable sets and fields (the configuration of the protocol) are shown with solid lines, while mutable ones are shown with dashed lines. The instance exploration buttons in the toolbar of the visualizer will be explained as needed.

To simplify the visualisation, we can configure the theme so that node identifiers, the inbox, and the outbox are shown as node attributes, and change the color of elected nodes to green (see :ref:`theme-customisation` for an explanation of how the theme can be configured to do so). Since the elected nodes can now be easily distinguished by colour we can also hide the respective textual label by turning off the setting :guilabel:`Show as labels` in the respective signature. The result of this customisation will be as follows.

   .. image:: instance7.png
      :width: 600
      :align: center

.. index::
   initial state

Something that is clearly wrong in our instance is that in the initial
state nodes already have identifiers in their inboxes and outboxes, and some of them are already elected. To restrict the initial value of these mutable fields we can add the following fact.

.. code-block::
   
   fact {
     // initially inbox and outbox are empty
     no inbox and no outbox
     // initially there are no elected nodes
     no Elected
   }

Since this constraint has no temporal operators (more on those later)
it applies to the initial state of the system. Rerunning the
:code:`example` commands yields the following instance, where the
systems remains indefinitely in the initial state.
              
   .. image:: instance8.png
      :width: 600
      :align: center

We can explore additional instances by using the instance exploration buttons in the toolbar. For example, the :guilabel:`New Fork` button asks the Analyzer for a different trace with the same behaviour up to the left-hand side state, but a different outcome of the current transition (a different right-hand side state). By pressing this button we get the following.

   .. image:: instance9.png
      :width: 600
      :align: center

This trace displays a behaviour that should not be allowed in a correct execution of the protocol: in the second state all of a sudden several identifiers appeared in the inbox and outbox of :code:`Node1`. Moreover this node was also randomly designated as leader. So far this is allowed because we added no constraints to our specification that restrict how the system can transition from one state to another. If there are no constraints then any transition is possible.

.. index::
   event
   event; guard
   event; effect

The simplest way to constraint the valid transitions is to consider all the possible events that can occur in the system, and for each event specify when it can occur (its *guard*) and what is its *effect* on the mutable relations. In our protocol we can distinguish three possible events:

* a node *initiates* the execution by sending its own identifier to the next node.
* a node reads and *processes* an identifier in its inbox, decides whether it should be propagated or discarded and, if it is its own identifier, elects itself as leader.
* the network *sends* one message in the outbox of a node to the next one.

Again, Alloy has no special keywords to declare events. A common
approach to do so is to declare a predicate for each event, that
specifies its guard and effect, and add a
constraint enforcing that at each possible state only one of those
predicates can hold. Event predicates can be also parametrised, in
particular by the node where the event occurs.

.. index::
   event; frame condition

Let's begin by specifying the initiate event. For each event we should
always start by specifying what is its guard, a constraint on the
current (or pre-) state that determines when can the event occur. To simplify our
specification, for the moment, we will assume that a node is free to initiate (or
reinitiate) the protocol whenever it wants, so in this case there will
be no guard, meaning the event is always possible. Then we should
specify the effect of the event on all mutable relations of the specification:
note that if nothing is said about a particular mutable relation then
its value can change freely when the event occurs. A special effect is
forcing the value of a mutable relation to not change at all,
something known as a *frame condition*. For example in this event we
will have two frame conditions, since it does not change the value of
:code:`inbox` nor :code:`Elected`. To specify that a relation does not
change with a logic formula we need to refer to its value in the next
(or post-) state. In Alloy the value of a relation (or relational
expression) in the next state is accessed by appending a prime. So, to
state that, for example, the relation :code:`inbox` does not change we
could add the constraint :code:`inbox' = inbox`. Likewise for
:code:`Elected`. So far, the specification of event initiate looks as
follows.

.. code-block::

   pred initiate[n : Node] {
     // node n initiates the protocol

     inbox'   = inbox                           // frame condition on inbox
     Elected' = Elected                         // frame condition on Elected
   }

We now need to specify the effect on relation :code:`outbox`. We can
start by specifying its effect on the outbox of the initiating
node. Expression :code:`n.outbox` denotes the set of identifiers in
the :code:`outbox` of node :code:`n` in the current state. The effect
of this event on this set is to add :code:`n`'s own identifier, which
can be specified by :code:`n.outbox' = n.outbox + n.id`. Here we used
the set union operator :code:`+` to add the singleton set :code:`n.id`
to :code:`n.outbox`. Note that requiring :code:`n.id` to be present in
the outbox in the next state, which could be specified by :code:`n.id
in n.outbox'`, is not sufficient, as it would allow identifiers other than :code:`n.id`
to be freely removed or added from :code:`n.outbox`. Having specified
the effect on the :code:`outbox` of `n` we now need to specify the
effect on the :code:`outbox` of all other nodes. Again, if nothing is
said about those, they will be allowed to vary freely. Of course this
event should not modify the outboxes of other nodes, something we
could specify with constraint :code:`all m : Node - n | m.outbox' =
m.outbox`, that quantifies over all nodes except the initiating one. The final specification of our event looks as follows.

.. code-block::

   pred initiate[n : Node] {
     // node n initiates the protocol

     n.outbox' = n.outbox + n.id                // effect on n.outbox
     all m : Node - n | m.outbox' = m.outbox    // effect on the outboxes of other nodes
     
     inbox'   = inbox                           // frame condition on inbox
     Elected' = Elected                         // frame condition on Elected
   }

Other, shorter or clearer ways to specify effects and frame conditions are presented in Section :ref:`specifying frame conditions`. In the remainder of this
chapter, we will continue to use the more verbose style, but you are
highly encouraged to use a terser style once it is familiar.

.. index::
   temporal operator; always
   
To validate our first event we can impose a restriction stating that,
for the moment, this is the only event that can occur, and inspect the
possible instances to see if the system behaves as expected. To
specify such a restriction we need the temporal logic operator
:code:`always` to enforce a formula to be true at all
possible states. In particular, to enforce that, at every possible
state, one of the nodes initiates the protocol, the following
fact can be added to the specification.

.. code-block::

   fact {
     // possible events
     always (some n : Node | initiate[n])
   }

Running command :code:`example` yields the following trace, where
:code:`Node2` initiates the protocol and then nothing else happens.

.. image:: instance10.png
   :width: 600
   :align: center

.. index::
   stuttering
           
On first glance, this may seem like an invalid trace, given that we
required that at every state one node initiates the protocol. However,
if we look closely at our specification of event :code:`initiate` we
can see that it also holds true in a state where a node already has
its own identifier in the outbox, which
will remain in the outbox. Our specification did not had a guard forbidding
this occurrence of the event, and the effect :code:`n.outbox' =
n.outbox + n.id` is also true if :code:`n.id` is already in
:code:`n.outbox` and in :code:`n.outbox'`.

To address this issue we could add a guard like :code:`n.id not in
n.outbox` to the specification of :code:`initiate`.  However, this
only forbids a node from initiating the protocol if it's own
identifier is at the moment in its outbox. This means that, once we
include an event to send messages and this identifier leaves the
outbox, a node will be able to reinitiate the protocol. This does not
pose any problem concerning the correctness of this particular
protocol, but if we really wanted each node to initiate the protocol
only once we would need to strengthen this guard.

One possibility would be to add a field to record at which phase of
the protocol execution each node is in, and allow this event only when
a node is in an "uninitiated" phase. This would be the common approach
in most formal specification languages. However, with Alloy we have a
more direct alternative. Since there is no special language to specify
events and they are specified by arbitrary logic formulas, we are free
to use temporal operators to specify guards or effects. By contrast,
in most formal specification languages we can only specify the
relation between the pre- and post-state.

.. index::
   temporal operator; historically   

In this case we could, for example, use the temporal operator
:code:`historically`, that checks if something was always true in the
past up to the current state, to only allow :code:`initiate` to occur
if the node's identifier was never in its outbox. The specification of
our event would look as follows.
          
.. code-block::

   pred initiate[n : Node] {
     // node n initiates the protocol

     historically n.id not in n.outbox          // guard

     n.outbox' = n.outbox + n.id                // effect on n.outbox
     all m : Node - n | m.outbox' = m.outbox    // effect on the outboxes of other nodes
        
     inbox'   = inbox                           // frame condition on inbox
     Elected' = Elected                         // frame condition on Elected
   }


Unfortunately, after adding this guard our command will
return no trace, meaning there are no behaviours that satisfy our
constraints. This happens because we are currently not
allowing messages to be sent or read, so after all nodes have
initiated the protocol nothing else is allowed to happen by our
specification, and it is impossible to obtain a valid trace. Recall
that traces are infinite sequences of states, where the system is
always evolving. A simple way to solve this issue is to add an event
allowing the system to *stutter*, that is to do nothing (keeping the
value of all relations unchanged). Stuttering can also be understood as
an event specifying something else that is occurring on the environment
outside the scope of our specification, and in general it is a good
idea to add such stuttering events. In this case, a
predicate to specify a stuttering event can be specified as follows.

.. code-block::
   
   pred stutter {
     outbox'  = outbox
     inbox'   = inbox
     Elected' = Elected
   }

To allow this event we need to change the above fact that restricts
the possible behaviours.

.. code-block::
   
   fact {
     // possible events
     always (stutter or some n : Node | initiate[n])
   }

.. index::
  Alloy Analyzer; enumeration

           
Running command :code:`example` returns a trace where nothing happens
in our system, something that is now allowed. If we press
:guilabel:`New Fork` we might get a trace such as the previously
depicted, where :code:`Node2` initiates the protocol and then nothing
else happens. If we move to the next state by pressing :guilabel:`→`,
and fork again with :guilabel:`New Fork`, we might get the following
trace where :code:`Node0` initiates the protocol after :code:`Node2`.

.. image:: instance11.png
   :width: 600
   :align: center

By exploring a bit more the possible traces we can get some confidence
that event :code:`initiate` is well specified.


Let us now specify the :code:`send` event. In this event we could have as parameters the node
:code:`n` from where the message will be sent and the message
:code:`i` that will be sent (an identifier). The guard of this event
should require :code:`i` to be in the :code:`outbox` of :code:`n`. The
effect will be to remove :code:`i` from the :code:`outbox` of
:code:`n`, and add it to the :code:`inbox` of the next node :code:`n.succ`. All
other inboxes and outboxes should keep their value, as well as the
:code:`Elected` set. The specification of this event would be the following.
      
.. code-block::

   pred send[n : Node, i : Id] {
     // i is sent from node n to its successor

     i in n.outbox                              // guard

     n.outbox' = n.outbox - i                   // effect on n.outbox
     all m : Node - n | m.outbox' = m.outbox    // effect on the outboxes of other nodes
     
     n.succ.inbox' = n.succ.inbox + i           // effect on n.succ.inbox
     all m : Node - n.succ | m.inbox' = m.inbox // effect on the inboxes of other nodes

     Elected' = Elected                         // frame condition on Elected
   }


.. index::
   logical operator; else
   
For event :code:`process` again we could have as parameters the node
:code:`n` and the message :code:`i` to be
read and processed. The guard should require that :code:`i` is in the
:code:`inbox` of :code:`n`. The effect on :code:`inbox` is obvious:
:code:`i` should be removed from the :code:`inbox` of :code:`n` and
all other inboxes should keep their value. The effect on the
:code:`outbox` of :code:`n` depends on the whether :code:`i` is
greater than :code:`n.id`: if so, :code:`i` should be added to the
:code:`outbox`, to be later propagated along the ring; if not, it
should not be propagated, meaning the :code:`outbox` of :code:`n` will
keep its value.  To write such conditional outcome we could use the
logical operator :code:`implies` together with an :code:`else`:
:code:`C implies A else B` is the same as :code:`(C and A) or (not C
and B)`, but the former is easier to understand.  The event has no
effect on the outboxes of other nodes, which should all keep their
value. The same conditional outcome applies to :code:`Elected`: if the
received :code:`i` is equal to the :code:`n`'s own identifier, then
:code:`n` should be included in :code:`Elected`; else :code:`Elected`
keeps its value. The full specification of :code:`process` is as
follows.

.. code-block::

   pred process[n : Node, i : Id] {
     // i is read and processed by node n

     i in n.inbox                                // guard

     n.inbox' = n.inbox - i                      // effect on n.inbox
     all m : Node - n | m.inbox' = m.inbox       // effect on the inboxes of other nodes

     gt[i,n.id] implies n.outbox' = n.outbox + i // effect on n.outbox
                else    n.outbox' = n.outbox
     all m : Node - n | m.outbox' = m.outbox     // effect on the outboxes of other nodes

     i = n.id implies Elected' = Elected + n     // effect on Elected
  	      else Elected' = Elected 
     }

We should now add these two events to the fact that constraints the
valid behaviours of our system.

.. code-block::

   fact {
     // possible events
     always (
       stutter or 
       some n : Node | initiate[n] or 
       some n : Node, i : Id | send[n,i] or
       some n : Node, i : Id | read[n,i]
     )
   }

.. index::
   instance; exploration

Before proceeding to the verification of the expected properties of
the protocol, we should play around with the different instance
exploration buttons to explore different execution scenarios and
validate the specification of our events. An alternative would be to
ask for specific scenarios directly in a :code:`run` command. For
example, we could change command :code:`example` to ask directly for a
scenario where some node will be elected. To do so we need to use the
temporal operator :code:`eventually`, which checks if a formula is
valid at some point in the future (including the present state). Our
command would look as follows.

.. code-block::

   run example {
     eventually some Elected
   } for exactly 3 Node, exactly 3 Id

Running this command will return a trace with 8 states where the
highest identifier is passed around until the respective node gets
elected. This is the shortest trace where a leader can be elected in a
ring with 3 nodes, corresponding to one initiate event followed by 3
interleaved send and read events. The Analyzer guarantees that the
shortest traces that satisfy (or refute) a property are returned
first. The first transition of this trace is the following, where the
node with the highest identifier initiates the protocol.

.. image:: instance12.png
   :width: 600
   :align: center

In the last transition, that can be focused by pressing :guilabel:`→`
6 times or by directly clicking the state numbered 6 in the trace
depiction, the node with the highest identifier reads its own
identifier that was passed around in the ring and elects himself as
leader.

.. image:: instance13.png
   :width: 600
   :align: center

   
Verifying the expected properties of the protocol
=================================================

The key property of our protocol is that eventually exactly one node
will become leader. We can break this property into to simpler ones:

- There will never be more than one leader
- Eventually there will be at least one leader

.. index::
   safety
   liveness
   fairness
  
These properties are of very different nature: the former is a
*safety* property, forbidding some (undesired) behaviour of the system,
while the latter is a *liveness* property, forcing some (desirable)
behaviour of the system. The analysis of safety properties is usually
simpler than the analysis of liveness properties. To find a
counter-example for a safety property it suffices to search for a
finite sequence of states that leads to a (bad) state (in this case
one where there are two or more leaders), and it is irrelevant what
happens afterwards, as any continuation of this finite sequence will
still be a counter-example. On the other hand, to find a
counter-example for a liveness property it is necessary to search for
a complete infinite trace where the expected behaviour definitely never
happened (in this case a leader is never elected). Moreover, it will
be necessary to impose additional *fairness* conditions when verifying
liveness properties, in particular to forbid bogus counter-examples
where at some point the system stutters forever and thus the expected
behaviour never happens.

.. index::
   invariant
   assert
   command; check

Let us start by verifying the first (safety) property. Recall that
properties that are expected to hold should be written in named
assertions (declared with keyword :code:`assert`) and then verified
with :code:`check` commands. This safety property is a very simple
example of an *invariant*, a property that requires something to be
be true in all states of all possible traces. In Alloy, invariants can
be specified using the temporal operator :code:`always`. In each state, to check that
there is at most one leader in set :code:`Elected` we could use
the keyword :code:`lone`. This invariant could thus
be specified as follows.

.. code-block::

   assert AtMostOneLeader { 
     always (lone Elected)
   }

  
To check this assertion we could define the following command.

.. code-block::

   check AtMostOneLeader

.. index::
   model checking; bounded model checking
   scopes; steps
   
This protocol is known to be correct, so, as expected, running this
command returns no counter-example. Recall that the default scope on
top-level signatures is 3, so this command verifies in a single
execution that the property holds for all rings with up to three
nodes. Moreover, by default, verification of temporal properties is
done with a technique known as *bounded model checking*, meaning that
the search for counter-examples will only consider a given maximum
number of different transitions before the system starts
exhibiting a repeated behaviour. By default this maximum number
is 10. To increase our confidence in the result of the analysis we
could, for example, check this property for all rings with up to 4
nodes and consider up to 20 different transitions (or steps). To do so
we could change the scope of the command as follows.

.. code-block::

   check AtMostOneLeader 4 but 20 steps

This commands sets the default scope on signatures to 4 but also
changes the scope on transitions to 20 using keyword
:code:`steps`. Again, as expected, no counter-example is returned, but
the analysis now takes considerably longer, as there are many more
network configurations and states to explore.

.. index::
   model checking; complete model checking

It is also possible to check a temporal property with *unbounded model
checking*, taking into consideration an arbitrary number of
transitions, but still with the signatures bounded by a maximum
scope. To do so, a special scope on :code:`steps` is needed
to trigger unbounded temporal analysis.

.. code-block::

   check AtMostOneLeader 4 but 1.. steps

.. index::
   Alloy Analyzer; options
   solver

However, for such analysis to be performed it is necessary to select a solver in the
:menuselection:`Options --> Solver` menu that supports it. By default, the Alloy Analyzer is not pre-packaged with
solvers that support unbounded model checking. You first need to
install either the `NuSMV <https://nusmv.fbk.eu>`_ or the `nuXmv
<https://nuxmv.fbk.eu>`_ model checker, and then select
:code:`Electrod/NuSMV` or :code:`Electrod/nuXmv` as solver,
respectively. The Analyzer shows that the safety property holds even
with an arbitrary number of transitions.

The second property is also a very simple example of a liveness
requirement, namely one that requires something to hold at least in
one state of every trace. Such properties can be specified directly
with the temporal operator :code:`eventually`. Inside this temporal
operator, again a very simple cardinality check on :code:`Elected`
suffices, this time with keyword :code:`some`.

.. code-block::

   assert AtLeastOneLeader {
     eventually (some Elected)
   }

.. index::
   command; check

The following command can be used to check this property with bounded model
checking and the default scopes.

.. code-block::

   check AtLeastOneLeader

Both :code:`Electrod/NuSMV` and :code:`Electrod/nuXmv` can be used to check this command, since they also support
bounded model checking. Alternatively, we can also choose one of the SAT solvers
pre-packaged with the Analyzer in the :menuselection:`Options -->
Solver` menu, for example the `SAT4J <https://www.sat4j.org>`_
solver. Feel free to experiment with different solvers when analysing
a property, as the performance can sometimes vary significantly just
by choosing a different solver. We recommend that you start by
verifying specifications with bounded model checking, because it is
usually much faster than the unbounded counterpart, and most
counter-examples tend to require only a handful of transitions. Once
you have a design where no counter-example is returned with bounded
model checking, you can increase your confidence on the analysis by using
unbounded model checking. As hinted above, running this command immediately returns a
counter-example where the system stutters forever, and obviously no
leader is elected.


.. image:: instance14.png
   :width: 600
   :align: center

.. index::
   event; enabled
           
This behaviour is allowed because we added a stutter event that
captures events external to our system. At the moment this event can
occur indefinitely, which is a bit unrealistic or *unfair* to the
system under analysis: for example, if at some point the system is
continuously ready to execute an event (the event is continuously
*enabled*), then it should eventually occur. For example, in our
protocol, a trace where at some point a message is in the inbox of a
node but is never read is unfair to that node.

To specify such fairness properties we first need to understand two
well-known combinations of the :code:`always` and :code:`eventually`
temporal operators:

- :code:`eventually always p` holds in a system if all traces reach a state where :code:`p` becomes valid indefinitely.
- :code:`always eventually p` holds in a system if in every state of every trace :code:`p` is later valid, which means that :code:`p` is true infinitely often.


To specify that an event is at some point continuously enabled we can
use the first combination. But first we need to specify when is a
particular event enabled (ready to occur). Most of the times this
condition is exactly the same as the guard of the event, but in some
events there can be additional conditions that derive from its
effect. In our protocol, the enabled condition for all events is the
same as the guard. For example, given a node :code:`n` and identifier
:code:`i`, event :code:`process[n,i]` is enabled when :code:`i in
n.inbox`.  So, to specify that event
:code:`process[n,i]` becomes continuously enabled at some point we
could write :code:`eventually always i in n.inbox`.  However, if, for example, the outbox of a node was limited
to contain at most one identifier, then we would have additional
enabling conditions, as the effect :code:`n.outbox' = n.outbox + i`
that occurs when :code:`gt[i,n.id]` would also require that
:code:`n.outbox` is empty.

To specify that a continuously enabled event eventually occurs (after
the point it becomes enabled) the second combination of
operators presented above can be used. Note that a continuously enabled event
would still be (continuously) enabled after the required occurrence of
the event, which means that the event should occur again and again, or
infinitely often. So, the desired fairness condition for
event :code:`process[n,i]` can be specified as follows.

.. code-block::

   (eventually always i in n.inbox) implies (always eventually process[n,i])

.. index::
   fairness; weak
   fairness; strong
   
Technically, this fairness condition is known as *weak fairness*, as
the event is only required to happen if continuously enabled. Often
this suffices to verify most liveness properties. However, sometimes
we may need *strong fairness* constraints, requiring that the event
occurs when it becomes recurrently enabled (infinitely often, but
not necessarily continuously). For event :code:`process[n,i]` such strong fairness
constraint would be specified as follows.

.. code-block::

   (always eventually i in n.inbox) implies (always eventually process[n,i])

In our example property, the liveness property holds if we require
weak fairness for all instances of three events of the system, as
specified in the following predicate.

.. code-block::

   pred fairness {
     all n : Node, i : Id {
       (eventually always historically n.id not in n.outbox) implies (always eventually initiate[n])
       (eventually always i in n.inbox)                      implies (always eventually process[n,i])
       (eventually always i in n.outbox)                     implies (always eventually send[n,i])
     }
   }

Finally, if we change the :code:`AtLeastOneLeader` assertion to assume
fairness, no counter-example will be returned even with increased
scopes, as expected.

.. code-block::

   assert AtLeastOneLeader {
     fairness implies eventually (some Elected)
   }
   
To better understand fairness constraints, suppose we commented
out the fairness restriction on event :code:`process`.

.. code-block::

   pred fairness {
     all n : Node, i : Id {
       (eventually always historically n.id not in n.outbox) implies (always eventually initiate[n])
   --  (eventually always i in n.inbox)                      implies (always eventually process[n,i])
       (eventually always i in n.outbox)                     implies (always eventually send[n,i])
     }
   }

The :code:`check AtLeastOneLeader` command would immediately return a
counter-example in a configuration with a single node, that initiates
the protocol, sends its own identifier to himself, and then never
processes that identifier (something that would be allowed at this point, since neither :code:`initiate` nor :code:`read` are enabled).
The last two states before the trace starts stuttering are shown below.

.. image:: instance15.png
   :width: 600
   :align: center

This trace is fair to :code:`initiate` and :code:`send`, but not to
:code:`read`, which is always enabled in the stuttering part of the
trace but never occurs.

           
.. warning:: Avoid fairness conditions in facts!

   Instead of a predicate we could have specified the fairness
   conditions inside a :code:`fact`, to force them to be true in all
   analyses. However, in general that is not a good idea, since
   :code:`run` commands would then only return fair traces, which
   hinders the interactive exploration of scenarios and requires
   much larger bounds on the number of steps (making the analysis
   slower). For example, command

   .. code-block::

      run {} for exactly 3 Node, exactly 3 Id

   that asks for any trace in a ring with 3 nodes will not even return
   a trace with the default scope of 10 :code:`steps`, because no fair
   trace with this size exists for this configuration. The minimum
   scope would be 14 :code:`steps`!

   This also means that a command like :code:`check AtMostOneLeader`,
   to verify the safety property of the protocol, would be mostly
   pointless, as with the default scope of 10 :code:`steps` only
   configurations with up to 2 nodes would be checked for
   correctness. Without fairness imposed in a fact, it would at least
   check that the safety property is not violated in the first 10
   steps of any trace of the protocol, including unfair and fair
   ones. Recall that in a counter-example of a safety property it is
   irrelevant what happens after we reach a faulty state: any
   continuation of the "bad" trace prefix would still be a
   counter-example, including any fair continuation.
   
Making the specification more abstract
======================================

.. caution:: This section is still under construction

.. code-block::

   open util/ordering[Node]

   sig Node {
     succ : one Node,
     var inbox : set Node
   }

   fact {
     // at least one node
     some Node
   }

   fact {
     // succ forms a ring
     all n : Node | Node in n.^succ
   }

   fact {
     // initially inbox is empty
     no inbox
   }

   fun Elected : set Node {
     { n : Node | n not in n.inbox and once (n in n.inbox) }
   }

   pred initiate[n : Node] {
     // node n initiates the protocol

     historically n not in n.succ.inbox          // guard

     n.succ.inbox' = n.succ.inbox + n         	 // effect on n.succ.inbox
     all m : Node - n.succ | m.inbox' = m.inbox  // effect on the inboxes of other nodes
   }

   pred process[n : Node, i : Node] {
     // i is read and processed by node n

     i in n.inbox                                     // guard

     n.inbox' = n.inbox - i                           // effect on n.inbox
     gt[i,n] implies n.succ.inbox' = n.succ.inbox + i // effect on n.succ.inbox
             else    n.succ.inbox' = n.succ.inbox
     all m : Node - n.succ - n | m.inbox' = m.inbox   // effect on the inboxes of other nodes
   }

   pred stutter {
     inbox' = inbox
   }

   fact {
     // possible events
     always (
       stutter or 
       some n : Node | initiate[n] or
       some n : Node, i : Node | process[n,i]
     )
   }

   run example {
     eventually some Elected
   } for 3 Node

   assert AtMostOneLeader { 
     always (lone Elected)
   }

   check AtMostOneLeader for 5 but 1.. steps

   pred fairness {
     all n : Node, i : Node {
       (eventually always historically n not in n.succ.inbox) implies (always eventually initiate[n])
       (eventually always i in n.inbox)  	              implies (always eventually process[n,i])
     }
   }

   assert AtLeastOneLeader {
     fairness implies eventually (some Elected)
   }

   check AtLeastOneLeader for 3 but 1.. steps

Exercises
=========


#. To practice temporal logic, access the following Alloy4Fun model and specify all the listed constraints. Each constraint is specified in a separate predicate, and for each there is a command that can be used to check its correctness. If the constraint is not correctly specified a counter-example will be shown. The counter-example will include an atom that clarifies if it is an example that the incorrect constraint is currently rejecting but should be accepted or one that is currently being accepted but should be rejected. Unlike in the Alloy Analyzer, in Alloy4Fun only one state of the trace is shown at a time. Use the green arrows to navigate back and forward in the trace. In the last state of the trace before looping back the forward arrow is circle shaped.

    a. `Messaging system <http://alloy4fun.inesctec.pt/o9LrywZ49JXGRwc3q>`_
    b. `Table reservations <http://alloy4fun.inesctec.pt/sKLTRjXh6Ad9o3XQn>`_

#. Specify `Dijkstra's self-stabilizing mutex algorithm <https://doi.org/10.1145/3544585.3544606>`_, that works on arbitrary rings of :code:`N+1` machines, with a distinguished *bottom* machine. In particular, specify the solution with :code:`K` state machines, both with and without a central daemon. The central daemon ensures that only one machine with the *privilege* executes at each time, otherwise an arbitrary number of machines can execute. The rules for determining if a machine has the privilege to execute are as follows: the bottom machine has the privilege if its state is equal to that of its neighbor; a non-bottom machine has the privilege if its state is different from that of its neighbor. When the bottom machine executes it increments its state modulo :code:`K`. When a non-bottom machine executes it just copies the state of its neighbor. The goal is to verify the expected *convergence* and *closure* properties. Convergence is a liveness property that ensures that whatever the initial state, the system will converge to a *legitimate* global state. In this case, a legitimate global state is one where exactly one machine has the privilege to execute and where every machine will eventually obtain that privilege. Closure is a safety property that ensures that after reaching a legitimate global state, the system will forever stay in a legitimate global state. According to the paper, without a central daemon both convergence and closure hold when :code:`K > N`, while with a central daemon it suffices that :code:`K >= N`. Use the Alloy Analyzer to verify if that is indeed the case. Notice that instead of using integers to model the local states of the machine, a ring of :code:`K` values can be used.

#. Specify `Chang's echo distributed protocol for forming a spanning tree <https://doi.org/10.1109/TSE.1982.235573>`_ in an arbitrary connected network with a distinguished *initiator*. In particular, specify *Algorithm 0* described in the paper, that works roughly as follows. The initiator starts by sending an *explorer* message to all its neighbors. Upon receiving a message, each node acts as follows: 

   - If the message is an explorer and it is the first to arrive at the node, mark the sender as the *first* neighbor and send an explorer to all neighbors except the first. Note that it is assumed that the initiator is considered to have been explored *a priori*.
   - If the message is an explorer but is not the first to arrive or there are no neighbors except the first, send an *echo* message back to the sender.
   - If the message is an echo, register that an echo as been received from the sender, and if echos from all neighbors (to which explorers were sent) have arrived send an echo to the first neighbor, unless the node is the initiator, in which case the protocol is finished.

   Verify the expected properties of the protocol, namely that it will eventually finish and that when it finishes the *first* relationship defines a spanning tree of the network rooted at the initiator.