Abstract Interpretation Notes
1. Transition systems
A transition system $(S, T, I)$ is defined as follows:
 a set of $S$;
 a transition relation $T: S \rightarrow S$;
 a set of initial states $I \subseteq S$.
A program is modelled as a transition system. Each state of the program is an evaluation of all variables in the program. Each state may has many possible successors, i.e., the transition relation is nondeterministic.
Example1 : $S = \{a,b,c\}$, $I = \{a\}$ and $T = \{ \langle a,b\rangle, \langle b,b\rangle, \langle b,c\rangle\}$
2. Semantics
2.1 maximal trace semantics
The program can be modelled as a transition system. Thus, en execution of the program is a sequence of states $\langle s_0,s_1,…,s_n\rangle $. The execution can go on forever or ends at some state $s_k$ ( $s_k$ has no successors) or an error state $\Omega$ . There be many executions due to the nondeterminism.
The set of execution traces spawned by a transition system with initial states $I$ and the transition relation $T: S \rightarrow S$ is denoted as $\Delta$ and it is called maximal trace semantics .
$\Delta = \{ \langle s_0...s_n \rangle  n \geq 0 \land s_0 \in I \land ( \forall i \in [0,n) \cdot \langle s_i,s_{i+1} \rangle \in T) \land$
$\ \ \ \ \ \ \ (\forall s' \in S \cdot \langle s_n, s'\rangle \notin T) \}. \cup \{ \langle s_0 ...\rangle  s_0 \in I \land \forall i > 0 \cdot \langle s_i, s_{i+1} \rangle \in t \}$
$\Delta$ is the set of maximal sequences of states starting from an initial state, either ending in some final state or infinite and satisfying the transition relation.
In practice, the maximal trace semantics of the program is not computable and not observable since it could be infinite.
To check some safety properties, ie., nothing bad happens, we can observe the finite parts of the program executions (finite or infinite).
However, this will prevents us from reasoning about program termination and liveness properties. (something good happens)
Example1 :
$ \Delta = { \langle ab…c \rangle, \langle ab …\rangle } $
2.2 collecting semantics
2.2.1 prefix trace semantics
We can define finite observation of program execution $P$.
$P = { \langle s_0 … s_n\rangle  n \geq 0 \land s_0 \in I \land \forall i \in [0,n)\cdot \langle s_i, s_{i+1} \in T\rangle } \subseteq S^* $.
($S^*$ is the set of finite sequences of states form $S$).
$P$ is called the finite prefix trace semantics can it is as an collecting semantics.
Example1: In our finite transition system example, the prefix trace semantics is $P = \{\langle a\rangle,\langle ab . . . b\rangle,\langle ab . . . c \rangle\}$. It contains the one element trace a, as well as traces starting in a followed by a finite sequence of b and ending in b, and traces starting in a followed by a finite (nonempty) sequence of b and ending in c. An example of safety property would be that c cannot occur unless b has occurred before during the execution. The property can be checked by considering only $P$, and it is not necessary to consider $\Delta$. However, observing $P$ only, it is impossible to deduce whether the program always terminates or not, as any finite trace in $P$ may actually be the beginning of a longer (possibly infinite) trace.
Now, by computing prefix trace semantics, we can ideally answer all safety questions. Unfortunately, this is not possible: although each trace is now finite and can, individually, be checked automatically, the prefix trace semantics may contain an infinite (or impracticably large) number of such finite traces. Hence, we will use further abstractions of this semantics to provide sound but incomplete answers. The choice of the program properties of interest, hence of the collecting semantics, is problem dependent. It depends on the level of observation of program behaviors required to solve the problem. We illustrate this point with another example of collecting semantics.
2.2.2 reachability semantics
If we only interest in invariance properties, a possible collecting semantics would be the reachability semantics.
$R = \{ s'  \exists s \in I\cdot \langle s, s'\rangle \in t^* \} \subseteq S$
where the reflexive transitive closure $t^* \subseteq S \times S$ of a relation $t \subseteq S \times S$ is defined classically:
$t^* = \{ \langle s,s'\rangle  \exists n \geq 0 \cdot \exists s_0...s_n \cdot s_0 = s \land \forall i \in [0,n) \cdot \langle s_i, s_{i+1}\rangle \in t \land s_n = s' \}$
As an invariance property is defined as a set of states in which all program executions must stay, computing the reachability semantics is indeed sufficient to check the property. It reduces to checking that $R$ is included in the state set of the property. The reachability semantics is more abstract than the prefix trace semantics since, although we know exactly which states can be reached during execution, we no longer know in which order. This is a basic example of abstraction.
Assume we are asked the question “does state s1 always appear before state s2 in all executions” and we only know the reachability semantics $R$. If $s_1\in R$ or $s_2 \in R$, then we can answer “no” for sure. Otherwise $s_1, s_2 \in R$, so, we can only answer “I don’t know,” which is an example of incompleteness which is inherent to reachability with respect to the prefix trace semantics.
Example1: The reachable states of the example 1 is $R = \{ a,b,c \}$
2.2.3 fixpoint semantics
Define a identify relation: $id_S = \{ \langle s, s \rangle  s \in S\}$
$t : S \rightarrow S$
Composition $\circ$ of relations:
$t \circ r = \{ \langle s, s'' \rangle  \exists s': \langle s, s' \rangle \in t \land \langle s',s'' \rangle \in r \}$
A fixpoint of a function $F: S\rightarrow S$ is any element x satisfying $F(x) = x$.
The least fixpoint of $F$, denoted by $lp$, satisfying two conditions:
 $F(lp) =$ $ \texttt{lp}$, itself is a fixpoint
 $\forall x,F(x)= x$ such that $ \texttt{lp} \sqsubseteq x$, it is smaller than any other fixpoint
If the least fixpoint exists, it is unique.
The reachable states $R$ is the fixpoint of function:
$F_R(X) = I \cup \{s' \exists s\in X \cdot \langle s,s' \rangle \in T \}$.
Example1: We iterate $F_R$ from $\emptyset$ and get
$X^0 = F_R (\emptyset) = I = \{ a \}$
$X^1 = F_R(X^0) = \{ a, b\}$
$X^2 = F_R(X^1) = \{a,b,c\}$
$X^3 = F_R(X^2) = \{ a,b,c\} = X^2$
$X ^n = X ^2, n\geq2$
3. abstraction functions
We defined several semantics we can construct the relation between them by notion of $abstraction$. An abstraction is a mathematical relationship between two semantics, a socalled concrete semantics (more precise) and s socalled abstract semantics (less precise).
Soundness of abstraction: any program property proved to hold in the abstract semantics also holds in the concrete one.
Generally, not all properties provable in the concrete semantics can be proved in the abstract semantics: this is incompleteness, which causes false alarms.
Another property of the abstract semantics is termination.
Example: Construction a correspondence between the (concrete) prefix trace semantics and the (abstract) reachability semantics. We denote by $S^* $ the states in prefix trace semantics which is the set of prefix trances. Thus, the abstraction function is defined as:
$\alpha_R: Pow(S^*) \rightarrow Pow(S)$ such that
$\alpha_R(X) = \{ s\ \ \exists \langle s_0...s_n \rangle \}\in X : s=s_n\}$.
Given a set of finite sequences in $Pow(S^*)$ as input, it outputs the set of states in $Pow(S)$ that can appear at the last position of the sequence.
The prefix trance semantics collects all finite observations of a program execution and store the last state of every such finite observation, it collects the whole set set of stets reached along all (finite and infinite) executions. The abstraction records the reachable states but no longer the order in which these states appear during execution.
The abstraction function $\alpha$ is monotonic, i.e., we have that $X \subseteq Y$ implies $\alpha_R(X) \subseteq \alpha_R(Y)$.
4. Concretisation functions
An abstraction function converts a semantic information from a more concrete semantic world to a more abstract semantic world. We can also consider a reverse function, called concretization function, to convert from an abstract world back to a concrete one.
The counterpart of the abstract function $\alpha$ is the concretization function:
$\gamma_R: Pow(S) \rightarrow Pow(S^*)$ such that
$\gamma_R(X) = \{ \langle s_0...s_n \rangle \in X  \forall i \in [0,n]: s_i \in X\}$,
given a set of states in $Pow(S)$ outputs a set of finite sequences of states in $Pow(S^*)$.
The concretization rebuilds the prefix execution traces from the reachable states, but considers that they can appear in any order since the order of appearance has been abstracted away.
 Soundness: $R = \alpha_R(P)$
 Incompleteness: $P \subseteq \gamma_R ( R ) $, $\gamma_R (R)$ has more possible program bahaviours than the concrete $P$.
Example 1: Recall the example represented above, we have reachable states: $R = \{ a,b,c\}$, and prefix trance semantics: $P = \{ \langle a\rangle,\langle ab . . . b\rangle,\langle ab . . . c \rangle \}$
We can check that $\alpha_R(P) = R$,
$S=\{ \emptyset, \{a\},\{b\},\{c\},\{a,b\},\{a,c\},\{b,c\},\{a,b,c\} \}$
$\gamma( R ) = S^* $
Property: no $c$ can occur in an execution without a $b$ occurring before.
The property is true in prefix trance semantics $P$, but not in $\gamma(R)$.
5. Galois connections
Intuitively, the abstraction and the concretization functions are “inverse” of one another. This intuition is formalized by the notion of Galois connection $\langle \alpha, \gamma \rangle$. Formally, a Galois
connection $\langle \alpha, \gamma \rangle$ is a pair of functions between two ordered sets: a
concrete set (C, ⊆) equipped with some partial order $\subseteq$, and an abstractset ($A,\subseteq^{\sharp}$), equipped with another partial order $\subseteq^{\sharp}$, such that:

$\alpha :C \rightarrow A$ converts a concrete element into an abstract one;

$\gamma : A \rightarrow C$ converts an abstract element into a concrete one;

$\forall x \in C,y \in A : \alpha(x) \subseteq^{\sharp} y$ if and only if $x \subseteq \gamma ( y )$, which is the core property of Galois connections.
6. Sound and complete abstract semantics
Given a concrete semantics $S$ and an abstraction specified by a concretization function $\gamma$ (respectively, an abstraction function $\alpha$), we are interested in an abstract semantics $S^{\sharp}$ which is sound in that $S \subseteq \gamma (S^{\sharp})$ (respectively, $\alpha(S) \subseteq^{\sharp} S^{\sharp}$, which is equivalent for Galois connections). This corresponds to the intuition that no concrete case is ever forgotten in the abstract (so there can be no false negative).