Relative Safety in Relational Calculus

In classical model theory, there are two important questions:

  1. (Finite) Satisfiability: Given a first-order sentence ϕ\phi, is there a (finite) structure MM such that MϕM \models \phi?

  2. Model Checking: Given a first-order sentence ϕ\phi and a structure MM, is it true that MϕM \models \phi?

In fact, the history of the first question goes back to the inception of computer science. In 1936, Alonzo Church and Alan Turing independently proved that Satisfiability for first-order logic is undecidable, resolving David Hilbert’s Entscheidungsproblem. Their proofs also invented the two pillars supporting modern computer science: the lambda calculus and the Turing machine.

A less well-known result is that the Finite satisfiability is also undecidable. Specifically, if a sentence ϕ\phi contains a relation of arity at least 2, then it is undecidable to determine if ϕ\phi has a finite model. This result is due to Boris Trakhtenbrot.

On the other hand, the Model Checking problem is very easy. The data complexity (over the size of the model) of Model Checking over FOL (extended with ×\times and ++) is AC0AC^0, one of the lowest classes in the polynomial hierarchy. And even if we extend FOL with a least-fixpoint operator (i.e. Datalog), the problem is still in PTIME.

But in databases, we care about a slightly different question from Satisfiability and Model Checking:

  1. Relation Calculus Query: Fix a finite structure MM, and a first-order sentence ϕ(𝐱)\phi(\mathbf{x}) over MM, where 𝐱\mathbf{x} are the free variables of ϕ\phi. Let QQ be a relation symbol. Find a model for QQ such that 𝐱:Q(𝐱)ϕ(𝐱)\forall \mathbf{x} : Q(\mathbf{x}) \Leftrightarrow \phi(\mathbf{x}).

Note that QQ may not always have a finite model. For example, if ϕ(x):=x=x\phi(x) := x = x then QQ must contain all values in the domain which can be infinite. We may therefore want to ask a question similar to Finite Satisfiability:

  1. Relative Safety: Given a Relational Calculus query and a fixed database, does the query have a finite answer on that database? That is, is there a finite model for QQ, defined as above?

This problem sits in between the Finite Satisfiability problem and the Model Checking problem: it asks for a model for QQ, but we already have the models for all other relations. Relative Safety was proposed by [AGSS86], which is written in Russian. Other papers cite it for a proof that Relative Safety is decidable, but I don’t read Russian, so I had to do the proofs myself. Luckily, the proof is actually not very complicated.

Theorem Consider Relational Calculus with the following syntax:

Q:=x=tR(t1,,tk)¬QQQQQx.Q Q := \bot \mid \top \mid x = t \mid R(t_1, \ldots, t_k) \mid \neg Q \mid Q \vee Q \mid Q \wedge Q \mid \exists x .Q

In short, it is the standard FOL where the only interpreted predicate is equality. The Relative Safety is decidable.

Proof Because the database contains only a finite set of finite relations, any infinite query output must contain some value that is not in the database. The converse is also true: if the query output contains any value not in the database, the output must be infinite. This is because the query cannot distinguish values not in the database, because the only predicates are relation symbols or =. Therefore, if we see one output tuple containing some vv not in the database, we can always change vv to another value not in the database to get another tuple.

With the above insight, we can easily test if a query is infinite using a finite number of values not in the database. We can’t use just one, because the sentence may contain a predicate v=vv = v' where both vv and vv' are free, in which case we may want to use two different values for them. The algorithm is as follows:

Let CC be the (finite) set of values appearing in the database. Let V={v1,,vn}V = \{ v_1, \ldots, v_n \} be a set of nn values not in the database, where nn is the number of free variables of QQ. For each tuple t=(t1,t2,)t=(t_1, t_2, \ldots) where each tiCVt_i \in C \cup V, check if ϕ(t)\phi(t) is valid (model checking). If it is, and if tt contains a value not in CC, then the query is infinite. If we exhaust all tuples (CVC \cup V is finite) without finding such a tt, the query is finite.

[AGSS86] Alfred K. Ailamazyan, Mikhail M. Gilula, Alexei P. Stolboushkin, and Grigorii F. Schwartz. Reduction of a relational model with infinite domains to the case of finite domains. Doklady Akademii Nauk SSSR, 286(2):308–311, 1986. URL: