Static Detection of Sources of Dynamic Anomalies in a Network of
Referential Integrity Restrictions
|
Laura C.
Rivero ISISTAN
- UNCentro LINTI - U.N.La Plata B.A.
Argentina Phone: +54 2293 440363 lrivero@exa.unicen.edu.ar |
Jorge H. Doorn
ISISTAN -
UNCentro, B.A.
Argentina Phone: +54 2293 440363 jdoorn@exa.unicen.edu.ar |
Daniel Loureiro
UNCentro, B.A. Argentina Phone: +54 2293 440363 |
ABSTRACT
Under
certain circumstances, basic operations over tables in a relational database,
where integrity restrictions such as referential and null restrictions have
been specified, may produce unpredictable results, not detectable by means of a
static analysis of the schema. When the design includes redundancies or when
the set of restrictions is contradictory it is easy to detect and prevent
future errors, but there are situations that require a dynamic analysis. In
this paper, the properties of networks of referencial integrity restrictions
that contain irregularities are analyzed, and the anomalies that may appear
when data actualization in such environment is done are studied in order to
define criteria and develop an algorithm to generate rules for proper handling
of inconsistencies.
1. INTRODUCTION
Semantic data control ensures the maintenance of database consistency,
by rejecting update transactions that lead to inconsistent states or by
activating specific actions on the database state to compensate the effect of
the previous transaction. Unpredictable results may be obtained in a relational
particular database state, when basic operations are applied. This
inconvenience is produced by redundancies in the schema design, by
contradictory referential integrity restrictions or simply because the designer
established complex restrictions having tuple dependent semantic. From a
procedural point of view, it means that the result may be unpredictable because
it is affected by the order in which individual restrictions are applied or by
the order in which the constraints are enforced.
Those problems are well-known and there are research
reports in relation with some aspects of them (see e.g. [Markowitz90],
[Markowitz91], [Casanova88], [Rivero96]) but the formalization of conditions
and the implementation of mechanisms for the automatic prevention of those
anomalies is a recent investigation field. In [Markowitz94] the uncertainties
produced by specific data manipulation are described, especially in delete
operations. In [Casanova89] special cases of updates propagation are described.
This research paper may be divided in three well-defined parts. The
first is developed in sections 3 and 4 and is devoted to refine Markowitz study
extending it to all operations. The second is dedicated to the specification of
algorithms able to detect potential sources of anomalies in a static way (Section
5) [Rivero98]. The last part (Section 6.) presents an algorithm that
automatically generates rules that allows the integrity verification.
2. RELATIONAL CONCEPTS
A relational schema is R º<R, D>, with R={R1, R2, ..., Rm} and D={FD, ID, NR} set of relations, FD and ID set of functional dependencies (fd) and inclusion
dependencies (id) respectively and NR
set of null restrictions (null constraint and null not allowed). A database
state for R is denoted by r ={r1, r2,
..., rm}; sch(Ri) represents
the set of attributes of Ri, Ki stands for a candidate key over Ri; and FK represents a foreign
key for Ri. A database state r associated with R is
consistent if it satisfies all restrictions in D. Two attribute sets overlap iff
they share attributes (YÇZ¹Æ), and strictly overlap iff they overlap but they are
not equal ((YÇZ¹Æ) Ù (Y¹Z)).
A functional dependency (fd) over a set of attributes U is
one expression of the form X®Y, where X,YÍU. If RieR, fd’s over Ri
will be indicated by Ri:X ® Y.
A null constraint (nna) may be expressed by Ri:Li ¹l. It is satisfied by Ri iff
for every tuple t of Ri the subtuple t.Li has only not null values.
There is at least one nna in Ri, that is Ri:
Ki ¹l.
One inclusion dependency (id) in R is an expression Ri[X]<<Rj[Z]:(a,b,mi,md), where Ri
and Rj are relation names (possibly the same); X,ZÍsch(Rj) are compatible attributes; a, b, mi and md are the strategies to perform inserts, deletes and updates over the
left and right side respectively, and the strategies may be c (Cascades), r (Restricted) or n (Nullifies).
All combinations are studied in this article, however inserts and updates over
the left side are generally done using a y mi specified with modality ‘r’. If Z is the
primary key of the relation Rj, then it is a key-based-id and X
constitutes a foreign key for Ri; this sort of id’s are named
referencial integrity restrictions (rir’s).
The referential integrity directed graph G=(V,H), associated with R. may be defined with V=R and
H = {(Ri,Rj,L:(a,b,mi,md)) /Ri[L]<<Rj[Kj]:[a,b,mi,md] ÎID}. H
is composed by elements (Ri,Rj,
L: (a, b, mi, md)), where the edge goes from Ri
to Rj
and L:(a,b,mi, md) is the label of the edge.
3. CONFLICTIVE MANIPULATIONS
In order to characterize the problem that may arise when some updates
are performed over data constrained by rir's the examples in Figures 1 and
2 will be used.
Example
1:
|
·
Relations (keys are underlined) (R1) Employee ( NBR-E,
NBR-S, NBR-M, NBR-D, NBR-P ) (R2)
Manager ( NBR-M, NBR-P ) (R3)
Project ( NBR-P) (R4)
Department ( NBR-D, NBR-P ) ·
Null restrictions (N1)
Employee: NBR-E ¹ l (N2)
Manager: (NBR-M, NBR-P) ¹ l
|
(N3)
Project: NBR-P ¹ l (N4)
Department: (NBR-D, NBR-P) ¹ l ·
Referential Integrity Restrictions (I1)
Manager[NBR-M]<<Employee[NBR-E]:(c,c,c,c) (I2)
Employee[NBR-S]<<Employee[NBR-E]:(n,r,n,r) (I3) Employee[NBR-D,
NBR-P]<<Manager[NBR-M, NBR-P]:(c,c,c,n) (I4)
Manager[NBR-P]<<Project[NBR-P]:(r,r,c,c) (I5)
Employee[NBR-P, NBR-D]<<Department[NBR-P, NBR-D]:(c,c,c,c) (I6) Department[NBR-P]<<Project[NBR-P]:(c,c,r,c) |
Figure
1: Restriction Graph, State and Restrictions for Example 1 (adapted from
[Markowitz94])
Example 2:
|
·
Relations (keys are underlined) (R1)
Films ( FILM#, PROD#, DIR#, Subject ) (R2)
Staff ( PERS#, NAME ) ·
Nulls not allowed Restrictions (N1)
Films: FILM# ¹ l |
(N2) Staff: PERS# ¹ l ·
Referential Integrity Restrictions (I1)
Films[PROD#]<<Staff[PERS#]:(a1,b1,mi1,md1) (I2)
Films[DIR#]<<Staff[PERS#]:(a2,b2,mi2,md2) |

Figure 2: Restrictions Graph, State and Restrictions for Example 2.
The data manipulations
considered are insertions, deletions or updates of one or several tuples; a manipulation
involves only one kind of operation, it refers to an unique relation and
entails a set of tuples that do not change during the execution of the
manipulation. The specified constraints will be verified after each single
tuple operation. The manipulation will be successful if all of its single tuple
operations are carried out, otherwise it fails (is revoked). The above two
examples present more than just one anomalous behavior. In the following
sections, simplified subschemes of them are used to illustrate different
problems.
3.1. Problems in Insert
Operations
The effect of insertions is examined in this section.
Example 1-1: Consider the rir's and the database state
depicted in Figure 1, excluding in this case I1 and I2.
The operation I inserts the tuple (5 4 4 d x) in the relation r1. I may cause the insertion (via I5 and
I6) of the tuple (d) in
the relation r3, or it may block the insert
operation (via I3 and I4). The result of I depends
on the order in which the rir’s that involve R1 are
enforced: (i) if I5 is first considered, the tuple (d x) is inserted in r4, this triggers the enforcing of I6, which provokes the
insertion of the tuple (d) in r3,
then I3 will cause the insertion of the tuple (4 d) in r4; or (ii) if I3 is taken
into account in the first place, the tuple (4 d) is inserted in r2, and this triggers the enforcing of
I4 which blocks the insert operation since the tuple (d) does not exist in the relation r3.
Example 2-1: Consider the example in Figure 2 and the corresponding state of the database r = {r1,r2}. If a1: r and a2: c and the operation I inserts the tuple (5 6 6 v) in the relation r1, it may trigger the insertion (via I2) of the tuple (6 l) in r2, or the insert operation may be blocked (via I1). The result of I depends on the order in which the rir’s that involve a R1 are applied: (i) if I2 is enforced in first place then the tuple (5 6 6