Copyright ACM, 2000

Static Detection of Sources of Dynamic Anomalies in a Network of Referential Integrity Restrictions

 

Laura C. Rivero

 ISISTAN - UNCentro

 LINTI - U.N.La Plata

 B.A. Argentina

Phone: +54 2293 440363

lrivero@exa.unicen.edu.ar

Jorge H. Doorn

ISISTAN - UNCentro,

B.A. Argentina

Phone: +54 2293 440363

 

jdoorn@exa.unicen.edu.ar

Daniel Loureiro  

UNCentro,

B.A. Argentina

Phone: +54 2293 440363

 


 


ABSTRACT

 

Under certain circumstances, basic operations over tables in a relational database, where integrity restrictions such as referential and null restrictions have been specified, may produce unpredictable results, not detectable by means of a static analysis of the schema. When the design includes redundancies or when the set of restrictions is contradictory it is easy to detect and prevent future errors, but there are situations that require a dynamic analysis. In this paper, the properties of networks of referencial integrity restrictions that contain irregularities are analyzed, and the anomalies that may appear when data actualization in such environment is done are studied in order to define criteria and develop an algorithm to generate rules for proper handling of inconsistencies.

Keywords: integrity restrictions, referential integrity,  database updates, anomalous updates.

 

1.  INTRODUCTION

Semantic data control ensures the maintenance of database consistency, by rejecting update transactions that lead to inconsistent states or by activating specific actions on the database state to compensate the effect of the previous transaction. Unpredictable results may be obtained in a relational particular database state, when basic operations are applied. This inconvenience is produced by redundancies in the schema design, by contradictory referential integrity restrictions or simply because the designer established complex restrictions having tuple dependent semantic. From a procedural point of view, it means that the result may be unpredictable because it is affected by the order in which individual restrictions are applied or by the order in which the constraints are enforced.


 Those problems are well-known and there are research reports in relation with some aspects of them (see e.g. [Markowitz90], [Markowitz91], [Casanova88], [Rivero96]) but the formalization of conditions and the implementation of mechanisms for the automatic prevention of those anomalies is a recent investigation field. In [Markowitz94] the uncertainties produced by specific data manipulation are described, especially in delete operations. In [Casanova89] special cases of updates propagation are described.

This research paper may be divided in three well-defined parts. The first is developed in sections 3 and 4 and is devoted to refine Markowitz study extending it to all operations. The second is dedicated to the specification of algorithms able to detect potential sources of anomalies in a static way (Section 5) [Rivero98]. The last part (Section 6.) presents an algorithm that automatically generates rules that allows the integrity verification.

2.  RELATIONAL CONCEPTS

A relational schema is R º<R, D>, with R={R1, R2, ..., Rm} and D={FD, ID, NR} set of relations, FD and ID set of functional dependencies (fd) and inclusion dependencies (id) respectively and NR set of null restrictions (null constraint and null not allowed). A database state for R is denoted by r ={r1, r2, ..., rm}; sch(Ri) represents the set of attributes of Ri, Ki stands for a candidate key over Ri; and FK represents a foreign key for Ri. A database state r associated with R is consistent if it satisfies all restrictions in D. Two attribute sets overlap iff they share attributes (YÇZ¹Æ), and strictly overlap iff they overlap but they are not equal ((YÇZ¹Æ) Ù (Y¹Z)).

A functional dependency (fd) over a set of attributes U is one expression of the form X®Y, where X,YÍU. If RieR, fd’s over Ri will be indicated by Ri:X ® Y.

A null constraint (nna) may be expressed by Ri:Li ¹l. It is satisfied by Ri iff for every tuple t of Ri the subtuple t.Li has only not null values. There is at least one nna in Ri, that is Ri: Ki ¹l.

One inclusion dependency (id) in R  is an expression Ri[X]<<Rj[Z]:(a,b,mi,md), where Ri and Rj are relation names (possibly the same); X,ZÍsch(Rj) are compatible attributes; a, b, mi and md are the strategies to perform inserts, deletes and updates over the left and right side respectively, and the strategies may be c (Cascades), r (Restricted) or n (Nullifies). All combinations are studied in this article, however inserts and updates over the left side are generally done using a y mi specified with modality ‘r’. If Z is the primary key of the relation Rj, then it is a key-based-id and X constitutes a foreign key for Ri; this sort of id’s are named referencial integrity restrictions (rir’s).

The referential integrity directed graph G=(V,H), associated with R. may be defined with V=R and

H = {(Ri,Rj,L:(a,b,mi,md)) /Ri[L]<<Rj[Kj]:[a,b,mi,md] ÎID}. H is composed by elements (Ri,Rj, L: (a, b, mi,  md)), where  the edge goes from  Ri to  Rj and L:(a,b,mi, md) is the label of the edge.

3.  CONFLICTIVE MANIPULATIONS

In order to characterize the problem that may arise when some updates are performed over data constrained by rir's the examples in Figures 1 and 2 will be used.


Example 1:


·          Relations (keys are underlined)

(R1) Employee ( NBR-E, NBR-S, NBR-M, NBR-D, NBR-P )     

(R2) Manager ( NBR-M, NBR-P )

(R3) Project ( NBR-P)

(R4) Department ( NBR-D, NBR-P )

·          Null restrictions

(N1) Employee: NBR-E ¹ l

(N2) Manager: (NBR-M, NBR-P) ¹ l

(N3) Project: NBR-P ¹ l

(N4) Department: (NBR-D, NBR-P) ¹ l

·          Referential Integrity Restrictions

(I1) Manager[NBR-M]<<Employee[NBR-E]:(c,c,c,c)

(I2) Employee[NBR-S]<<Employee[NBR-E]:(n,r,n,r)

(I3) Employee[NBR-D, NBR-P]<<Manager[NBR-M, NBR-P]:(c,c,c,n)

(I4) Manager[NBR-P]<<Project[NBR-P]:(r,r,c,c)

(I5) Employee[NBR-P, NBR-D]<<Department[NBR-P, NBR-D]:(c,c,c,c)

(I6) Department[NBR-P]<<Project[NBR-P]:(c,c,r,c)

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Figure 1: Restriction Graph, State and Restrictions for Example 1 (adapted from [Markowitz94])

Example 2:


·          Relations (keys are underlined)

(R1) Films ( FILM#, PROD#, DIR#, Subject )

(R2) Staff ( PERS#, NAME )

·          Nulls not allowed Restrictions

(N1) Films: FILM# ¹ l

 (N2) Staff: PERS# ¹ l

·          Referential Integrity Restrictions

(I1) Films[PROD#]<<Staff[PERS#]:(a1,b1,mi1,md1)

(I2) Films[DIR#]<<Staff[PERS#]:(a2,b2,mi2,md2)

 

 

 

 

 

 

 

 

 

 

 

Figure 2: Restrictions Graph, State and Restrictions for Example 2.


The data manipulations considered are insertions, deletions or updates of one or several tuples; a manipulation involves only one kind of operation, it refers to an unique relation and entails a set of tuples that do not change during the execution of the manipulation. The specified constraints will be verified after each single tuple operation. The manipulation will be successful if all of its single tuple operations are carried out, otherwise it fails (is revoked). The above two examples present more than just one anomalous behavior. In the following sections, simplified subschemes of them are used to illustrate different problems.

3.1. Problems in Insert Operations

The effect of insertions is examined in this section.

Example 1-1: Consider the rir's and the database state depicted in Figure 1, excluding in this case I1 and I2. The operation I  inserts the tuple (5 4 4 d x) in the relation r1. I  may cause the insertion (via I5 and I6) of the tuple (d) in the relation r3, or it may block the insert operation (via I3 and I4). The result of I depends on the order in which the rir’s that involve R1 are enforced: (i) if I5 is first considered, the tuple (d x) is inserted in r4, this triggers the enforcing of I6, which provokes the insertion of the tuple (d) in r3, then I3 will cause the insertion of the tuple (4 d) in r4; or (ii) if I3 is taken into account in the first place, the tuple (4 d) is inserted in r2, and this triggers the enforcing of I4 which blocks the insert operation since the tuple (d) does not exist in the relation r3.

Example 2-1: Consider the example in Figure 2 and the corresponding state of the database r = {r1,r2}. If  a1: r and a2: c and the operation I inserts the tuple (5 6 6 v) in the relation r1, it may trigger the insertion (via I2) of the tuple (6 l) in r2, or the insert operation may be blocked (via I1). The result of I depends on the order in which the rir’s that involve a R1 are applied: (i) if I2 is enforced in first place then the tuple (5 6 6