Skip to main content
eScholarship
Open Access Publications from the University of California

UC San Diego

UC San Diego Electronic Theses and Dissertations bannerUC San Diego

Foundations of information integration

Abstract

We study three fundamental problems in information integration: 1. the data integration query problem, 2.the data exchange core computation problem, and 3. the schema mapping composition problem. The first problem consists of computing the certain answers to a query over a target schema for a source instance under constraints which relate the source and target schemas. We show how to compute certain answers for a larger family of constraints and queries than those previously addressed. One of the main tools is the chase, which we study and extend significantly. The second problem deals with inserting data from one database into another database having a different schema. Fagin, Kolaitis, and Popa have shown that among the universal solutions of a solvable data exchange problem, there exists -- up to isomorphism -- a most compact one, ̀̀the core'', and have convincingly argued that this core should be the database to be materialized. We show how to compute the core in the general setting where the mapping between the source and target schemas is given by source-to-target constraints which are arbitrary tuple generating dependencies (TGDs) and target constraints consisting of equality generating dependencies (EGDs) and weakly-acyclic TGDs. The third problem, composition of mappings between schemas, is essential to support schema evolution, data exchange, data integration, and other data management tasks. We study the issues involved in composing schema mappings given by embedded dependencies that need not be source-to-target and we concentrate on obtaining (first-order) embedded dependencies. We provide a composition algorithm and several negative results. In particular, we show that even full dependencies that are not limited to be source-to- target are not closed under composition and that determining whether the composition can be given by these kinds of dependencies is undecidable. These negative results carry over to mappings given by embedded dependencies

Main Content
For improved accessibility of PDF content, download the file to your device.
Current View