Full Citation
Title: Uninterpreted Semi-Automatic Schema Matching Approach Using Inter-Attribute Dependencies
Citation Type: Miscellaneous
Publication Year: 2012
ISBN:
ISSN:
DOI:
NSFID:
PMCID:
PMID:
Abstract: Schema matching aims to identify semantic correspondences between elements of two database schemas. It is one of the key challenges in many database applications, such as data integration and data warehousing. Before any data can be integrated, the table columns in the two databases must be matched, which is a strenuous and time-consuming process. To cope with this problem, many automated and semi-automated solutions have been proposed. Most existing solutions rely mainly on textual similarity of the data to be matched. While these approaches are valuable in many cases, they are not sufficient, and there are instances of the schema matching problem to which they do not apply at all. Such instances typically arise when the column names in the schemas and the data in the columns are opaque or difficult to interpret. Our research focuses on this uninterpreted matching. In this paper, we propose a five-step schema matching technique. In the first step, we find dependencies between attributes in each table. In the second step, we compute pairwise mutual information between dependent attributes only and construct a dependency graph, using the mutual information as weights on the arcs between attributes. In the third step, if the two tables have different numbers of attributes, we add dummy nodes to the smaller graph so that both graphs have the same number of nodes. In the fourth step, we find matching node pairs in the dependency graphs by running a graph-matching algorithm. In the fifth step, we remove all attributes that are mapped to dummy nodes and present the results to the user. We validate our approach with experiments, which show that it can be a useful addition to the set of existing automatic and semi-automatic schema matching techniques.
User Submitted?: No
Authors: Last, Mark; Rabinovich, Boris
Publisher: CSO-NATO
Data Collections: IPUMS USA
Topics: Other
Countries:
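
Illustrative sketch: The five steps described in the abstract can be illustrated with a small Python example. This is only a sketch under stated assumptions, not the authors' implementation. The abstract does not say how dependencies are detected or which graph-matching algorithm is used, so the sketch assumes that two attributes count as dependent when their empirical mutual information exceeds a hypothetical threshold, and it replaces the graph-matching step with an exhaustive search over node permutations, which is only practical for a handful of columns. All function names and the sample tables are invented for illustration.

```python
import math
from collections import Counter
from itertools import permutations


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two equal-length columns."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def dependency_graph(columns, threshold=0.1):
    """Steps 1-2: weight matrix holding MI for dependent pairs, 0 otherwise.

    Assumption: a pair is 'dependent' when its MI exceeds `threshold`.
    """
    k = len(columns)
    graph = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            mi = mutual_information(columns[i], columns[j])
            if mi > threshold:
                graph[i][j] = graph[j][i] = mi
    return graph


def pad(graph, size):
    """Step 3: add dummy nodes (all arc weights zero) up to `size` nodes."""
    k = len(graph)
    return [
        [graph[i][j] if i < k and j < k else 0.0 for j in range(size)]
        for i in range(size)
    ]


def match_schemas(cols_a, cols_b, threshold=0.1):
    """Steps 4-5: match nodes, then drop pairs that involve a dummy node.

    Assumption: brute-force search minimizing the squared difference of arc
    weights stands in for the unspecified graph-matching algorithm.
    """
    ka, kb = len(cols_a), len(cols_b)
    size = max(ka, kb)
    ga = pad(dependency_graph(cols_a, threshold), size)
    gb = pad(dependency_graph(cols_b, threshold), size)

    def cost(perm):
        return sum(
            (ga[i][j] - gb[perm[i]][perm[j]]) ** 2
            for i in range(size)
            for j in range(i + 1, size)
        )

    best = min(permutations(range(size)), key=cost)
    return [(i, best[i]) for i in range(ka) if best[i] < kb]


# Hypothetical tables: table_b holds the same data as table_a, but with the
# columns reordered and the values re-encoded as opaque labels.
table_a = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
    [0, 1, 2, 3, 4, 5, 6, 7],
]
table_b = [
    list("abcdefgh"),       # re-encoded copy of table_a column 2
    ["x"] * 4 + ["y"] * 4,  # re-encoded copy of table_a column 0
    ["p"] * 3 + ["q"] * 5,  # re-encoded copy of table_a column 1
]
matches = match_schemas(table_a, table_b)
print(matches)
```

Because mutual information depends only on how values co-occur, not on the values themselves, the re-encoded columns produce the same arc weights, which is what lets the matching proceed without interpreting names or data. When several permutations have the same cost (for example, when two arcs carry equal weights), the sketch simply returns the first one found, so a real system would need a tie-breaking rule or user review, consistent with the semi-automatic setting the abstract describes.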