Full Citation
Title: Applying pattern discovery methods to a healthcare data
Citation Type: Dissertation/Thesis
Publication Year: 2009
ISBN:
ISSN:
DOI:
NSFID:
PMCID:
PMID:
Abstract: Data mining is one of the most exciting information science technologies of the 21st century. It has become an important mechanism for turning the information hidden in data into human-understandable knowledge. It is heavily used in a wide range of profiling practices, including finance, marketing, bioinformatics, genetics, and medical studies. The data to be studied can vary greatly in their properties and relations, from relational data, sequential data, and graphs to models and classifiers, or combinations of these. Different data mining methods and algorithms can be adopted to analyse different forms of data so that the results are interpretable and understandable. Contrast pattern mining, or more generally contrast group mining or contrast set mining, is one of the most challenging and vital techniques in data mining research. Patterns, or groups, are collections of items that satisfy certain properties of interest [1]. In other words, patterns represent different classes of objects, for example American males and Russian males, or the income changes from 2004 through 2009. Contrast patterns are conjunctions of attributes and values that differ meaningfully in their distribution across groups [2]. Contrast patterns come in various kinds, for example pattern- and rule-based contrasts, data cube contrasts, sequence-based contrasts, graph-based contrasts, and model-based contrasts. However, no single paper or study focuses on comparing the similarities and differences between them. This research is therefore intended to make a clear and comprehensive comparison of different contrast pattern techniques. 
It first provides background knowledge that gives a grounding in data mining; annotations on the relevant literature are then presented, along with a summary of the deficiencies of the different algorithms implemented for various contrast sets. The thesis also provides a critical survey of existing contrast pattern discovery methods. One of the major data sources used in the research is from Domiciliary Care SA, a government organization that takes care of disabled and elderly people. The different algorithms discussed in this thesis use the same data source from Domiciliary Care SA to ensure that the results generated are comparable. A detailed description of the data is presented in Chapter 4.
Url: https://wiki.cis.unisa.edu.au/wki/images/0/00/Lu-thesis.docx
User Submitted?: No
Authors: Lu, Xun
Institution: UniSA
Department: School of Computer and Information Science
Advisor:
Degree:
Publisher Location:
Pages: 96
Data Collections: IPUMS USA
Topics: Health
Countries: