He, Hongxing and Jin, Huidong and Chen, Jie and McAullay, Damien and Li, Jiuyong and Fallon, Anthony Bruce (2006) Analysis of breast feeding data using data mining methods. In: AusDM 2006: 5th Australasian Conference on Data Mining and Analytics, 29-30 Nov 2006, Sydney, Australia.
Abstract
The purpose of this study is to demonstrate the benefit of applying common data mining techniques to survey data, where statistical analysis is routinely used. Statistical surveys are commonly used to collect quantitative information about items in a population, and statistical analysis is usually carried out on the survey data to test hypotheses. In this paper we report an application of data mining methodologies to data from a breast feeding survey that was conducted and analysed by statisticians. The purpose of the research is to study the factors that lead to the decision whether or not to breast feed a newborn baby. Various data mining methods are applied to the data. Feature (variable) selection is conducted to select the most discriminative and least redundant features, using an information-theory-based method and a statistical approach. Decision tree and regression approaches are tested on classification tasks using the selected features. A risk pattern mining method is also applied to identify groups with a high risk of not breast feeding. The success of data mining in this study suggests that data mining approaches will be applicable to other, similar survey data. Data mining methods, which enable a search for hypotheses, may be used as a survey data analysis tool complementary to traditional statistical analysis.
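The abstract does not say which information-theoretic criterion or which risk measure the authors used. As a rough illustration only, the sketch below ranks discrete features by their mutual information with the outcome and computes the relative risk of an outcome for a group defined by a pattern. The function names, the field names (`feature_a`, `feature_b`, `breastfed`), and the eight toy records are all hypothetical and are not taken from the survey. The sketch scores relevance only and does not address redundancy between features.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def relative_risk(rows, pattern, outcome_key, outcome_value):
    """Risk of the outcome among rows matching `pattern`, divided by the
    risk among all other rows."""
    def matches(row):
        return all(row[k] == v for k, v in pattern.items())

    inside = [r for r in rows if matches(r)]
    outside = [r for r in rows if not matches(r)]
    if not inside or not outside:
        raise ValueError("pattern must split the rows into two non-empty groups")
    risk_in = sum(r[outcome_key] == outcome_value for r in inside) / len(inside)
    risk_out = sum(r[outcome_key] == outcome_value for r in outside) / len(outside)
    return float("inf") if risk_out == 0 else risk_in / risk_out


# Hypothetical toy records (made up for illustration, not survey data).
feature_a = [1, 1, 1, 1, 0, 0, 0, 0]
feature_b = [0, 1, 0, 1, 0, 1, 0, 1]
breastfed = [0, 0, 0, 1, 1, 1, 1, 0]
rows = [
    {"feature_a": a, "feature_b": b, "breastfed": y}
    for a, b, y in zip(feature_a, feature_b, breastfed)
]

# Rank features by how much information they carry about the outcome.
scores = {
    "feature_a": mutual_information(feature_a, breastfed),
    "feature_b": mutual_information(feature_b, breastfed),
}
ranking = sorted(scores, key=scores.get, reverse=True)

# Relative risk of NOT breast feeding for the group with feature_a == 1.
rr = relative_risk(rows, {"feature_a": 1}, "breastfed", 0)
```

In this toy data `feature_a` is informative about the outcome while `feature_b` is not, so a selection step based on this score would keep the former. A risk pattern mining method would search over many candidate patterns rather than evaluating a single hand-picked one as done here.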
| Item Type: | Conference or Workshop Item (Commonwealth Reporting Category E) (Paper) |
|---|---|
| Additional Information: | Copyright © 2006, Australian Computer Society, Inc. This paper appeared at Australasian Data Mining Conference (AusDM 2006), Sydney, December 2006. Conferences in Research and Practice in Information Technology (CRPIT), Vol. 61. Peter Christen, Paul Kennedy, Jiuyong Li, Simeon Simoff and Graham Williams, Ed. Reproduction for academic, not-for-profit purposes permitted provided this text is included. |
| Uncontrolled Keywords: | data mining; survey data; features selection; association rule; classification |
| Fields of Research (FOR2008): | 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining; 11 Medical and Health Sciences > 1117 Public Health and Health Services > 111799 Public Health and Health Services not elsewhere classified; 01 Mathematical Sciences > 0104 Statistics > 010402 Biostatistics |
| Subjects: | 280000 Information, Computing and Communication Sciences > 280200 Artificial Intelligence and Signal and Image Processing > 280213 Other Artificial Intelligence; 320000 Medical and Health Sciences > 321200 Public Health and Health Services > 321299 Public Health and Health Services not elsewhere classified |
| Socio-Economic Objective (SEO2008): | E Expanding Knowledge > 97 Expanding Knowledge > 970101 Expanding Knowledge in the Mathematical Sciences |
| ID Code: | 2105 |
| Deposited By: | |
| Deposited On: | 11 Oct 2007 10:57 |
| Last Modified: | 28 Aug 2012 11:39 |
