Analysis of breast feeding data using data mining methods

He, Hongxing and Jin, Huidong and Chen, Jie and McAullay, Damien and Li, Jiuyong and Fallon, Tony (2006) Analysis of breast feeding data using data mining methods. In: 5th Australasian Conference on Data Mining and Analytics (AusDM 2006), 29-30 Nov 2006, Sydney, Australia.


The purpose of this study is to demonstrate the benefit of using common data mining techniques on survey data where statistical analysis is routinely applied. Statistical surveys are commonly used to collect quantitative information about items in a population, and statistical analysis is usually carried out on survey data to test hypotheses. We report in this paper an application of data mining methodologies to data from a breast feeding survey that was conducted and analysed by statisticians. The purpose of the research is to study the factors that lead to the decision whether or not to breast feed a newborn baby. Various data mining methods are applied to the data. Feature (variable) selection is conducted to select the most discriminative and least redundant features using an information theory based method and a statistical approach. Decision tree and regression approaches are tested on classification tasks using the selected features. A risk pattern mining method is also applied to identify groups with a high risk of not breast feeding. The success of data mining in this study suggests that data mining approaches are applicable to other similar survey data. Data mining methods, which enable a search for hypotheses, may be used as a survey data analysis tool complementary to traditional statistical analysis.
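As a rough illustration of the information theory based feature selection mentioned in the abstract, the sketch below ranks categorical survey variables by information gain with respect to a binary outcome. It is not the authors' method or data: the function names, the variable names (`smoker`, `first_child`, `breastfed`) and the four toy records are all hypothetical, and the sketch scores only how discriminative each feature is, without addressing redundancy between features.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on one feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, target):
    """Return (feature, information gain) pairs, most discriminative first."""
    labels = [row[target] for row in rows]
    features = [name for name in rows[0] if name != target]
    scores = {
        name: information_gain([row[name] for row in rows], labels)
        for name in features
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy records, invented purely for illustration.
toy_survey = [
    {"smoker": "yes", "first_child": "yes", "breastfed": "no"},
    {"smoker": "yes", "first_child": "no", "breastfed": "no"},
    {"smoker": "no", "first_child": "yes", "breastfed": "yes"},
    {"smoker": "no", "first_child": "no", "breastfed": "yes"},
]

ranked = rank_features(toy_survey, target="breastfed")
for name, gain in ranked:
    print(f"{name}: {gain:.3f} bits")
```

In this toy data the outcome is fully determined by one variable and unrelated to the other, so the ranking separates them cleanly; real survey data would give intermediate scores and would call for a redundancy check among the top-ranked features.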

Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Refereed: Yes
Item Status: Live Archive
Additional Information: Reproduction for academic, not-for-profit purposes permitted provided the item is acknowledged. Series title: Conferences in Research and Practice in Information Technology, vol. 61
Faculty/School / Institute/Centre: Historic - Faculty of Sciences - Department of Maths and Computing (Up to 30 Jun 2013)
Date Deposited: 11 Oct 2007 00:57
Last Modified: 09 Oct 2013 07:07
Uncontrolled Keywords: data mining; survey data; feature selection; association rule; classification
Fields of Research (2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
11 Medical and Health Sciences > 1117 Public Health and Health Services > 111799 Public Health and Health Services not elsewhere classified
01 Mathematical Sciences > 0104 Statistics > 010402 Biostatistics
Fields of Research (2020): 46 INFORMATION AND COMPUTING SCIENCES > 4699 Other information and computing sciences > 469999 Other information and computing sciences not elsewhere classified
42 HEALTH SCIENCES > 4299 Other health sciences > 429999 Other health sciences not elsewhere classified
49 MATHEMATICAL SCIENCES > 4905 Statistics > 490502 Biostatistics
Socio-Economic Objectives (2008): E Expanding Knowledge > 97 Expanding Knowledge > 970101 Expanding Knowledge in the Mathematical Sciences
