Association rule discovery with unbalanced class distributions

Gu, Lifang and Li, Jiuyong and He, Hongxing and Williams, Graham and Hawkins, Simon and Kelman, Chris (2003) Association rule discovery with unbalanced class distributions. In: 16th Australian Conference on Artificial Intelligence (AI 2003), 3-5 Dec 2003, Perth, Australia.

Abstract

There are many methods for finding association rules in very large data sets. However, it is well known that most general association rule discovery methods find too many rules, many of which are uninteresting. Furthermore, the performance of many such algorithms deteriorates when the minimum support is low. They fail to find many interesting rules even when support is low, particularly in the case of significantly unbalanced classes. In this paper we present an algorithm which finds association rules based on a set of new interestingness criteria. The algorithm is applied to a real-world health data set and successfully identifies groups of patients with high risk of adverse reaction to certain drugs. A statistically guided method of selecting appropriate features has also been developed. Initial results have shown that the proposed algorithm can find interesting patterns from data sets with unbalanced class distributions without performance loss.
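
The abstract does not spell out the new interestingness criteria, so the following is only a minimal sketch of why a global minimum-support threshold struggles with unbalanced classes. It uses standard measures (global support, support computed within the target class, and lift) as stand-ins, not the paper's own definitions. The function name `rule_measures`, the item names, and the toy counts are all hypothetical and are not taken from the paper's health data set.

```python
from fractions import Fraction


def rule_measures(records, pattern, target):
    """Measures for the rule `pattern -> target`.

    records: list of (set_of_items, class_label) pairs.
    Returns (global support, support within the target class, lift).
    These are textbook measures, assumed here for illustration only.
    """
    n = len(records)
    n_target = sum(1 for _, label in records if label == target)
    n_pattern = sum(1 for items, _ in records if pattern <= items)
    n_both = sum(
        1 for items, label in records if label == target and pattern <= items
    )
    if n == 0 or n_target == 0 or n_pattern == 0:
        raise ValueError("records, target class, and pattern must all be non-empty")

    support = Fraction(n_both, n)               # P(pattern and target)
    local_support = Fraction(n_both, n_target)  # P(pattern | target)
    confidence = Fraction(n_both, n_pattern)    # P(target | pattern)
    lift = confidence / Fraction(n_target, n)   # confidence / P(target)
    return support, local_support, lift


# Synthetic toy data: 1000 records, only 20 in the minority class "reaction".
pattern = {"drug_A", "age_60_plus"}
records = (
    [({"drug_A", "age_60_plus"}, "reaction")] * 15
    + [({"drug_B"}, "reaction")] * 5
    + [({"drug_A", "age_60_plus"}, "none")] * 35
    + [({"drug_B"}, "none")] * 945
)

support, local_support, lift = rule_measures(records, pattern, "reaction")
print(float(support), float(local_support), float(lift))
```

In this toy data the rule covers only 1.5% of all records, so a typical global minimum-support threshold would prune it, yet it covers 75% of the minority class and has a lift of 15. Measuring support relative to the class of interest is one common way to keep such rules without lowering the global threshold.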


Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Refereed: Yes
Item Status: Live Archive
Additional Information: Copyright Springer-Verlag 2003. Permanent restricted access to published version in accordance with the copyright policy of the publisher.
Faculty / Department / School: Historic - Faculty of Sciences - Department of Maths and Computing
Date Deposited: 30 Nov 2007 11:55
Last Modified: 25 Aug 2014 01:47
Uncontrolled Keywords: knowledge discovery; data mining; association rules; record linkage; administrative data; adverse drug reaction
Fields of Research: 01 Mathematical Sciences > 0104 Statistics > 010405 Statistical Theory
08 Information and Computing Sciences > 0806 Information Systems > 080607 Information Engineering and Theory
08 Information and Computing Sciences > 0806 Information Systems > 080608 Information Systems Development Methodologies
Socio-Economic Objective: E Expanding Knowledge > 97 Expanding Knowledge > 970101 Expanding Knowledge in the Mathematical Sciences
Identification Number or DOI: 10.1007/978-3-540-24581-0_19
URI: http://eprints.usq.edu.au/id/eprint/11429
