A maximally diversified multiple decision tree algorithm for microarray data classification

Hu, Hong and Li, Jiuyong and Wang, Hua and Daggard, Grant and Shi, Mingren (2006) A maximally diversified multiple decision tree algorithm for microarray data classification. In: Workshop on Intelligent Systems for Bioinformatics, 4 Dec 2006, Hobart, Australia.

Abstract

We investigate the use of diversified multiple trees for microarray data classification. We propose the Maximally Diversified Multiple Trees (MDMT) algorithm, which builds its decision committee from a set of unique trees. We compare MDMT with three well-known ensemble methods: AdaBoost, Bagging, and Random Forests. We also compare MDMT with another diversified decision tree algorithm, Cascading and Sharing trees (CS4), which forms its decision committee from a set of trees with distinct roots. On seven microarray data sets, both MDMT and CS4 are more accurate on average than AdaBoost, Bagging, and Random Forests. According to a sign test at the 95% confidence level, both MDMT and CS4 perform better than the majority of the traditional ensemble methods tested. We also discuss the differences between MDMT and CS4.
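The sign test mentioned in the abstract can be illustrated with a minimal sketch. It counts the data sets on which one classifier beats another and asks how likely a count at least that large would be if both classifiers were equally good. The function name `sign_test_p_value` and the accuracy figures below are hypothetical and chosen only for illustration; they are not results from the paper, and the one-sided form shown here is an assumption about how such a test might be set up.

```python
from math import comb


def sign_test_p_value(wins: int, losses: int) -> float:
    """One-sided sign test.

    Returns P(X >= wins) for X ~ Binomial(wins + losses, 0.5),
    i.e. the chance of seeing at least this many wins if the two
    classifiers were equally likely to win on any data set.
    Ties are assumed to have been dropped before calling.
    """
    n = wins + losses
    if n == 0:
        return 1.0
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n


# Hypothetical accuracies on seven data sets (NOT taken from the paper).
acc_a = [0.90, 0.85, 0.92, 0.88, 0.95, 0.80, 0.91]
acc_b = [0.86, 0.83, 0.90, 0.89, 0.93, 0.78, 0.88]

wins = sum(a > b for a, b in zip(acc_a, acc_b))
losses = sum(a < b for a, b in zip(acc_a, acc_b))
p_value = sign_test_p_value(wins, losses)

# At the 95% confidence level the difference counts as significant
# only when the p-value falls below 0.05.
print(f"wins={wins}, losses={losses}, p={p_value:.4f}, significant={p_value < 0.05}")
```

With only seven data sets, this one-sided test reaches the 0.05 threshold only when one method wins on all seven (p = 1/128); six wins and one loss gives p = 0.0625.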

Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Additional Information: Deposited in accordance with the copyright policy of the publisher (ACS Press). Published in the CRPIT series of the Australian Computer Society.
Uncontrolled Keywords: ensemble classifier, diversified classifiers, decision tree, Microarray data
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0899 Other Information and Computing Sciences > 089999 Information and Computing Sciences not elsewhere classified
08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
06 Biological Sciences > 0604 Genetics > 060405 Gene Expression (incl. Microarray and other genome-wide approaches)
Subjects: 270000 Biological Sciences > 270800 Biotechnology > 270899 Biotechnology not elsewhere classified
280000 Information, Computing and Communication Sciences > 280200 Artificial Intelligence and Signal and Image Processing > 280213 Other Artificial Intelligence
Socio-Economic Objective (SEO2008): UNSPECIFIED
ID Code: 2097
Deposited By:
Deposited On: 11 Oct 2007 10:57
Last Modified: 10 Feb 2012 15:38
