Hu, Hong and Li, Jiuyong and Plank, Ashley and Wang, Hua and Daggard, Grant (2006) A comparative study of classification methods for microarray data analysis. In: 5th Australasian Data Mining Conference (AusDM 2006), 29-30 Nov 2006, Sydney, Australia.
Official URL: http://www.crpit.com/confpapers/CRPITV61Hu.pdf
Abstract
In response to the rapid development of DNA microarray technology, many classification methods have been applied to microarray classification; SVMs, decision trees, Bagging, Boosting and Random Forest are among the most commonly used. In this paper, we conduct an experimental comparison of LibSVMs, C4.5, BaggingC4.5, AdaBoostingC4.5 and Random Forest on seven microarray cancer data sets. The experimental results show that all ensemble methods outperform C4.5. They also show that all five methods benefit, in classification accuracy, from data preprocessing, including gene selection and discretization. In addition to comparing the average accuracies of ten-fold cross-validation tests on the seven data sets, we use two statistical tests to validate the findings. We observe that the Wilcoxon signed-rank test is better suited than the sign test for this purpose.
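The two statistical tests named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the accuracy values are invented for illustration and are not results from the paper, and the function names (`sign_test_p`, `average_ranks`, `wilcoxon_signed_rank_p`) are hypothetical. Both tests compare paired per-data-set accuracies of two classifiers; the sign test counts only which method wins on each data set, while the Wilcoxon signed-rank test also ranks the size of each difference.

```python
from itertools import product
from math import comb


def sign_test_p(diffs):
    """Two-sided exact sign test p-value; zero differences are dropped."""
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    if n == 0:
        return 1.0
    wins = sum(1 for d in nonzero if d > 0)
    tail = min(wins, n - wins)
    # Probability of a split at least this lopsided under a fair coin.
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank_p(diffs):
    """Two-sided exact p-value, enumerating every possible sign assignment."""
    nonzero = [d for d in diffs if d != 0]
    n = len(nonzero)
    if n == 0:
        return 1.0
    ranks = average_ranks([abs(d) for d in nonzero])
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    observed = min(w_plus, total - w_plus)
    hits = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            hits += 1
    return hits / 2 ** n


# Hypothetical accuracies (percent) of two classifiers on seven data sets.
# These numbers are made up for illustration only.
acc_a = [90, 85, 78, 92, 88, 75, 81]
acc_b = [84, 80, 79, 85, 83, 71, 78]
diffs = [a - b for a, b in zip(acc_a, acc_b)]

print(sign_test_p(diffs))             # 0.125
print(wilcoxon_signed_rank_p(diffs))  # 0.03125
```

On these made-up numbers, classifier A loses on only one data set and by the smallest margin. The sign test sees just a 6-to-1 split, whereas the signed-rank test also uses the fact that the single loss is the smallest difference, so it yields a smaller p-value. With only seven data sets, exact enumeration of the 128 sign assignments is cheap.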
| Item Type: | Conference or Workshop Item (Commonwealth Reporting Category E) (Paper) |
|---|---|
| Additional Information: | Deposited in accordance with the copyright policy of the publisher (ACS Press) |
| Uncontrolled Keywords: | microarray data, classification |
| Fields of Research (FOR2008): | 01 Mathematical Sciences > 0104 Statistics > 010401 Applied Statistics; 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining; 06 Biological Sciences > 0604 Genetics > 060405 Gene Expression (incl. Microarray and other genome-wide approaches) |
| Subjects: | 270000 Biological Sciences > 270800 Biotechnology > 270899 Biotechnology not elsewhere classified; 280000 Information, Computing and Communication Sciences |
| Socio-Economic Objective (SEO2008): | UNSPECIFIED |
| ID Code: | 2095 |
| Deposited By: | |
| Deposited On: | 11 Oct 2007 10:57 |
| Last Modified: | 27 Feb 2012 14:48 |
