Zhang, Zhongwei and Li, Jiuyong and Hu, Hong and Zhou, Hong (2010) On the effectiveness of gene selection for microarray classification methods. In: 2nd Asian Conference on Intelligent Information and Database Systems (ACIIDS 2010), 24-26 Mar 2010, Hue City, Vietnam.
Identification Number or DOI: doi: 10.1007/978-3-642-12101-2_31
Microarray data usually contain a high level of noisy gene data, including incorrect, noisy, and irrelevant genes. Before microarray data classification takes place, it is desirable to eliminate as much noisy data as possible. One approach to improving the accuracy and efficiency of microarray data classification is to select a small subset of genes from the large, high-dimensional gene expression dataset. Effective gene selection helps to clean up the existing microarray data and therefore improves its quality. In this paper, we study the effectiveness of gene selection for microarray classification methods. We conducted experiments on the effectiveness of gene selection for two benchmark classification algorithms: SVMs and C4.5. We observed that, in general, the performance of SVMs and C4.5 improves in terms of accuracy and efficiency when the preprocessed datasets are used rather than the original datasets; however, an inappropriate choice of genes can only be detrimental to predictive power. Our results also imply that, with preprocessing, the number of genes selected affects the classification accuracy.
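The select-then-classify workflow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not state which gene selection criterion the authors used, so a simple signal-to-noise filter stands in for it, and a nearest-centroid classifier stands in for SVMs and C4.5 to keep the example dependency-free. The function names and the toy expression matrix are hypothetical and are not data from the paper.

```python
from statistics import mean, pstdev

def snr_scores(samples, labels):
    """Score each gene by |mean_pos - mean_neg| / (std_pos + std_neg).

    Assumed filter criterion for illustration; not taken from the paper.
    """
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    scores = []
    for g in range(len(samples[0])):
        p = [s[g] for s in pos]
        n = [s[g] for s in neg]
        spread = pstdev(p) + pstdev(n)
        # Small constant avoids division by zero for constant genes.
        scores.append(abs(mean(p) - mean(n)) / (spread + 1e-9))
    return scores

def select_top_k(samples, labels, k):
    """Return the indices of the k highest-scoring genes, in ascending order."""
    scores = snr_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return sorted(ranked[:k])

def project(samples, genes):
    """Keep only the selected gene columns of each sample."""
    return [[s[g] for g in genes] for s in samples]

def nearest_centroid_predict(train, labels, test):
    """Stand-in classifier (not an SVM or C4.5): pick the closest class mean."""
    centroids = {}
    for c in set(labels):
        rows = [r for r, y in zip(train, labels) if y == c]
        centroids[c] = [mean(col) for col in zip(*rows)]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [min(centroids, key=lambda c: dist(r, centroids[c])) for r in test]

# Hypothetical toy data: rows are samples, columns are genes.
# Gene 0 separates the classes; genes 1-3 behave like noise.
samples = [
    [5.0, 1.0, 9.0, 2.0],
    [5.2, 7.0, 1.0, 8.0],
    [4.8, 3.0, 5.0, 4.0],
    [1.0, 2.0, 8.0, 3.0],
    [1.2, 6.0, 2.0, 7.0],
    [0.8, 4.0, 4.0, 5.0],
]
labels = [1, 1, 1, 0, 0, 0]

genes = select_top_k(samples, labels, k=1)
new_samples = [[4.9, 5.0, 5.0, 5.0], [1.1, 5.0, 5.0, 5.0]]
predictions = nearest_centroid_predict(
    project(samples, genes), labels, project(new_samples, genes)
)
print(genes, predictions)
```

Varying `k` in such a sketch is one way to explore the abstract's point that the number of selected genes affects accuracy. In a real study the selection step should be fitted on training folds only, so that test samples do not influence which genes are kept.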
|Item Type:||Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)|
|Additional Information:||Copyright 2010 Springer. This is the author's version of a paper published in the series Lecture Notes in Artificial Intelligence, v. 5991, 2010. Author's version deposited in accordance with the copyright policy of the publisher, Springer.|
|Uncontrolled Keywords:||classification accuracy; clean up; data sets; gene selection; high-dimensional; microarray classification; microarray data; noisy data|
|Fields of Research (FOR2008):||08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
11 Medical and Health Sciences > 1117 Public Health and Health Services > 111711 Health Information Systems (incl. Surveillance)
06 Biological Sciences > 0604 Genetics > 060405 Gene Expression (incl. Microarray and other genome-wide approaches)|
|Socio-Economic Objective (SEO2008):||C Society > 92 Health > 9204 Public Health (excl. Specific Population Health) > 920413 Social Structure and Health|
|Deposited On:||13 Jul 2010 10:35|
|Last Modified:||21 Dec 2011 15:36|