A novel dimensionality reduction technique based on independent component analysis for modeling microarray gene expression data

Liu, Han and Kustra, Rafal and Zhang, Ji (2004) A novel dimensionality reduction technique based on independent component analysis for modeling microarray gene expression data. In: 2004 International Conference on Artificial Intelligence (ICAI'04), 21-24 June 2004, Las Vegas, Nevada, USA.

Full text available as:

PDF (Accepted Version) - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
333Kb

Official URL: http://dblp.uni-trier.de/db/conf/icai/icai2004-1.html

Abstract

DNA microarray experiments, which generate thousands of gene expression measurements, are being used to gather information from tissue and cell samples about gene expression differences that will be useful in diagnosing disease. One challenge of microarray studies, however, is that the number n of samples collected is small relative to the number p of genes per sample, which is usually in the thousands. In statistical terms, this very large number of predictors compared to a small number of samples or observations makes the classification problem difficult. This is known as the "curse of dimensionality". An efficient way to address this problem is to use dimensionality reduction techniques. Principal Component Analysis (PCA) is a leading method for dimensionality reduction of gene expression data and is optimal in the least-squares-error sense. In this paper we propose a new dimensionality reduction technique for specific bioinformatics applications based on Independent Component Analysis (ICA). Because it can exploit higher-order statistics to identify a linear model, this ICA-based dimensionality reduction technique outperforms PCA in terms of both statistical and biological significance. We present experiments on the NCI 60 dataset to support this result.
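
The contrast the abstract draws between PCA and ICA can be illustrated with a minimal NumPy sketch. This is not the authors' method, which the abstract does not specify. It assumes a generic symmetric FastICA with a tanh contrast function, applied to PCA-whitened data, with samples treated as observations. The function names (`pca_reduce`, `fastica_reduce`, `sym_decorrelate`), the synthetic data, and the choice of k = 3 components are all hypothetical and chosen only for illustration.

```python
import numpy as np

def pca_reduce(X, k):
    """Project samples (rows of X) onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                       # center each gene (column)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k], Vt[:k]               # scores (n x k), loadings (k x p)

def sym_decorrelate(W):
    """Symmetric orthogonalization: W <- (W W^T)^(-1/2) W."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T @ W

def fastica_reduce(X, k, n_iter=200, tol=1e-6, seed=0):
    """Generic symmetric FastICA (tanh contrast) on PCA-whitened data.

    An illustrative assumption, not the algorithm from the paper.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    Z = U[:, :k] * np.sqrt(n)                     # whitened: (Z.T @ Z) / n == I
    W = sym_decorrelate(rng.standard_normal((k, k)))
    for _ in range(n_iter):
        Y = Z @ W.T                               # current component estimates
        G = np.tanh(Y)                            # contrast function g
        Gp = 1.0 - G ** 2                         # its derivative g'
        # Fixed-point update: w+ = E[z g(w^T z)] - E[g'(w^T z)] w
        W_new = sym_decorrelate((G.T @ Z) / n - Gp.mean(axis=0)[:, None] * W)
        delta = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if delta < tol:                           # rows stopped rotating
            break
    return Z @ W.T, W                             # components (n x k), unmixing

# Synthetic stand-in for expression data: n samples << p genes.
rng = np.random.default_rng(42)
n, p, k = 60, 1000, 3
S_true = rng.laplace(size=(n, k))                 # non-Gaussian latent factors
A = rng.standard_normal((k, p))                   # hypothetical mixing matrix
X = S_true @ A + 0.1 * rng.standard_normal((n, p))

pca_scores, _ = pca_reduce(X, k)
ica_scores, W = fastica_reduce(X, k)
print(pca_scores.shape, ica_scores.shape)
```

Both routines reduce each sample from p = 1000 features to k = 3. PCA stops at uncorrelated directions of maximal variance, a second-order criterion. The ICA step then rotates the whitened components to make them as non-Gaussian as possible, which is one way of using the higher-order statistics the abstract refers to.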

Item Type: Conference or Workshop Item (Commonwealth Reporting Category E) (Paper)
Additional Information: No evidence of copyright restrictions.
Uncontrolled Keywords: gene expression data, dimensionality reduction, independent component analysis, latent regulatory factors
Fields of Research (FOR2008): 08 Information and Computing Sciences > 0801 Artificial Intelligence and Image Processing > 080109 Pattern Recognition and Data Mining
Subjects: 280000 Information, Computing and Communication Sciences
Socio-Economic Objective (SEO2008): UNSPECIFIED
ID Code: 5633
Deposited By:
Deposited On: 25 Sep 2009 14:51
Last Modified: 04 Aug 2011 12:55