Pattern analysis of microarray data: gene clustering, gene selection, and sample classification
Modern microarray technology provides thousands of gene expression values for each sample. This large amount of data can be analyzed from several perspectives and with different goals. Although standard pattern recognition, machine learning, and statistical analysis methods can be applied, gene expression data have specific characteristics that demand special care. For example, in sample classification, one often has to deal with just a few samples (say, 10 to 100) in a very high-dimensional space, where the dimension is the number of genes (say, 1,000 to 10,000). Having far fewer samples than features is rare in most other domains.
In this talk, I will briefly review three basic problems in the analysis of gene expression data: gene clustering, gene selection, and sample classification. I will then describe, in a little more detail, a recent approach that simultaneously learns how to classify samples and selects which genes are relevant for this purpose.
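
As a rough illustration of what joint classification and gene selection can look like, the sketch below fits a sparsity-promoting classifier to synthetic data. The abstract does not name the approach covered in the talk, so everything here is an assumption made for illustration: L1-penalized logistic regression stands in for the method, proximal gradient descent (ISTA) is used to fit it, and the toy data (40 samples, 100 "genes", of which the first 3 are informative), the penalty value, and the function names `fit_sparse_logistic` and `make_toy_data` are all hypothetical.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_sparse_logistic(X, y, l1_penalty=0.3, n_iter=200):
    """L1-penalized logistic regression fitted by proximal gradient descent."""
    n_samples, n_genes = len(X), len(X[0])
    # Conservative step size: the Lipschitz constant of the gradient is at most
    # (sum of squared inputs, counting the constant bias input) / (4 * n_samples).
    sum_sq = sum(v * v for row in X for v in row) + n_samples
    step = 4.0 * n_samples / sum_sq
    weights, bias = [0.0] * n_genes, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * n_genes, 0.0
        for row, label in zip(X, y):
            z = bias + sum(w * v for w, v in zip(weights, row))
            residual = (sigmoid(z) - label) / n_samples
            grad_b += residual
            for j, v in enumerate(row):
                grad_w[j] += residual * v
        bias -= step * grad_b  # the bias is not penalized
        # Gradient step followed by soft-thresholding; weights that land
        # exactly on zero correspond to genes that are not selected.
        weights = [soft_threshold(w - step * g, step * l1_penalty)
                   for w, g in zip(weights, grad_w)]
    return weights, bias


def make_toy_data(n_samples=40, n_genes=100, n_informative=3, seed=0):
    """Synthetic stand-in for expression data (hypothetical, not real measurements)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        shift = 2.0 if label == 1 else -2.0
        # Only the first n_informative genes depend on the class label.
        row = [rng.gauss(0.0, 1.0) + (shift if j < n_informative else 0.0)
               for j in range(n_genes)]
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
weights, bias = fit_sparse_logistic(X, y)
selected = [j for j, w in enumerate(weights) if w != 0.0]
predictions = [1 if bias + sum(w * v for w, v in zip(weights, row)) > 0 else 0
               for row in X]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print("selected genes:", selected)
print("training accuracy:", accuracy)
```

In this sketch the same weight vector serves both purposes: its sign pattern and magnitudes define the classifier, and its nonzero entries define the selected genes. A larger `l1_penalty` selects fewer genes. The accuracy printed is measured on the training samples only; with so few samples, a real analysis would need cross-validation or a held-out set.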