Clustering, Fuzzy Clustering and Biclustering: An Overview
Clustering is the process of grouping a set of physical or abstract objects into classes of similar objects, called clusters. Under this definition, a cluster is a collection of objects that are similar to one another and dissimilar to the objects in other clusters. In gene expression data analysis, where the data take the form of a microarray gene expression matrix, clustering can be used to group genes according to their expression under multiple conditions, to group conditions based on the expression of a number of genes, or even to group genes and conditions simultaneously. In the first part of the talk, I will briefly cover partitional and hierarchical clustering algorithms (“classical clustering”), which partition data objects into several non-overlapping groups, so that each object belongs to exactly one cluster. I will then discuss fuzzy clustering algorithms, which are grounded in the theory of fuzzy sets and partition data objects into possibly overlapping groups, allowing one object to belong to several clusters, each with its own membership degree. Finally, I will discuss biclustering, which, in the case of microarray data analysis, refers to the simultaneous clustering of both genes and conditions.
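As a minimal illustration of the difference between classical (hard) and fuzzy assignment, the sketch below compares the two for one-dimensional values and fixed cluster centers. It is not taken from the talk: the function names, the sample values, and the choice of the standard fuzzy c-means membership formula with fuzzifier m = 2 are all assumptions made for illustration.

```python
def hard_assignment(point, centers):
    """Classical clustering: return the index of the single nearest center."""
    distances = [abs(point - c) for c in centers]
    return distances.index(min(distances))


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy clustering: return one membership degree per center.

    Uses the fuzzy c-means membership formula
        u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)),
    where d_i is the distance from the point to center i.
    Assumes the centers are distinct and m > 1.
    """
    distances = [abs(point - c) for c in centers]
    # A point lying exactly on a center belongs fully to that center.
    if 0.0 in distances:
        return [1.0 if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# Hypothetical expression levels and cluster centers, for illustration only.
centers = [0.0, 10.0]
for value in [2.0, 5.0, 9.0]:
    print(value, hard_assignment(value, centers), fuzzy_memberships(value, centers))
```

The hard assignment returns a single cluster index per value, whereas the fuzzy memberships are degrees in [0, 1] that sum to 1 across clusters, so a value midway between two centers is shared equally between them.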