In this paper, we propose a system for finding partial positive and negative coregulated gene clusters in microarray data. Genes are clustered together if they show the same pattern of changing tendencies in a user-defined number of condition pairs. It is assumed that genes which show similar expression patterns under a number of conditions are under the control of the same transcription factor and are related to a similar function in the cell. Taking positive and negative coregulation of genes into account, we find two types of information: (1) clusters of genes showing the same changing tendency and (2) relationships between two such clusters whose respective members show opposite changing tendencies. Because genes may be coregulated by different transcription factors under different environmental conditions, our algorithm allows the same gene to fall into different clusters. Overlapping gene clusters are allowed because coregulation normally takes place in only a fraction of the investigated condition pairs, and because gene expression data are noisy, so the approach should be tolerant to errors. In a first step, the gene expression matrix is transformed into a binned matrix of changing tendencies between all condition pairs. For the binning of the gene expression levels, a statistical technique is used that requires no arbitrarily chosen threshold, automatically corrects for multiple testing, and handles replicates for the different conditions, thereby directly accounting for the random variability of gene expression data. To present the clustering results, a new structure called the coregulation graph is proposed.
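As a rough illustration of the first step and of the two relationship types, the following minimal Python sketch bins one gene's expression levels into changing tendencies (+1 up, -1 down, 0 no change) over all condition pairs, and then checks whether two genes agree or oppose in at least a given number of pairs. It is only a sketch under stated assumptions: the fixed `threshold` is a simplifying stand-in and is not the statistical binning technique described above (which needs no threshold and handles replicates), and the names `changing_tendencies`, `relation`, and `min_pairs` are hypothetical.

```python
from itertools import combinations


def changing_tendencies(expr, threshold=1.0):
    """Bin one gene's expression levels into tendencies per condition pair.

    expr: expression levels, one value per condition.
    threshold: ASSUMPTION, a fixed cutoff standing in for the statistical
    test used in the paper.
    Returns {(i, j): +1 | 0 | -1} for every condition pair with i < j.
    """
    tendencies = {}
    for i, j in combinations(range(len(expr)), 2):
        diff = expr[j] - expr[i]
        if diff > threshold:
            tendencies[(i, j)] = 1    # expression goes up from i to j
        elif diff < -threshold:
            tendencies[(i, j)] = -1   # expression goes down from i to j
        else:
            tendencies[(i, j)] = 0    # no clear change
    return tendencies


def relation(t1, t2, min_pairs):
    """Classify two genes by their tendencies over shared condition pairs.

    Returns "positive" if they change in the same direction in at least
    min_pairs pairs, "negative" if they change in opposite directions in
    at least min_pairs pairs, and None otherwise. Pairs with no change
    in the first gene are ignored. The positive check is tried first.
    """
    same = sum(1 for p, v in t1.items() if v != 0 and t2[p] == v)
    opposite = sum(1 for p, v in t1.items() if v != 0 and t2[p] == -v)
    if same >= min_pairs:
        return "positive"
    if opposite >= min_pairs:
        return "negative"
    return None


# Toy data (invented for illustration): three conditions per gene.
gene_a = changing_tendencies([1.0, 3.0, 6.0])
gene_b = changing_tendencies([2.0, 4.5, 8.0])
gene_c = changing_tendencies([9.0, 6.0, 2.0])
gene_d = changing_tendencies([5.0, 5.2, 5.1])

print(relation(gene_a, gene_b, min_pairs=2))
print(relation(gene_a, gene_c, min_pairs=2))
print(relation(gene_a, gene_d, min_pairs=2))
```

Because each gene is compared pairwise rather than assigned to a single partition, the same gene can take part in several positive or negative relationships, which mirrors the overlapping clusters allowed by the approach.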