This webpage makes available a prototype implementation of the CCC-Biclustering algorithm coded in Java together with the datasets and examples used in the paper:

Sara C. Madeira, Miguel C. Teixeira, Isabel Sá Correia and Arlindo L. Oliveira, "Identification of Regulatory Modules in Time Series Gene Expression Data using a Linear Time Biclustering Algorithm", IEEE/ACM Transactions on Computational Biology and Bioinformatics (to appear). [DOI Article Link]

## Synthetic

- Randomly generated 1500x50 matrix
- Randomly generated 1500x50 matrix with 10 planted CCC-Biclusters
## Real

- Sorted by statistical significance p-value
- Sorted by statistical significance p-value, with CCC-Biclusters not passing the statistical test at the 1% level (after Bonferroni correction) filtered out
- Sorted by statistical significance p-value, with CCC-Biclusters not passing the statistical test at the 1% level (after Bonferroni correction) filtered out, and similarities above 25% filtered out

- Cell Cycle
- Heat Stress

- Sorted by statistical significance p-value
- Sorted by statistical significance p-value, with CCC-Biclusters not passing the statistical test at the 1% level (after Bonferroni correction) filtered out
- Sorted by statistical significance p-value, with CCC-Biclusters not passing the statistical test at the 1% level (after Bonferroni correction) filtered out, and similarities above 25% filtered out
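The sorting and Bonferroni filtering steps named in the result listings can be sketched as follows. This is a minimal illustration, not the code shipped in the .jar files: the class name `BonferroniFilter`, the method `sortAndFilter`, and the choice of dividing the significance level by the number of p-values supplied are all assumptions, since this page does not state how many tests enter the correction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the distributed software.
public class BonferroniFilter {

    /**
     * Keeps the p-values that pass the test at level alpha after
     * Bonferroni correction and returns them in ascending order.
     * Assumption: the number of tests equals pValues.size().
     */
    public static List<Double> sortAndFilter(List<Double> pValues, double alpha) {
        // Bonferroni: compare each p-value with alpha divided by the number of tests.
        double threshold = alpha / pValues.size();
        List<Double> kept = new ArrayList<>();
        for (double p : pValues) {
            if (p < threshold) {
                kept.add(p);
            }
        }
        // Most significant (smallest p-value) first.
        Collections.sort(kept);
        return kept;
    }

    public static void main(String[] args) {
        // Made-up p-values for illustration only.
        List<Double> pValues = Arrays.asList(0.02, 1e-6, 0.004, 1e-4);
        // With alpha = 0.01 and 4 tests the threshold is 0.0025.
        System.out.println(sortAndFilter(pValues, 0.01));
    }
}
```

With the made-up input above, the two values below 0.0025 are kept and the other two are discarded.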

The software available here allows the reproduction of the results in the paper and also the execution of the CCC-Biclustering algorithm using a gene expression matrix provided by the user. The gene expression matrix must be a .txt file formatted as in the examples provided below.

The algorithm is coded in **Java**. Before running the examples below please make sure the version of

In order to run the algorithm copy the **.jar** file together with the

If you have any questions please contact Sara C. Madeira.

## Reproduce Results in the Paper

java -jar -Xss50M -Xms1024M -Xmx1024M Test_TCBB_CCC_Biclustering_Synthetic.jar

- Synthetic Data

java -jar -Xss50M -Xms1024M -Xmx1024M Test_TCBB_Cell_Cycle.jar

- Cell Cycle Data

- Heat Stress Data
## Run CCC-Biclustering with Other Datasets

java -jar -Xss50M -Xms1024M -Xmx1024M Test_TCBB_CCC_Biclustering.jar yourExpressionMatrix.txt overlapping

yourExpressionMatrix.txt - name of the .txt file containing your expression matrix

overlapping - float in [0,1] giving the maximum fraction of overlap allowed (all CCC-Biclusters overlapping more than this value are filtered out)
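For scripted runs, the two arguments can be checked before the command is assembled. The sketch below is only an illustration under stated assumptions: the class `CccCommand` and the method `build` are hypothetical names, the checks mirror the argument descriptions above, and the JVM flags and .jar name are copied from the command line shown in this section.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper, not part of the distributed software.
public class CccCommand {

    /** Builds the command line shown above after validating both arguments. */
    public static List<String> build(String matrixFile, double overlapping) {
        if (matrixFile == null || !matrixFile.endsWith(".txt")) {
            throw new IllegalArgumentException("expression matrix must be a .txt file");
        }
        // Written this way so that NaN is rejected as well.
        if (!(overlapping >= 0.0 && overlapping <= 1.0)) {
            throw new IllegalArgumentException("overlapping must be in [0,1]");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add("-Xss50M");
        cmd.add("-Xms1024M");
        cmd.add("-Xmx1024M");
        cmd.add("Test_TCBB_CCC_Biclustering.jar");
        cmd.add(matrixFile);
        cmd.add(String.valueOf(overlapping));
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build("yourExpressionMatrix.txt", 0.25);
        System.out.println(String.join(" ", cmd));
        // To launch it from Java instead of printing it:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Printing the command first makes it easy to confirm the arguments before starting a long run; the commented `ProcessBuilder` line shows one way to launch it from the directory containing the .jar file.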

The CCC-Biclustering algorithm (together with extended versions allowing missing values and the discovery of anticorrelated and scaled expression patterns) is integrated in the software BiGGEsTS (Biclustering Gene Expression Time Series), a free and open source software tool providing an integrated environment for the biclustering analysis of time series gene expression data. This software lets users run the algorithm in a graphical environment, preprocess the data, and postprocess and analyse the results using several criteria.

Last Update: July 2009