INESC-ID   Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa

Knowledge Discovery and Bioinformatics
Inesc-ID Lisboa

Probabilistic Genetic Networks

06/14/2007 - 16:00
06/14/2007 - 17:00

The arrival of genomics in malaria research is significantly accelerating the discovery of control strategies. Genome-wide dynamic gene expression measurements of the parasite's intraerythrocytic developmental cycle (IDC) at 1-hour resolution were recently reported. Moreover, Discrete Fourier Transform based techniques showed that many genes are regulated in a single periodic manner, which allowed genes to be ordered according to their phase of expression. In this work we present a framework for constructing genetic networks from dynamic expression signals. The model adopted to represent these networks is the Probabilistic Genetic Network (PGN), a Markov chain with some additional properties. The model treats each gene as a non-linear stochastic gate, and systems are built by coupling these gates. PGNs are estimated by minimizing the mean conditional entropy, which identifies the subsets of genes that best predict the target gene at the next time instant. In addition, we have developed a tool that integrates the mining of dynamic expression signals by PGN design techniques with several databases and biological knowledge. The applicability of this tool to discovering gene networks of the malaria expression regulation system has been validated on simulated data and on real microarray data, using the glycolytic pathway as a gold standard and also by constructing a PGN for the apicoplast. A negative control between these two modules was confirmed by constructing PGNs seeded with four genes from glycolysis and four from the apicoplast organelle. Together, these data demonstrate the value of the PGN model in generating biologically meaningful networks that include genes not captured by the Fourier approach. Currently, we are applying the same technique to three malarial strains (3D7, Dd2, HB3) in order to analyze the similarities and differences among them and to determine whether the three data sets can be pooled, which would improve the PGN estimation.
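
The selection criterion mentioned in the abstract, mean conditional entropy minimization, can be illustrated with a minimal sketch. It assumes the expression signals have already been quantized to a few discrete levels and that predictor subsets of a fixed size are searched exhaustively. The function names, the toy data, and the exhaustive search are illustrative assumptions, not the speakers' implementation, which may use a different estimator or search strategy.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import log2


def mean_conditional_entropy(series, predictors, target):
    """Estimate H(target at t+1 | predictors at t) from a quantized time series.

    series     : list of tuples, one tuple of discrete gene levels per time point
    predictors : tuple of gene indices observed at time t
    target     : gene index observed at time t+1
    """
    # For each observed predictor state, count the target values that follow it.
    counts = defaultdict(Counter)
    for now, nxt in zip(series, series[1:]):
        state = tuple(now[g] for g in predictors)
        counts[state][nxt[target]] += 1

    total = sum(sum(c.values()) for c in counts.values())
    h = 0.0
    for c in counts.values():
        n = sum(c.values())
        # Entropy of the target given this particular predictor state.
        h_state = -sum((k / n) * log2(k / n) for k in c.values())
        # Weight by the empirical probability of the predictor state.
        h += (n / total) * h_state
    return h


def best_predictors(series, target, candidates, size):
    """Exhaustively pick the predictor subset with the lowest mean conditional entropy."""
    return min(
        combinations(candidates, size),
        key=lambda subset: mean_conditional_entropy(series, subset, target),
    )


# Hypothetical toy data: three binary genes; gene 2 copies gene 0 with a one-step delay.
series = [
    (0, 0, 0), (1, 0, 0), (1, 1, 1), (0, 0, 1),
    (1, 1, 0), (0, 1, 1), (0, 0, 0), (1, 1, 0),
]

print(best_predictors(series, target=2, candidates=[0, 1, 2], size=1))
```

On this toy series the search selects gene 0 as the single best predictor of gene 2, since gene 0 at time t determines gene 2 at time t+1 exactly. With short time series, larger subsets tend to reach low entropy simply because each predictor state is seen only once, so a practical estimator would need some correction for rarely observed states.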