INESC-ID   Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento em Lisboa
-
technology from seed

kdbio

Knowledge Discovery and Bioinformatics
Inesc-ID Lisboa
Home
 
 

Seminars

Global Tolerance of Biochemical Systems and its Design Implications

05/21/2010 - 14:00
05/21/2010 - 15:00
Etc/GMT

The ability of organisms to survive under a multitude of conditions is readily apparent. This robustness in performance is difficult to precisely characterize and quantify. At a biochemical level, it leads to physiological behavior when the parameters of the system remain within some neighborhood of their normal values. However, this behavior can change abruptly, often becoming pathological, as the boundary of the neighborhood is crossed. Currently, there is no generic approach to identifying and characterizing such boundaries. We address the problem by introducing a method that involves quantitative concepts for boundaries between regions and “global tolerance”. To illustrate the power of these concepts, we analyzed a large class of biological modules called moiety-transfer cycles and characterized the specific case of the NADPH redox cycle in human erythrocytes, which is involved in conferring resistance to malaria. Our results show that the wild-type system operates well within a region of “best” local performance that is surrounded by “poor” regions.

“On-the-Fly Model Checking for Regular Alternation-Free Mu-Calculus” and “One Interface to Serve Them All”

05/17/2010 - 15:00
05/17/2010 - 16:30
Etc/GMT

"On-the-Fly Model Checking for Regular Alternation-Free Mu-Calculus”:

Model-checking is a successful technique for automatically verifying concurrent finite-state systems. When designing a model-checker, a good compromise must be made between the expressive power of the property description formalism, the complexity of the model-checking problem, and the user-friendliness of the interface. The logic we adopted is the regular alternation-free mu-calculus, an extension of the alternation-free mu-calculus with ACTL-like action formulas and PDL-like regular expressions, allowing a concise and intuitive description of safety, liveness, and (some) fairness properties over labelled transition systems (LTSs). The model-checking method consists in reformulating the verification problem as the local resolution of a boolean equation system (BES), which is carried out by linear-time algorithms based on various strategies (depth-first search, breadth-first search, etc.). These algorithms also generate full diagnostic information (examples and counterexamples) illustrating the truth value of temporal formulas. This method is at the heart of the EVALUATOR 3.5 model-checker of the CADP toolbox, developed using the generic OPEN/CAESAR environment for on-the-fly verification. The BES resolution algorithms are provided by the CAESAR_SOLVE library of OPEN/CAESAR.

“One Interface to Serve Them All” :

I will explain a new interface for model checking tools, which enables o connect many model checking algorithms (explicit, distributed, symbolic, multi-core) to various input languages (Promela, mCRL, DVE). Also, several generic optimizations can be applied at the interface. Additionally, I will demonstrate the application of model checking to safety requirements of railway interlocking systems, in a project led by the UIC (Union Internationale des Chemins de Fer).

From Assembling Short DNA Reads to Protein Sequencing by Assembling Mass Spectra

04/26/2010 - 15:00
Etc/GMT

Increasing read length is viewed as the crucial condition for fragment assembly with next-generation sequencing technologies. However, introducing mate-paired reads (separated by a gap of length GapLength) opens a possibility to transform short mate-pairs into long mate-reads of length approximately GapLength, and thus raises the question as to whether the read length (as opposed to GapLength) even matters. We describe a new tool for assembling mate-paired short reads and use it to analyze the question of whether the read length matters. We further complement the ongoing experimental efforts to maximize read length by a new computational approach for increasing the effective read length. While the common practice is to trim the error-prone tails of the reads, we present an approach that substitutes trimming with error correction using repeat graphs. An important and counterintuitive implication of this result is that one may extend sequencing reactions that degrade with length "past their prime" to where the error rate grows above what is normally acceptable for fragment assembly.

If time allows, we will further address the problem of sequencing molecules that are not directly inscribed in the genomes (e.g., antibodies or antibiotics-like non-ribosomal peptides) and propose to assemble them from tandem mass spectra.

We show that our Eulerian approach to DNA sequencing can be generalized to Shotgun Protein Sequencing (SPS). We illustrate applications of SPS to sequencing of snake venoms (collaborations with Karl Clauser at Broad Institute) and antibodies (collaboration with Jennie Lill at Genentech). We further show how mass-spectrometry enables de novo sequencing of peptide-like natural products. This is a joint work with Nuno Bandeira (UCSD), Mark Chaisson (Pacific Biosciences) and Dima Brinza (Life Technologies).

A Data Mining Approach for the detection of High-Risk Breast Cancer Groups

04/09/2010 - 14:00
04/09/2010 - 15:00
Etc/GMT

It is widely agreed that complex diseases are typically caused by the joint effects of multiple instead of a single genetic variation. These genetic variations may show very little effect individually but strong effect if they occur jointly, a phenomenon known as epistasis or multilocus interaction. In this seminar, we explore the applicability of decision trees to this problem. A case-control study was performed, composed of 164 controls and 94 cases with 32 SNPs available from the BRCA1, BRCA2 and TP53 genes. There was also information about tobacco and alcohol consumption. We used a Decision Tree to find a group with high susceptibility of suffering from breast cancer. Our goal was to find one or more leaves with a high percentage of cases and small percentage of controls. To statistically validate the association found, permutation tests were used. We found a high-risk breast cancer group composed of 13 cases and only 1 control, with a Fisher Exact Test value of 9:7 * 10^-6. After running 10000 permutation tests we obtained a p-value of 0.017. These results show that it is possible to find statistically significant associations with breast cancer by deriving a decision tree and selecting the best leaf.

Human Immunodeficiency Virus (HIV) Dynamic Modeling and Antiretroviral Treatment Analysis

03/26/2010 - 14:00
03/26/2010 - 15:00
Etc/GMT

Location(s)

INESC-ID
Portugal

The emergence of the Acquired Immune Deficiency Syndrome (AIDS) raised new problems and concerns worldwide. HIV-AIDS is now a global disease which has great influence in people’s lives, especially in developing countries and controlling this type of disease has a significant socio-economical impact. HIV virus attacks preferentially CD4+ T immune cells, incorporating its DNA, which was previously transcript from a viral RNA, into the cells’ genome. Antiretroviral treatments act in different stages of HIV’s infection, decreasing the organism’s viral loads. The simplest dynamic models for HIV’s infection behavior relate the concentrations of healthy and infected CD4+T cells with viral load. Currently there are more complex models which compare more state variables and parameters. For a better knowledge of the disease it is essential to cross information obtained through these mathematical models with data from infected people. The goal of this project is to understand HIV’s complexity and to explore its dynamic, through a mathematical model based on nonlinear differential equations. Biomedical Engineering is expected to have a crucial role in the development of new tools and techniques for discovering a potential AIDS cure.

In silico Metabolic Engineering

03/12/2010 - 14:00
03/12/2010 - 15:00
Etc/GMT

Location(s)

INESC-ID
Portugal

Metabolic Engineering (ME) deals with designing organisms with enhanced capabilities regarding the productivities of desired compounds. This field has received increasing attention within the last few years due to the extraordinary growth in the adoption of white or industrial biotechnological processes for the production of bulk chemicals, pharmaceuticals, food ingredients and enzymes, among other products. Many different approaches have been used to aid in ME efforts that take available models of metabolism together with mathematical tools and/ or experimental data to identify metabolic bottlenecks or targets for genetic engineering. Our conceptual framework in the development of tools for in silico ME relies on three layers: accurate mathematical models (stoichiometric models, regulatory networks, dynamic models), good simulation methods (e.g. steady state simulations with flux balance analysis, Boolean network simulation, numerical integration of ODEs) and robust optimization algorithms. This framework gave rise to the OptFlux platform, an open-source, user-friendly and modular software aimed at being the reference computational platform for ME applications. Indeed, the rational design of microbial strains has been limited to the developers of the techniques, since a platform that provides a user friendly interface to perform such tasks was not yet available. OptFlux aims to change this situation, by providing the following features: freely available, open-source, user-friendly, modular and compatible with standards such as the Systems Biology Markup Language (SBML) and the layout information of CellDesigner. The main methods allow the simulation of both wild-type and mutant organisms (using Flux Balance Analysis or other methods) and optimization tasks, i.e., the identification of ME targets can be performed with metaheuristics such as Evolutionary Algorithms, Simulated Anneali

CHE - Evolutionary Algorithms for Cluster Geometry Optimization

02/26/2010 - 14:00
02/26/2010 - 15:00
Etc/GMT

Location(s)

INESC-ID
Portugal

CHE is a joint research project involving the Evolutionary and Complex Systems group (ECOS-CISUC) and the Coimbra Chemistry Centre from the University of Coimbra. It aims to develop effective bio-inspired algorithms for difficult optimization problems from the theoretical chemistry area. This talk will focus on cluster geometry optimization. In this problem, the goal is to determine the structural organization for a set of atoms or molecules that minimizes the total potential energy. Determining the relative position of the particles that compose a cluster is essential, as it helps to understand the chemical properties of the aggregate. In this talk we present a simple and unbiased evolutionary algorithm that can effectively tackle hard cluster geometry optimization problems. Additionally, we will identify some key components that are essential to improve the performance of the optimization method.

Novelty and Evolution in Biological, Chemical and Random Reaction Networks

02/19/2010 - 14:00
02/19/2010 - 15:00
Etc/GMT

Location(s)

INESC-ID
Portugal

We shall first investigate how reaction networks are at the base of the various disciplines. We will then learn a method (Chemical Organization Theory) that helps us to study novelty in reaction networks, and see what results does this method gives us in various example of Reaction Network. Both Artificial and non. From artificial chemistries to simulated ecologies, to agent based models, to chemistry.The talk will also touch and start to frame the (unsolved) problem of how to generate reaction networks that can sustain evolution.

Formal verification techniques: model checking in systems biology

02/12/2010 - 14:00
02/12/2010 - 15:00
Etc/GMT

Location(s)

INESC-ID
Portugal

The study of biological networks has led to the development of increasingly large and detailed models. While whole-cell models are not on the horizon yet, complex networks underlying specific cellular processes have been modeled in detail. The study of these models by means of analysis and simulation tools leads to large amounts of predictions, typically time-courses of the concentration of several dozens of molecular components in a variety of physiological conditions and genetic backgrounds. This raises the question how to make sense of these simulations, that is, how to obtain an understanding of the way in which particular molecular mechanisms control the cellular process under study, and how to identify interesting predictions of novel phenomena that can be confronted with experimental data. Formal verification techniques based on model-checking provide a powerful technology to keep up with this increase in scale and complexity. The basic idea underlying model checking is to specify dynamical properties of interest as statements in temporal logic, and to use model-checking algorithms to automatically and efficiently verify whether the properties are satisfied or not by the model. The application of model-checking techniques is hampered, however, by several key issues described in this thesis. First, the systems biology domain brought to the fore a few properties of the network dynamics that are not easily expressed using classical temporal logics, like Computation Tree Logic (Ctl) and Linear Time Logic (Ltl). On the one hand, questions about multistability are important in the analysis of biological regulatory networks, but difficult (or impossible) to express in Ltl. On the other hand, Ctl is capable of dealing with branching time, important for multistability and other properties of non-deterministic models, but it does not do a good job when faced with questions about cycles in a Kripke structure. Second, the problem of posing relevant and interesting questions is critical in modeling in general, but even more so in the context of applying model-checking techniques, due to the fact that it is not easy for non-experts to formulate queries in temporal logic. Finally, most of the existing modelling and simulation tools are not capable of applying model-checking techniques in a transparent way. In particular, they do not hide from the user the technical details of the installation of the model checker, the export in a suitable format of the model and the query, the call of the model checker, and the import of the results produced by the model checker (the true/false verdict and witness/counterexample). This report starts by describing the basic concepts for formal verification, introducing the necessary data structures that represent the possible behaviors of a dynamical system, as well as the different types of temporal logics necessary for the encoding the properties. It also presented the generic model checking problem, which consists in determining if a given system satisfies a given set of properties. Some recent examples of the application of model checking techniques in system biology are also presented, as well as the current problems and limitations that need to be addressed. Finally, some concluding remarks are presented pointing some directions towards a better integration between the formal verification and the systems biology fields.

Project "EnviGP - Improving Genetic Programming for the Environment and Other Applications"

01/29/2010 - 14:00
01/29/2010 - 15:00
Etc/GMT

Location(s)

INESC-ID
Portugal

EnviGP is a FCT project involving INESC-ID, the University of Coimbra, the Tropical Research Institute in Lisbon, and the University of Milano-Bicocca, Italy. Genetic Programming (GP) is a population-based search procedure that, although powerful and versatile, still faces a few obstacles to its fully successful usage in the real world. After a brief introduction to GP, the main subjects of the project are informally addressed: bloat, overfitting, complexity, and the still poorly understood relationship between them. Current results and work in progress are described and illustrated with examples. All these subjects are addressed from the practical point of view of the non computer science practitioners that share the same goal as the EnviGP project: having GP provide accurate, simple, and effectively useful/usable solutions to their real-life problems. Discussion is highly encouraged. Among the audience there will be team members from the Tropical Research Institute and from the University of Milano-Bicocca.