The Biodegradation and Surfactants Database
Submitted by ptgm on Thu, 06/05/2014 - 19:56.
The Biodegradation and Surfactants Database (BioSurfDB) is a curated relational information system currently integrating 14 metagenomes, 137 organisms, 73 biodegradation-relevant genes, 62 proteins and 6 of their metabolic pathways; 29 documented bioremediation experiments, with treatment efficiencies for specific pollutants by surfactant-producing organisms; and a curated list of 46 biosurfactants, grouped by producing organism, surfactant name, class and reference.
Data integration tools for pre-processing biological data
Submitted by ptgm on Thu, 05/22/2014 - 23:47.
The increasing use of Electronic Health Records (EHRs) enables a better analysis of patient data, improving the quality of medical care. EHRs must be processed in order to provide a variety of services to the physician, such as risk classification and summarization. EHRs are usually stored in unstructured text or Excel files containing different data formats and types, missing information, and sometimes inconsistent information. Therefore, before analyzing the data, we often need to transform and integrate it.
Integrative biomarker discovery in neurodegenerative diseases: a survey
Submitted by ptgm on Thu, 04/17/2014 - 21:24.
Data mining has been widely applied in biomarker discovery, resulting in
significant findings of different clinical and biological biomarkers. With
developments in technology, from genomics to proteomics analysis, a deluge
of data has become available, as well as standardized data repositories.
Nonetheless, researchers are still facing important challenges in
analyzing the data, especially when considering the complexity of pathways
involved in biological processes or diseases. Data from single sources
seem unable to explain complex processes, such as the ones involved in
Novel metric for the use of Minimum Spanning Trees in phylogenetic trees studies
Submitted by ptgm on Mon, 03/31/2014 - 09:22.
The use of trees for phylogenetic representations started in the
middle of the 19th century. One of their most popular uses is Charles
Darwin's sole illustration in "The Origin of Species" [4]. The
simplicity of the tree representation makes it still the method of
choice today to easily convey the diversification and relationships
between species. Yet trees suffer from several drawbacks that are not
always clear to researchers. Since several different algorithms can be
used to infer and draw the tree, one must be aware of each algorithm's
set of assumptions.
Extracting academic data and linked data anonymization
Submitted by ptgm on Fri, 03/14/2014 - 10:39.
Data is becoming more valuable each day as more diverse and rich
data sources become available, allowing us to discover knowledge
in unprecedented ways.
IST uses the FénixEdu information system to manage most of its internal
data. The system contains data about students, teachers, employees,
courses, and all major aspects of IST as an organization. Such data
may be useful for both external agents and, more importantly, for IST
itself to study our academic environment. Data may be used as input
for state-of-the-art IR and KD technologies to extract newer and deeper
Network mining based analysis of whole brain functional connectivity
Submitted by lsr on Fri, 02/28/2014 - 14:52.
Mapping the human brain has been a topic of interest for the last few
decades. In spite of its incredible complexity, it is now possible to
map the brain using a combination of advanced data representation and
data processing algorithms supported by the huge computational power
that is available nowadays. In this work we describe an approach for
mapping whole-brain functional connectivity. The starting point of our
work is a set of high resolution functional magnetic resonance images
(fMRI) obtained with a 7T magnetic field that cover a wider brain
Computational prediction of microRNA targets in plant genomes
Submitted by lsr on Mon, 02/03/2014 - 09:56.
MicroRNAs (miRNAs) are important posttranscriptional regulators and
act by recognizing and binding to sites in their target messenger RNAs
(mRNAs). They are present in nearly all eukaryotes, in particular in
plants, where they play important roles in developmental and stress
response processes by targeting mRNAs for cleavage or translational
repression. MiRNAs have been shown to have a crucial role in gene
expression regulation, but so far only a few miRNA targets in plants
have been experimentally validated. Based on the number of identified
Design and Implementation of a Domain Specific Language for Next Generation Sequence Analysis
Submitted by lsr on Thu, 01/30/2014 - 15:37.
Next Generation Sequencing (NGS) is a set of molecular biology technologies
which generate, at low cost, many millions of short nucleotide reads. Typical
datasets consist of tens of millions of reads, with each read comprising 35-500
basepairs (depending on the technology used, different read sizes can be
obtained).
There are many tools for handling these datasets. However, they must still be
combined to build a full analysis pipeline. Current solutions to build these
pipelines are Make-like tools which can handle text-files and Unix-like
Identification and quantification of reachable attractors over asynchronous discrete dynamics
Submitted by lsr on Tue, 12/10/2013 - 09:38.
Models of discrete concurrent systems often lead to huge and complex
state transition graphs that represent their dynamics.
Here, we are particularly interested in logical models of biological
regulatory networks. Given an initial condition, it is of real interest
to identify reachable attractors that denote the potential asymptotic
behaviours of the system. These attractors are described as terminal
strongly connected components, which are either single (stable) states or
sets of states (denoting cyclical behaviours).
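The notion of a reachable attractor as a terminal strongly connected component can be illustrated with a small sketch. It is not the method presented in the talk: it assumes the state transition graph is small enough to be stored explicitly as a dictionary mapping each state to its successors, and it uses a standard Tarjan-style depth-first search. The names `sccs_from`, `reachable_attractors` and `example_graph` are hypothetical and chosen only for this illustration.

```python
def sccs_from(graph, start):
    """Strongly connected components of the states reachable from `start`.

    `graph` maps each state to an iterable of successor states (assumed
    explicit and small). Iterative Tarjan depth-first search.
    """
    index, lowlink = {start: 0}, {start: 0}
    counter = 1
    stack, on_stack = [start], {start}
    components = []
    work = [(start, iter(graph.get(start, ())))]
    while work:
        node, successors = work[-1]
        descended = False
        for nxt in successors:
            if nxt not in index:
                # First visit: number the state and descend into it.
                index[nxt] = lowlink[nxt] = counter
                counter += 1
                stack.append(nxt)
                on_stack.add(nxt)
                work.append((nxt, iter(graph.get(nxt, ()))))
                descended = True
                break
            if nxt in on_stack:
                # Edge back into the component currently being built.
                lowlink[node] = min(lowlink[node], index[nxt])
        if descended:
            continue
        work.pop()
        if work:
            parent = work[-1][0]
            lowlink[parent] = min(lowlink[parent], lowlink[node])
        if lowlink[node] == index[node]:
            # `node` is the root of a component: pop its members.
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(frozenset(component))
    return components


def reachable_attractors(graph, start):
    """Terminal components reachable from `start`: no transition leaves them."""
    return [
        comp
        for comp in sccs_from(graph, start)
        if all(nxt in comp for state in comp for nxt in graph.get(state, ()))
    ]


# Hypothetical toy dynamics: "d" is a stable state, "e" and "f" form a cycle,
# and "g" is a self-loop that cannot be reached from "a".
example_graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["e"],
    "d": [],
    "e": ["f"],
    "f": ["e"],
    "g": ["g"],
}

attractors_from_a = reachable_attractors(example_graph, "a")
```

From the initial state "a", the sketch reports one single-state attractor and one cyclic attractor. Explicit enumeration of this kind stops being feasible for the huge state transition graphs mentioned above, which is the difficulty the abstract points to.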