Motif representation and discovery

Submitted by smadeira on Thu, 03/11/2010 - 20:57.

Start: 01/08/2010 - 14:00

End: 01/08/2010 - 15:00

Timezone: Etc/GMT

An important part of gene regulation is mediated by specific proteins, called transcription factors (TFs), which influence the transcription of a particular gene by binding to specific sites on the DNA sequence, called transcription factor binding sites (TFBSs). Such binding sites are relatively short stretches of DNA, normally 5 to 25 nucleotides long. A commonly used representation of TFBSs is the position-specific scoring matrix (PSSM), which assumes that the nucleotides in a binding site are independent of one another. Recently, several works have argued for non-additivity in protein-DNA interactions, paving the way for more complex models that account for nucleotide interactions. We propose to model TFBSs by representing nucleotide interactions with consistent k-graph Bayesian networks (where k represents the maximum number of interactions between nucleotides), jointly with a set of features, scored directly from each base sequence, that appear to be relevant for TFBS characterization. The model is flexible enough to incorporate any set of features scored from base sequences. We consider discriminative learning of such models, since it outperforms generative learning in the context of classification with a large set of features.
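To make the independence assumption of the PSSM representation concrete, here is a minimal Python sketch. It is an illustration only, not the model proposed in the talk: the function names (`build_pssm`, `score_window`, `best_hit`), the pseudocount of 1, the uniform background of 0.25 per base, and the example sites are all assumptions chosen for the example.

```python
import math

ALPHABET = "ACGT"


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM from aligned binding sites of equal length.

    The pseudocount and the uniform background are illustrative assumptions.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    pssm = []
    for pos in range(length):
        # Count each base at this position, starting from the pseudocount.
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log-odds of the observed frequency against the background.
        pssm.append(
            {base: math.log2(counts[base] / total / background) for base in ALPHABET}
        )
    return pssm


def score_window(pssm, window):
    """Score a window as the sum of per-position scores.

    This additivity is exactly the independence assumption: no term
    depends on more than one position.
    """
    return sum(column[base] for column, base in zip(pssm, window))


def best_hit(pssm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pssm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the motif")
    return max(
        (score_window(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Hypothetical aligned sites, made up for this example.
sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
pssm = build_pssm(sites)
score, start = best_hit(pssm, "GGGTATAATCCC")
print(f"best window starts at {start} with score {score:.2f}")
```

Because each position contributes its own term, a PSSM cannot express that a base at one position makes a base at another position more or less likely; models that account for nucleotide interactions add exactly that kind of dependency.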

Location(s)

INESC-ID

Portugal






© 2005, Inesc-ID. All rights reserved