Motif Discovery

Continuing from the motif definition and search problem, we now turn to the more difficult task of discovering a hidden motif in a DNA sequence. Starting from a common problem, the extraction of a hidden motif from a set of DNA sequences, we will discuss the following algorithmic approaches:

a. an exhaustive (brute force) search with exponential complexity

b. an exhaustive approach after data transformation with decreased complexity

c. a greedy approach that speeds up the process with a loss in accuracy

d. a randomized algorithm that combines speed with accuracy

In the context of randomized search, we will discuss the Gibbs Sampler as a fundamental approach to motif discovery, which we will then try to implement in an exercise on discovering a hidden motif in a set of DNA sequences.
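As a rough preview of the exercise, the Gibbs Sampler idea can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not a reference implementation: the function names (`score`, `gibbs_sampler`), the parameters (`k`, `n_iter`, `seed`), the pseudocount of 1, and the consensus-mismatch score are all choices made here for illustration. The sequences are assumed to contain only the uppercase letters A, C, G and T, and each is assumed to be at least `k` long.

```python
import random

ALPHABET = "ACGT"  # assumption: sequences use only these uppercase letters


def score(seqs, starts, k):
    """Count mismatches to the column-wise consensus (lower is better)."""
    total = 0
    for pos in range(k):
        column = [s[st + pos] for s, st in zip(seqs, starts)]
        total += len(column) - max(column.count(b) for b in ALPHABET)
    return total


def gibbs_sampler(seqs, k, n_iter=1000, seed=0):
    """Return (starts, score) for the best set of motif positions seen."""
    rng = random.Random(seed)
    # Start from one random window of length k in every sequence.
    starts = [rng.randrange(len(s) - k + 1) for s in seqs]
    best_starts, best_score = list(starts), score(seqs, starts, k)

    for _ in range(n_iter):
        # Pick one sequence to leave out and re-sample.
        i = rng.randrange(len(seqs))

        # Build a count profile from the other sequences (pseudocount 1).
        counts = [{b: 1 for b in ALPHABET} for _ in range(k)]
        for j, s in enumerate(seqs):
            if j == i:
                continue
            for pos in range(k):
                counts[pos][s[starts[j] + pos]] += 1
        column_total = len(seqs) - 1 + len(ALPHABET)

        # Weight every window of the left-out sequence by its profile probability.
        weights = []
        for p in range(len(seqs[i]) - k + 1):
            w = 1.0
            for pos in range(k):
                w *= counts[pos][seqs[i][p + pos]] / column_total
            weights.append(w)

        # Sample the new start position in proportion to those weights.
        starts[i] = rng.choices(range(len(weights)), weights=weights)[0]

        current = score(seqs, starts, k)
        if current < best_score:
            best_starts, best_score = list(starts), current

    return best_starts, best_score
```

Because the procedure is randomized, different seeds may return different positions; in practice the sampler is usually restarted several times and the best-scoring result is kept.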

To wrap up the problem of motif finding, we will discuss implementations that combine data from different sources (sequence conservation, gene expression, etc.). These constitute the state of the art in the field and increase the accuracy of the obtained results.