Biology has become a data-intensive research field, and much of that data comes from new genome sequencing technologies. The motif recognition problem takes as input a set of known patterns or features that in some way define a class of proteins; the goal is then to search, in a supervised or unsupervised way, for other instances of the same patterns. Known motifs in biological sequences are generally compiled in databases that are publicly available over the Internet. The motifs in these resources are generally manually curated, and the database entries include extensive documentation of the specific biological function associated with the sites. CD ComputaBio is proud to provide the best protein motif discovery services to our clients.
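As an illustration of motif recognition with a known pattern, the following minimal sketch translates a PROSITE-style pattern into a regular expression and reports every occurrence in a protein sequence. The function names `prosite_to_regex` and `find_motif` and the example sequence are hypothetical, and only a subset of the PROSITE syntax is handled (single residues, `x`, `[..]`, `{..}`, and repeat counts); it is not a description of any particular production pipeline.

```python
import re

def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern (subset of the syntax) into a regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Repeat counts such as x(2) or x(2,4) become {2} or {2,4}.
        repeat = ""
        m = re.fullmatch(r"(.+?)\((\d+(?:,\d+)?)\)", element)
        if m:
            element, repeat = m.group(1), "{" + m.group(2) + "}"
        if element == "x":
            core = "."                           # any residue
        elif element.startswith("{"):
            core = "[^" + element[1:-1] + "]"    # any residue except those listed
        else:
            core = element                       # a single residue or an [ABC] set
        parts.append(core + repeat)
    return "".join(parts)

def find_motif(pattern: str, sequence: str):
    """Return (1-based position, matched text) for every hit, overlaps included."""
    # A lookahead with a capture group lets overlapping hits be reported.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]

# N-glycosylation site pattern (PROSITE PS00001) on a made-up sequence.
hits = find_motif("N-{P}-[ST]-{P}", "MKNGSAANPTQNVTL")
print(hits)  # [(3, 'NGSA'), (12, 'NVTL')]
```

The asparagine at position 8 is not reported because it is followed by a proline, which the `{P}` element excludes.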

Overall solutions

Given a set of functionally related sequences, the main aim of motif discovery algorithms is to find new and a priori unknown motifs that are frequent, unexpected, or interesting according to some formal criteria. The methods used to discover such motifs follow the same general schema, as shown in Figure 1. They can be grouped into two main categories: alignment‐based methods and methods that search for motifs in unaligned sequences.

Figure 1. General motif discovery process.
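To make the idea of a "frequent" motif in unaligned sequences concrete, one simple formal criterion is support: the number of input sequences that contain a candidate at least once. The sketch below enumerates all exact k-mers and keeps those that meet a minimum support. The function name `frequent_kmers`, its parameters, and the toy sequences are assumptions made for illustration; practical tools also allow mismatches and score candidates against a background model.

```python
from collections import defaultdict

def frequent_kmers(sequences, k, min_support):
    """Return (k-mer, support) pairs for k-mers found in >= min_support sequences."""
    support = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for i in range(len(seq) - k + 1):
            # Record which sequences contain the k-mer, not how often it occurs.
            support[seq[i:i + k]].add(idx)
    hits = {kmer: len(ids) for kmer, ids in support.items() if len(ids) >= min_support}
    # Highest support first, ties broken alphabetically.
    return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))

# Toy input: three unaligned sequences sharing the segment "CPHC".
toy = ["MKCPHCA", "TTCPHCG", "CPHCLLM"]
print(frequent_kmers(toy, k=4, min_support=3))  # [('CPHC', 3)]
```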

Our methods

  • MEME
  • Expectation maximization
  • Bayesian methods
  • Gibbs sampling
  • Hidden Markov models (HMMs)
  • Combinatorial
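
As a rough illustration of one of the listed approaches, Gibbs sampling, the sketch below searches for a shared ungapped motif of fixed width in unaligned protein sequences. It is a simplified teaching version under stated assumptions: one motif occurrence per sequence, only the 20 standard residues, a uniform background that is therefore omitted from the weights, and a fixed iteration count. The names `build_profile` and `gibbs_motif_search` and their parameters are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(sequences, starts, width, skip, pseudocount=1.0):
    """Column-wise residue probabilities from all motif instances except `skip`."""
    counts = [{aa: pseudocount for aa in AMINO_ACIDS} for _ in range(width)]
    for i, (seq, start) in enumerate(zip(sequences, starts)):
        if i == skip:
            continue  # the held-out sequence does not contribute
        for col in range(width):
            counts[col][seq[start + col]] += 1
    profile = []
    for column in counts:
        total = sum(column.values())
        profile.append({aa: c / total for aa, c in column.items()})
    return profile

def gibbs_motif_search(sequences, width, iterations=2000, seed=0):
    """Return one sampled motif start (0-based) per sequence."""
    rng = random.Random(seed)
    # Start from random motif positions.
    starts = [rng.randrange(len(s) - width + 1) for s in sequences]
    for _ in range(iterations):
        # Hold one sequence out and build the profile from the others.
        i = rng.randrange(len(sequences))
        profile = build_profile(sequences, starts, width, skip=i)
        seq = sequences[i]
        # Weight every candidate window by its probability under the profile.
        weights = []
        for start in range(len(seq) - width + 1):
            p = 1.0
            for col in range(width):
                p *= profile[col][seq[start + col]]
            weights.append(p)
        # Sample a new position in proportion to those weights.
        starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts

seqs = ["MKTAYIAKQRCPHCGAL", "GGCPHCGWLLMNPQ", "AAVTCPHCGEEK", "LLCPHCGDDSTV"]
for seq, start in zip(seqs, gibbs_motif_search(seqs, width=5)):
    print(start, seq[start:start + 5])
```

Because the procedure is stochastic, results depend on the seed and the number of iterations; in practice several restarts are run and the alignment with the best score is kept.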

Applications of Motif Discovery Services in Protein Sequences

  • Gene function research
  • Human disease
  • Drug design
  • Transcriptional regulatory research

