Discriminative discovery of transcription factor binding sites from location data

Yuji Kawada, Yasubumi Sakakibara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)


Motivation: The availability of genome-wide location analyses based on chromatin immunoprecipitation (ChIP) data offers new insight for in silico analysis of transcriptional regulation.

Results: We propose a novel discriminative discovery framework for precisely identifying transcriptional regulatory motifs from both positive and negative samples (sets of upstream sequences of genes bound and unbound, respectively, by a transcription factor (TF)) based on genome-wide location data. In this framework, our goal is to find the discriminative motifs that best explain the location data, in the sense that the motifs precisely discriminate the positive samples from the negative ones. First, to discover an initial set of substrings that discriminate between positive and negative samples, we apply a decision tree learning method that produces a text-classification tree. We extract several clusters of similar substrings from the internal nodes of the learned tree. Second, we construct an initial profile-HMM from each cluster to represent a putative motif, and iteratively refine the profile-HMMs to improve discrimination accuracy. Our genome-wide experimental results on yeast show that our method successfully identifies the consensus sequences reported in the literature for known TFs and, further, achieves significant performance in discriminating between positive and negative samples for all the TFs, whereas most other motif detection methods perform very poorly on this discrimination problem. Our learned profile-HMMs also improve false negative predictions of ChIP data.
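The first step of the abstract (finding substrings that separate bound from unbound upstream sequences) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes fixed-length substrings (k-mers), a presence/absence test, and information gain as the split criterion, which is what a standard decision tree learner would use to choose its root node. The function names and the toy sequences are hypothetical.

```python
import math


def entropy(pos, neg):
    """Binary entropy (in bits) of a group with `pos` positive and `neg` negative samples."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(kmer, positives, negatives):
    """Entropy reduction obtained by splitting the samples on 'contains kmer'."""
    pos_hit = sum(kmer in s for s in positives)
    neg_hit = sum(kmer in s for s in negatives)
    pos_miss = len(positives) - pos_hit
    neg_miss = len(negatives) - neg_hit
    n = len(positives) + len(negatives)
    before = entropy(len(positives), len(negatives))
    after = ((pos_hit + neg_hit) / n) * entropy(pos_hit, neg_hit) \
        + ((pos_miss + neg_miss) / n) * entropy(pos_miss, neg_miss)
    return before - after


def best_discriminative_kmer(positives, negatives, k=6):
    """Return the k-mer from the positive set with the highest information gain.

    Assumption: candidates are drawn only from positive sequences, and ties
    are broken alphabetically. Both choices are illustrative.
    """
    candidates = {s[i:i + k] for s in positives for i in range(len(s) - k + 1)}
    return max(sorted(candidates),
               key=lambda m: information_gain(m, positives, negatives))


# Toy data (made up for illustration, not taken from the paper).
positives = ["AACGCGTAA", "TTACGCGTC", "GGACGCGTT"]
negatives = ["AAAAAAAAA", "TTTTTTTTT", "GGGGGGGGG"]
print(best_discriminative_kmer(positives, negatives))  # prints ACGCGT
```

A full tree would apply the same selection recursively to the two resulting subsets. The clustering of similar substrings and the profile-HMM refinement described in the abstract are not covered by this sketch.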

Original language: English
Title of host publication: Proceedings - 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005
Number of pages: 7
Publication status: Published - 2005 Dec 1
Event: 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005 - Stanford, CA, United States
Duration: 2005 Aug 8 to 2005 Aug 11

Publication series

Name: Proceedings - 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005


Other: 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005
Country/Territory: United States
City: Stanford, CA

ASJC Scopus subject areas

  • General Engineering
  • General Medicine

