Adaptive seeds tame genomic sequence comparison

Szymon M. Kiełbasa, Raymond Wan, Kengo Sato, Paul Horton, Martin C. Frith

Research output: Article, peer-reviewed

778 citations (Scopus)

Abstract

The main way of analyzing biological sequences is by comparing and aligning them to each other. It remains difficult, however, to compare modern multi-billion-base DNA data sets. The difficulty is caused by the nonuniform (oligo)nucleotide composition of these sequences, rather than their size per se. To solve this problem, we modified the standard seed-and-extend approach (e.g., BLAST) to use adaptive seeds. Adaptive seeds are matches that are chosen based on their rareness, instead of using fixed-length matches. This method guarantees that the number of matches, and thus the running time, increases linearly, instead of quadratically, with sequence length. LAST, our open source implementation of adaptive seeds, enables fast and sensitive comparison of large sequences with arbitrarily nonuniform composition.
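The idea of choosing seeds by rareness rather than by fixed length can be illustrated with a small sketch. This is not LAST's implementation, which the abstract does not describe; it is a naive Python illustration under stated assumptions. The function names `count_occurrences` and `adaptive_seeds` and the parameter `max_occurrences` are hypothetical, and occurrences are counted by brute-force substring search, which would be far too slow for real genomes. Starting at each query position, the match is lengthened until it occurs at most `max_occurrences` times in the reference, so common (low-complexity) regions yield longer seeds and rare regions yield shorter ones.

```python
def count_occurrences(reference: str, pattern: str) -> int:
    """Count possibly overlapping occurrences of pattern in reference."""
    count = 0
    start = reference.find(pattern)
    while start != -1:
        count += 1
        start = reference.find(pattern, start + 1)
    return count


def adaptive_seeds(reference: str, query: str, max_occurrences: int = 1):
    """Return (query_start, seed_length, occurrences) tuples.

    Assumption for illustration: a seed is the shortest match starting at
    a query position that occurs at most max_occurrences times (and at
    least once) in the reference. Positions with no such match are skipped.
    """
    seeds = []
    for start in range(len(query)):
        for length in range(1, len(query) - start + 1):
            n = count_occurrences(reference, query[start:start + length])
            if n == 0:
                break  # a longer match cannot exist either
            if n <= max_occurrences:
                seeds.append((start, length, n))
                break  # rare enough: stop lengthening
    return seeds


if __name__ == "__main__":
    # Toy sequences, invented for this example.
    print(adaptive_seeds("ACGTACGTTT", "ACGTT", max_occurrences=1))
```

Because each query position contributes at most one seed, and each seed has at most `max_occurrences` reference hits, the total number of seed hits in this sketch is bounded by the query length times `max_occurrences`, which mirrors the linear growth the abstract describes.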

Original language: English
Pages (from-to): 487-493
Number of pages: 7
Journal: Genome Research
Volume: 21
Issue number: 3
DOI
Publication status: Published - Mar 2011
Externally published: Yes

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)
