A Fast Parallel Algorithm for Indexing Human Genome Sequences

Woong-Kee LOH; Kyoung-Soo HAN

doi:10.1587/transinf.E97.D.1345

Summary:

A suffix tree is widely adopted for indexing genome sequences. While supporting highly efficient search, the suffix tree has a few shortcomings such as very large size and very long construction time. In this paper, we propose a very fast parallel algorithm to construct a disk-based suffix tree for human genome sequences. Our algorithm constructs a suffix array for part of the suffixes in the human genome sequence and then converts it into a suffix tree very quickly. It outperformed the previous algorithms by Loh et al. and Barsky et al. by up to 2.09 and 3.04 times, respectively.
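The summary's central step, building a suffix array for part of the suffixes and then converting it into a suffix tree, can be illustrated with a small in-memory sketch. This is not the authors' implementation: the paper's algorithm is parallel and disk-based, and the summary does not say how the suffixes are divided. Partitioning by a shared leading prefix, the naive sort, the `$` terminator, and every name below (`partition_suffixes`, `partial_suffix_array`, `suffix_array_to_tree`, `Node`, `leaves`) are assumptions made only for illustration.

```python
from collections import defaultdict


def partition_suffixes(text, prefix_len=1):
    """Group suffix start positions by their first prefix_len characters (assumed scheme)."""
    groups = defaultdict(list)
    for i in range(len(text)):
        groups[text[i:i + prefix_len]].append(i)
    return groups


def partial_suffix_array(text, positions):
    """Sort one group of suffixes lexicographically (naive sort, illustration only)."""
    return sorted(positions, key=lambda i: text[i:])


def lcp(text, i, j):
    """Length of the longest common prefix of the suffixes starting at i and j."""
    k = 0
    while i + k < len(text) and j + k < len(text) and text[i + k] == text[j + k]:
        k += 1
    return k


class Node:
    def __init__(self, depth, suffix=None):
        self.depth = depth    # length of the path label from the root
        self.suffix = suffix  # start position for a leaf, None for an internal node
        self.children = []


def suffix_array_to_tree(text, sa):
    """Convert a sorted suffix list into a subtree, keeping the rightmost path on a stack.

    Assumes text ends with a unique terminator so that no suffix is a prefix of another.
    """
    root = Node(0)
    stack = [root]
    for idx, pos in enumerate(sa):
        h = lcp(text, sa[idx - 1], pos) if idx > 0 else 0
        last = None
        while stack[-1].depth > h:
            last = stack.pop()
        if stack[-1].depth < h:
            # Split the edge: a new internal node at depth h adopts the popped subtree.
            mid = Node(h)
            mid.children.append(last)
            stack[-1].children[-1] = mid
            stack.append(mid)
        leaf = Node(len(text) - pos, pos)
        stack[-1].children.append(leaf)
        stack.append(leaf)
    return root


def leaves(node):
    """Return leaf suffix positions in left-to-right order."""
    if not node.children:
        return [node.suffix]
    return [p for child in node.children for p in leaves(child)]


if __name__ == "__main__":
    text = "banana$"  # toy input; '$' is the assumed unique terminator
    for prefix, positions in sorted(partition_suffixes(text).items()):
        sa = partial_suffix_array(text, positions)
        subtree = suffix_array_to_tree(text, sa)
        print(prefix, sa, leaves(subtree))
```

Each prefix group is processed without reference to the others, which is what would make a scheme like this amenable to parallel workers; the sketch runs the groups sequentially and keeps everything in memory.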

Publication: IEICE TRANSACTIONS on Information Vol.E97-D No.5 pp.1345-1348

Publication Date: 2014/05/01

Online ISSN: 1745-1361

DOI: 10.1587/transinf.E97.D.1345

Type of Manuscript: LETTER

Category: Data Engineering, Web Information Systems

Authors

Woong-Kee LOH
Gachon University
Kyoung-Soo HAN
Sungkyul University

Keyword

human genome sequences, suffix tree, parallel algorithm, suffix array, disk-based index

Cite this

Woong-Kee LOH, Kyoung-Soo HAN, "A Fast Parallel Algorithm for Indexing Human Genome Sequences" in IEICE TRANSACTIONS on Information, vol. E97-D, no. 5, pp. 1345-1348, May 2014, doi: 10.1587/transinf.E97.D.1345.
URL: https://globals.ieice.org/en_transactions/information/10.1587/transinf.E97.D.1345/_p

@ARTICLE{e97-d_5_1345,
author={Woong-Kee LOH and Kyoung-Soo HAN},
journal={IEICE TRANSACTIONS on Information},
title={A Fast Parallel Algorithm for Indexing Human Genome Sequences},
year={2014},
volume={E97-D},
number={5},
pages={1345--1348},
abstract={A suffix tree is widely adopted for indexing genome sequences. While supporting highly efficient search, the suffix tree has a few shortcomings such as very large size and very long construction time. In this paper, we propose a very fast parallel algorithm to construct a disk-based suffix tree for human genome sequences. Our algorithm constructs a suffix array for part of the suffixes in the human genome sequence and then converts it into a suffix tree very quickly. It outperformed the previous algorithms by Loh et al. and Barsky et al. by up to 2.09 and 3.04 times, respectively.},
keywords={human genome sequences, suffix tree, parallel algorithm, suffix array, disk-based index},
doi={10.1587/transinf.E97.D.1345},
ISSN={1745-1361},
month={May}
}

TY - JOUR
TI - A Fast Parallel Algorithm for Indexing Human Genome Sequences
T2 - IEICE TRANSACTIONS on Information
SP - 1345
EP - 1348
AU - Woong-Kee LOH
AU - Kyoung-Soo HAN
PY - 2014
DO - 10.1587/transinf.E97.D.1345
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E97-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2014
AB - A suffix tree is widely adopted for indexing genome sequences. While supporting highly efficient search, the suffix tree has a few shortcomings such as very large size and very long construction time. In this paper, we propose a very fast parallel algorithm to construct a disk-based suffix tree for human genome sequences. Our algorithm constructs a suffix array for part of the suffixes in the human genome sequence and then converts it into a suffix tree very quickly. It outperformed the previous algorithms by Loh et al. and Barsky et al. by up to 2.09 and 3.04 times, respectively.
ER -