Suffix trees are widely used for indexing genome sequences. Although a suffix tree supports highly efficient search, it has shortcomings such as a very large size and a very long construction time. In this paper, we propose a fast parallel algorithm that constructs a disk-based suffix tree for human genome sequences. Our algorithm constructs a suffix array for part of the suffixes in the human genome sequence and then quickly converts it into a suffix tree. It outperformed the previous algorithms by Loh et al. and Barsky et al. by up to 2.09 and 3.04 times, respectively.
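The two steps named in the abstract, building a suffix array for part of the suffixes and converting it into a suffix tree, can be illustrated with a small in-memory sketch. This is not the authors' disk-based parallel algorithm: selecting the "part" by a shared prefix, the naive sorting, the stack-based conversion, and all names (`Node`, `partial_suffix_array`, `lcp_array`, `build_tree`, `find`) are assumptions made for illustration only. The input is assumed to end with a unique terminator `$`.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    depth: int             # length of the string spelled from the root to this node
    start: int             # start position of one suffix that passes through this node
    is_leaf: bool = False
    children: list = field(default_factory=list)


def partial_suffix_array(text, prefix):
    """Sorted start positions of the suffixes of `text` that begin with `prefix`."""
    starts = [i for i in range(len(text)) if text.startswith(prefix, i)]
    return sorted(starts, key=lambda i: text[i:])  # naive sort, fine for a toy input


def lcp_array(text, sa):
    """lcp[i] = common-prefix length of suffixes sa[i-1] and sa[i]; lcp[0] = 0."""
    lcp = [0] * len(sa)
    for i in range(1, len(sa)):
        a, b, n = sa[i - 1], sa[i], 0
        while a + n < len(text) and b + n < len(text) and text[a + n] == text[b + n]:
            n += 1
        lcp[i] = n
    return lcp


def build_tree(text, sa, lcp):
    """Convert a (partial) suffix array into a suffix tree in one left-to-right pass."""
    root = Node(depth=0, start=0)
    stack = [root]                      # rightmost path of the tree built so far
    for i, start in enumerate(sa):
        last = None
        while stack[-1].depth > lcp[i]:
            last = stack.pop()          # climb until we are at or above the branch point
        if stack[-1].depth < lcp[i]:
            # The branch point lies inside an edge: split it with a new internal node.
            mid = Node(depth=lcp[i], start=last.start, children=[last])
            stack[-1].children[-1] = mid
            stack.append(mid)
        leaf = Node(depth=len(text) - start, start=start, is_leaf=True)
        stack[-1].children.append(leaf)
        stack.append(leaf)
    return root


def leaves(node):
    """Suffix start positions below `node`, in left-to-right (sorted) order."""
    if node.is_leaf:
        return [node.start]
    return [s for child in node.children for s in leaves(child)]


def find(text, root, pattern):
    """Start positions of `pattern` among the indexed suffixes."""
    node, matched = root, 0
    while matched < len(pattern):
        for child in node.children:
            label = text[child.start + node.depth: child.start + child.depth]
            k = min(len(label), len(pattern) - matched)
            if label[:k] == pattern[matched:matched + k]:
                node, matched = child, matched + k
                break
        else:
            return []                   # no edge continues the pattern
    return sorted(leaves(node))


text = "ACGTACGA$"
sa = partial_suffix_array(text, "A")    # only the suffixes that start with "A"
tree = build_tree(text, sa, lcp_array(text, sa))
print(sa, find(text, tree, "ACG"))
```

Because each prefix group yields an independent subtree, such groups could in principle be processed by separate workers; how the paper actually divides and schedules the work is not stated in the abstract.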
Woong-Kee LOH
Gachon University
Kyoung-Soo HAN
Sungkyul University
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Woong-Kee LOH, Kyoung-Soo HAN, "A Fast Parallel Algorithm for Indexing Human Genome Sequences" in IEICE TRANSACTIONS on Information,
vol. E97-D, no. 5, pp. 1345-1348, May 2014, doi: 10.1587/transinf.E97.D.1345.
URL: https://globals.ieice.org/en_transactions/information/10.1587/transinf.E97.D.1345/_p
@ARTICLE{e97-d_5_1345,
author={Woong-Kee LOH and Kyoung-Soo HAN},
journal={IEICE TRANSACTIONS on Information},
title={A Fast Parallel Algorithm for Indexing Human Genome Sequences},
year={2014},
volume={E97-D},
number={5},
pages={1345-1348},
keywords={},
doi={10.1587/transinf.E97.D.1345},
ISSN={1745-1361},
month={May},}
TY - JOUR
TI - A Fast Parallel Algorithm for Indexing Human Genome Sequences
T2 - IEICE TRANSACTIONS on Information
SP - 1345
EP - 1348
AU - Woong-Kee LOH
AU - Kyoung-Soo HAN
PY - 2014
DO - 10.1587/transinf.E97.D.1345
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E97-D
IS - 5
JA - IEICE TRANSACTIONS on Information
Y1 - May 2014
ER -