The hierarchical Dirichlet process (HDP) can provide a nonparametric prior for a mixture model with grouped data, where mixture components are shared across groups. However, the computational cost is generally very high in terms of both time and space complexity. Therefore, developing a method for fast inference of HDP remains a challenge. In this paper, we assume a symmetric multiprocessing (SMP) cluster, which has been widely used in recent years. To speed up the inference on an SMP cluster, we explore hybrid two-level parallelization of the Chinese restaurant franchise sampling scheme for HDP, especially focusing on the application to topic modeling. The methods we developed, Hybrid-AD-HDP and Hybrid-Diff-AD-HDP, make better use of SMP clusters, resulting in faster HDP inference. While the conventional parallel algorithms with a full message-passing interface do not benefit from using SMP clusters due to higher communication costs, the proposed hybrid parallel algorithms have lower communication costs and make better use of the computational resources.
Tsukasa OMOTO
Kobe University
Koji EGUCHI
Kobe University
Shotaro TORA
NTT Corporation
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Tsukasa OMOTO, Koji EGUCHI, Shotaro TORA, "Hybrid Parallel Inference for Hierarchical Dirichlet Processes" in IEICE TRANSACTIONS on Information,
vol. E97-D, no. 4, pp. 815-820, April 2014, doi: 10.1587/transinf.E97.D.815.
URL: https://globals.ieice.org/en_transactions/information/10.1587/transinf.E97.D.815/_p
@ARTICLE{e97-d_4_815,
author={Tsukasa OMOTO and Koji EGUCHI and Shotaro TORA},
journal={IEICE TRANSACTIONS on Information},
title={Hybrid Parallel Inference for Hierarchical Dirichlet Processes},
year={2014},
volume={E97-D},
number={4},
pages={815--820},
abstract={The hierarchical Dirichlet process (HDP) can provide a nonparametric prior for a mixture model with grouped data, where mixture components are shared across groups. However, the computational cost is generally very high in terms of both time and space complexity. Therefore, developing a method for fast inference of HDP remains a challenge. In this paper, we assume a symmetric multiprocessing (SMP) cluster, which has been widely used in recent years. To speed up the inference on an SMP cluster, we explore hybrid two-level parallelization of the Chinese restaurant franchise sampling scheme for HDP, especially focusing on the application to topic modeling. The methods we developed, Hybrid-AD-HDP and Hybrid-Diff-AD-HDP, make better use of SMP clusters, resulting in faster HDP inference. While the conventional parallel algorithms with a full message-passing interface do not benefit from using SMP clusters due to higher communication costs, the proposed hybrid parallel algorithms have lower communication costs and make better use of the computational resources.},
keywords={},
doi={10.1587/transinf.E97.D.815},
ISSN={1745-1361},
month={April},}
TY - JOUR
TI - Hybrid Parallel Inference for Hierarchical Dirichlet Processes
T2 - IEICE TRANSACTIONS on Information
SP - 815
EP - 820
AU - Tsukasa OMOTO
AU - Koji EGUCHI
AU - Shotaro TORA
PY - 2014
DO - 10.1587/transinf.E97.D.815
JO - IEICE TRANSACTIONS on Information
SN - 1745-1361
VL - E97-D
IS - 4
JA - IEICE TRANSACTIONS on Information
Y1 - April 2014
AB - The hierarchical Dirichlet process (HDP) can provide a nonparametric prior for a mixture model with grouped data, where mixture components are shared across groups. However, the computational cost is generally very high in terms of both time and space complexity. Therefore, developing a method for fast inference of HDP remains a challenge. In this paper, we assume a symmetric multiprocessing (SMP) cluster, which has been widely used in recent years. To speed up the inference on an SMP cluster, we explore hybrid two-level parallelization of the Chinese restaurant franchise sampling scheme for HDP, especially focusing on the application to topic modeling. The methods we developed, Hybrid-AD-HDP and Hybrid-Diff-AD-HDP, make better use of SMP clusters, resulting in faster HDP inference. While the conventional parallel algorithms with a full message-passing interface do not benefit from using SMP clusters due to higher communication costs, the proposed hybrid parallel algorithms have lower communication costs and make better use of the computational resources.
ER -