One of the most important measures of a word clustering method's effectiveness is how broadly it can be applied across languages. This paper evaluates the applicability of a new word clustering method to English and Japanese, using word sets compiled from technical summaries. The method employs an iterative clustering routine that increases the number of clustered words. The evaluation is therefore carried out as a function of the number of iterations of the clustering routine, from two aspects: (a) clustering characteristics, determined from the number of clustered words, the number of clusters formed, etc., and (b) performance, determined from the average clustering ratio and the average cluster uniformity. The two languages show similar results for both clustering characteristics and performance, which indicates that the method is applicable to both English and Japanese. The evaluation also shows that about fifty percent of the target words can be clustered in fewer than five iterations of the clustering routine.
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Yoshitaka FUJIWARA, "Applicability of Word Clustering to the English and Japanese Languages" in IEICE TRANSACTIONS on transactions,
vol. E72-E, no. 10, pp. 1149-1156, October 1989.
URL: https://globals.ieice.org/en_transactions/transactions/10.1587/e72-e_10_1149/_p
@ARTICLE{e72-e_10_1149,
author={Yoshitaka FUJIWARA},
journal={IEICE TRANSACTIONS on transactions},
title={Applicability of Word Clustering to the English and Japanese Languages},
year={1989},
volume={E72-E},
number={10},
pages={1149-1156},
month={October},}
TY - JOUR
TI - Applicability of Word Clustering to the English and Japanese Languages
T2 - IEICE TRANSACTIONS on transactions
SP - 1149
EP - 1156
AU - Yoshitaka FUJIWARA
PY - 1989
DO -
JO - IEICE TRANSACTIONS on transactions
SN -
VL - E72-E
IS - 10
JA - IEICE TRANSACTIONS on transactions
Y1 - October 1989
ER -