Despite the benefits of the Gustafson-Kessel (GK) clustering algorithm, it becomes computationally inefficient when applied to high-dimensional data. In this letter, a parallel implementation of the GK algorithm on the GPU with CUDA is proposed. Using an optimized matrix multiplication algorithm with fast access to shared memory, the CUDA version achieved a maximum 240-fold speedup over the single-CPU version.
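To show where the matrix products in GK clustering come from, here is a minimal NumPy sketch of the standard GK distance step. It is a CPU illustration only, not the CUDA implementation from the letter; the function name `gk_distances`, the array layout, and the default volume parameter `rho = 1` are assumptions made for this example.

```python
import numpy as np

def gk_distances(X, centers, U, m=2.0, rho=None):
    """Squared Gustafson-Kessel distances from every point to every cluster.

    X: (n, d) data, centers: (c, d) cluster centers,
    U: (c, n) fuzzy memberships, m: fuzzifier.
    rho: (c,) cluster volumes (assumed to be 1 when not given).
    """
    n, d = X.shape
    c = centers.shape[0]
    rho = np.ones(c) if rho is None else np.asarray(rho, dtype=float)
    D2 = np.empty((c, n))
    for i in range(c):
        diff = X - centers[i]            # (n, d) offsets from center i
        w = U[i] ** m                    # (n,) membership weights
        # Fuzzy covariance matrix of cluster i: a (d, n) by (n, d) product.
        F = (diff * w[:, None]).T @ diff / w.sum()
        # Norm-inducing matrix with fixed volume rho[i].
        A = (rho[i] * np.linalg.det(F)) ** (1.0 / d) * np.linalg.inv(F)
        # Quadratic form diff_k^T A diff_k for every point k.
        D2[i] = np.einsum("kd,de,ke->k", diff, A, diff)
    return D2
```

The covariance product and the quadratic form both grow with the dimension `d`, which is consistent with the cost problem on high-dimensional data described above.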
In this paper, we improve the classification performance of categorical data using an Adoptive Hamming Distance. We defined the equivalent categorical values and showed how those categorical values were searched to adopt the distance. The effectiveness of the proposed method was demonstrated using various classification examples.
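As a rough illustration of the idea, the sketch below computes a Hamming distance over categorical records in which values declared equivalent count as a match. The abstract does not say how equivalent values are found, so the `equivalent` argument (a set of hand-supplied value pairs) and the function name are hypothetical and stand in for that search step.

```python
def hamming_distance(a, b, equivalent=None):
    """Number of attribute positions at which two categorical records differ.

    equivalent: hypothetical set of frozenset pairs; two values that form
    one of these pairs are treated as matching (assumption for this sketch).
    """
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    equivalent = equivalent or set()
    dist = 0
    for x, y in zip(a, b):
        if x == y or frozenset((x, y)) in equivalent:
            continue  # identical or declared equivalent: no penalty
        dist += 1
    return dist


r1 = ("red", "small", "round")
r2 = ("crimson", "small", "square")
plain = hamming_distance(r1, r2)
adjusted = hamming_distance(r1, r2, {frozenset(("red", "crimson"))})
```

Here `plain` counts both the color and shape mismatches, while `adjusted` ignores the color mismatch because the two color values were declared equivalent.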
Hyunki LIM Jaesung LEE Dae-Won KIM
We propose a multi-label feature selection method that considers feature dependencies. The proposed method circumvents the prohibitive computations by using a low-rank approximation method. The empirical results acquired by applying the proposed method to several multi-label datasets demonstrate that its performance is comparable to that of recent multi-label feature selection methods and that it reduces the computation time.
Dae-Won KIM Young-il KIM Doheon LEE Kwang Hyung LEE
In this paper, conventional validity indexes are reviewed and the shortcomings of the fuzzy cluster validation index based on inter-cluster proximity are examined. Based on these considerations, a new cluster validity index is proposed for fuzzy partitions obtained from the fuzzy c-means algorithm. The proposed validity index is defined as the average value of the relative intersections of all possible pairs of fuzzy clusters in the system. It computes the overlap between two fuzzy clusters by considering the intersection of each data point in the overlap. The optimal number of clusters is obtained by minimizing the validity index with respect to c. Experiments in which the proposed validity index and several conventional validity indexes were applied to well-known data sets highlight the superior qualities of the proposed index.
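The pairwise-overlap idea can be sketched as follows. The abstract does not give the exact formula, so this example makes two assumptions: the intersection of two fuzzy clusters at a data point is taken as the minimum of the two memberships, and the overlaps are combined by plain averaging. The function name `overlap_index` is also hypothetical.

```python
from itertools import combinations

import numpy as np

def overlap_index(U):
    """Average pairwise overlap of a fuzzy partition.

    U: (c, n) membership matrix, one row per cluster.
    Assumption: point-wise intersection = min of the two memberships.
    """
    U = np.asarray(U, dtype=float)
    pairs = list(combinations(range(U.shape[0]), 2))
    if not pairs:
        raise ValueError("need at least two clusters")
    # Mean point-wise intersection for each pair of clusters.
    overlaps = [np.minimum(U[i], U[j]).mean() for i, j in pairs]
    return float(np.mean(overlaps))


crisp = overlap_index([[1.0, 0.0], [0.0, 1.0]])
blurred = overlap_index([[0.5, 0.5], [0.5, 0.5]])
```

A well-separated partition such as the first one gives a lower value than a fully ambiguous one, so under these assumptions one would evaluate the index for each candidate c and keep the partition with the smallest value.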
Hyunki LIM Jaesung LEE Dae-Won KIM
We propose a new multi-label feature selection method that does not require the multi-label problem to be transformed into a single-label problem. Using quadratic programming, the proposed multi-label feature selection algorithm provides markedly better learning performance than conventional methods.
Bo-Yeong KANG Dae-Won KIM Qing LI
A great deal of research has been conducted to model the vagueness and uncertainty in information retrieval. One such line of research is fuzzy ranking models, which have shown superior performance in handling the uncertainty involved in the retrieval process. However, these conventional fuzzy ranking models have a limited ability to incorporate user preference when calculating the rank of documents. To address this issue, in this study we develop a new fuzzy ranking model based on user preference. Through experiments on the TREC-2 collection of Wall Street Journal documents, we show that the proposed method outperforms the conventional fuzzy ranking models.
Classification based on predictive association rules (CPAR) is a widely used associative classification method. Despite its efficiency, the analysis results obtained by CPAR are affected by missing values in the data sets, and thus it is not always possible to correctly analyze the classification results. In this letter, we improve CPAR to deal with the problem of missing data. The effectiveness of the proposed method is demonstrated using various classification examples.