Hyeon-Gyu KIM Woo-Lam KANG Yoon-Joon LEE Myoung-Ho KIM
In this paper, we propose a predicate indexing method which handles equality and inequality tests separately. Our method uses a hash table for the equality test and a balanced binary search tree for the inequality test. This separation reduces the height of the search tree and the number of comparisons per tree node, as well as the cost of tree rebalancing. We compared our method with the IBS-tree, which is one of the popular indexing methods suitable for data stream processing. Our experimental results show that the proposed method provides better insertion and search performance than the IBS-tree.
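The split between equality and inequality tests can be illustrated with a minimal sketch. It is not the authors' implementation: the class and method names are hypothetical, predicates are assumed to have the form `attr OP constant` on a single attribute, and a sorted list searched with `bisect` stands in for the balanced binary search tree.

```python
import bisect
from collections import defaultdict


class _SortedPreds:
    """Sorted constants with parallel predicate ids (stand-in for a balanced tree)."""

    def __init__(self):
        self.keys = []
        self.ids = []

    def add(self, constant, pred_id):
        i = bisect.bisect_right(self.keys, constant)
        self.keys.insert(i, constant)
        self.ids.insert(i, pred_id)


class PredicateIndex:
    """Toy index: equality predicates in a hash table, inequalities in sorted lists."""

    def __init__(self):
        self.eq = defaultdict(set)   # constant -> ids of `attr == constant`
        self.lower = _SortedPreds()  # predicates `attr > constant`
        self.upper = _SortedPreds()  # predicates `attr < constant`

    def insert(self, pred_id, op, constant):
        if op == "==":
            self.eq[constant].add(pred_id)
        elif op == ">":
            self.lower.add(constant, pred_id)
        elif op == "<":
            self.upper.add(constant, pred_id)
        else:
            raise ValueError(f"unsupported operator: {op}")

    def match(self, value):
        """Return the ids of all predicates satisfied by `value`."""
        hits = set(self.eq.get(value, ()))
        # `attr > c` holds for every constant strictly below `value`.
        cut = bisect.bisect_left(self.lower.keys, value)
        hits.update(self.lower.ids[:cut])
        # `attr < c` holds for every constant strictly above `value`.
        cut = bisect.bisect_right(self.upper.keys, value)
        hits.update(self.upper.ids[cut:])
        return hits
```

In this sketch an equality predicate never touches the sorted structure, which is the property the abstract credits for the smaller tree and cheaper rebalancing.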
Hyeon-Gyu KIM Woo-Lam KANG Myoung-Ho KIM
Bursty and out-of-order tuple arrivals complicate the process of determining contents and boundaries of sliding windows. To process windows over such streams efficiently, we need to address two issues regarding fast tuple insertion and disorder control. In this paper, we focus on these issues to process sliding windows efficiently over disordered data streams.
Deokmin HAAM Hyeon-Gyu KIM Myoung-Ho KIM
This paper presents a filtering method for efficient face image retrieval over large-volume face databases. The proposed method employs a new face image descriptor, called a cell-orientation vector (COV). It has a simple form: a 72-dimensional vector of integers from 0 to 8. Despite its simplicity, it achieves high accuracy and efficiency. Our experimental results show that the proposed method based on COVs provides better performance than a recent approach based on identity-based quantization in terms of both accuracy and efficiency.
Junsu KIM Kyong-Ha LEE Myoung-Ho KIM
With the rapid increase in the number of applications as well as the size of data, multi-query processing on the MapReduce framework has gained much attention. Meanwhile, there has been much interest in skyline query processing due to its power in multi-criteria decision making and analysis. Recently, there have been attempts to optimize multi-query processing in MapReduce. However, they are not suited to processing multiple skyline queries efficiently, and they also require modifications of the Hadoop internals. In this paper, we propose an efficient method for processing multi-skyline queries with MapReduce without any modification of the Hadoop internals. Through various experiments, we show that our approach outperforms previous studies by orders of magnitude.
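For readers unfamiliar with skyline queries, the sketch below shows what a single skyline computes and how it can be split into a local phase and a merge phase, in the spirit of map and reduce. It is plain Python rather than Hadoop code, it handles one query only, and the function names and the "smaller is better" convention are assumptions; it is not the multi-query method proposed in the paper.

```python
def dominates(p, q):
    """True if p is no worse than q in every dimension and better in at least one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Return the points that no other point dominates (smaller is better)."""
    result = []
    for p in points:
        if any(dominates(s, p) for s in result):
            continue
        # Drop earlier candidates that the new point dominates.
        result = [s for s in result if not dominates(p, s)]
        result.append(p)
    return result


def partitioned_skyline(points, num_partitions=2):
    """Compute local skylines per partition, then the skyline of their union."""
    parts = [points[i::num_partitions] for i in range(num_partitions)]
    local = [skyline(part) for part in parts]        # "map" side
    merged = [p for part in local for p in part]
    return skyline(merged)                           # "reduce" side


points = [(1, 9), (3, 3), (5, 1), (4, 4), (6, 6), (2, 8)]
result = sorted(partitioned_skyline(points))
```

The two-phase form is correct because a point dominated inside its own partition is also dominated globally, so discarding it early never changes the final skyline.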
Tae-Hyung KWON Hyeon-Gyu KIM Myoung-Ho KIM Jin-Hyun SON
A multiple stream join is one of the most important but costly operations in ubiquitous streaming services. In this paper, we propose an improved and practical algorithm for joining multiple streams, called AMJoin, which improves multiple join performance by guaranteeing the detection of join failures in constant time. To achieve this goal, we first design a new data structure called BiHT (Bit-vector Hash Table) and then present the overall behavior of AMJoin in detail. In addition, we present various experimental results and their analyses to clarify its efficiency and practicality.
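The abstract names BiHT but does not define it. The sketch below is one plausible reading of the name, offered as an assumption rather than the paper's design: a hash table that maps each join key to a bit-vector with one bit per stream, so that a join failure (some stream has no tuple for the key) is detected with a single lookup and comparison. The class and method names are hypothetical.

```python
class BitVectorHashTable:
    """Map each join key to a bit-vector recording which streams have seen it."""

    def __init__(self, num_streams):
        self.num_streams = num_streams
        self.full_mask = (1 << num_streams) - 1  # all streams present
        self.bits = {}                           # join key -> int bit-vector

    def record(self, stream_id, key):
        """Note that a tuple with `key` has arrived on stream `stream_id`."""
        if not 0 <= stream_id < self.num_streams:
            raise ValueError(f"stream_id out of range: {stream_id}")
        self.bits[key] = self.bits.get(key, 0) | (1 << stream_id)

    def can_join(self, key):
        """Constant-time check: False means the join on `key` must fail."""
        return self.bits.get(key, 0) == self.full_mask


table = BitVectorHashTable(num_streams=3)
table.record(0, "a")
table.record(1, "a")
before = table.can_join("a")   # stream 2 has not produced key "a" yet
table.record(2, "a")
after = table.can_join("a")
```

Window expiry, which would require clearing bits as tuples leave their windows, is left out of this sketch.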
Jihwan SONG Deokmin HAAM Yoon-Joon LEE Myoung-Ho KIM
In this paper, we introduce a new sequential pattern, the Interactive User Sequence Pattern (IUSP). This pattern is useful for grouping highly interrelated users in one-way communications such as e-mail and SMS, especially when the communications include many spam users. We also propose an efficient algorithm for discovering IUSPs from massive one-way communication logs that contain only senders, receivers, and dates and times. Although the new sequential pattern violates the Apriori property, which makes discovery more difficult, the proposed algorithm shows excellent processing performance and low storage cost in experiments on a real dataset.