Byeonghak KIM Murray LOEW David K. HAN Hanseok KO
To date, many studies have employed clustering for the classification of unlabeled data. Deep separate clustering applies several deep learning models to conventional clustering algorithms to more clearly separate the distribution of the clusters. In this paper, we employ a convolutional autoencoder to learn the features of input images. Following this, k-means clustering is conducted using the encoded layer features learned by the convolutional autoencoder. A center loss function is then added to aggregate the data points into clusters to increase the intra-cluster homogeneity. Finally, we calculate and increase the inter-cluster separability. We combine all loss functions into a single global objective function. Our new deep clustering method surpasses the performance of existing clustering approaches when compared in experiments under the same conditions.
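The center loss term described above can be illustrated with a minimal sketch. It assumes one common form of center loss (half the mean squared distance between each encoded feature and its assigned cluster center); the function name and the plain-list data layout are hypothetical, and the abstract does not state the exact formulation used in the paper.

```python
def center_loss(features, assignments, centers):
    """Half the mean squared distance from each feature to its cluster center.

    features:    list of feature vectors (e.g. encoder outputs), assumed layout
    assignments: cluster index for each feature (e.g. from k-means)
    centers:     list of cluster center vectors
    """
    if not features:
        raise ValueError("features must not be empty")
    total = 0.0
    for feature, k in zip(features, assignments):
        # Squared Euclidean distance to the assigned center
        total += sum((a - b) ** 2 for a, b in zip(feature, centers[k]))
    return 0.5 * total / len(features)
```

Minimizing this quantity pulls points toward their centers, which is what increases intra-cluster homogeneity; a separate term would be needed for inter-cluster separability.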
Kyungho CHO Byungha AHN Hanseok KO
While a standard Kalman filter (or α-β filter) is commonly used for target tracking, it is well known that filter performance often degrades when the target maneuvers heavily. The usual way to accommodate maneuvers is to adaptively adjust the filter gain. Our aim is to reduce the tracking error during substantial maneuvering using a combination of non-traditional "intelligent" algorithms. In particular, we propose an effective gain control using fuzzy rules, followed by position error compensation via a neural network. A Monte-Carlo simulation is performed for various target paths of representative maneuvers employing the proposed algorithm. The simulation results indicate a significant improvement over conventional methods in terms of stability, accuracy, and computational load.
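For context, a fixed-gain α-β filter can be sketched as follows. This is only the conventional baseline, not the proposed fuzzy gain control or neural-network compensation; the function name, default gains, and scalar-position setup are assumptions made for illustration.

```python
def alpha_beta_track(measurements, dt=1.0, alpha=0.85, beta=0.005):
    """Run a fixed-gain alpha-beta filter over scalar position measurements."""
    if not measurements:
        return []
    x, v = measurements[0], 0.0  # initial position and velocity estimates
    estimates = []
    for z in measurements:
        x_pred = x + dt * v          # predict position with constant velocity
        residual = z - x_pred        # innovation: measurement minus prediction
        x = x_pred + alpha * residual
        v = v + (beta / dt) * residual
        estimates.append(x)
    return estimates
```

An adaptive scheme would replace the constant `alpha` and `beta` with values chosen at each step, for example from the size of the residual, which is the point at which a gain-control rule would plug in.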
This paper focuses on introducing a highly efficient data structure that effectively captures the multipath phenomenon needed for accurate propagation modeling and fast propagation prediction. We propose a new object representation procedure called circular representation (CR) of microwave masking objects such as buildings, to improve over the conventional vector representation (VR) form in fast ray tracing. The proposed CR encapsulates a building with a circle represented by a center point and radius. In this configuration, the CR essentially functions as the basic building block for higher geometric structures, enhancing the efficiency more than when VR is used alone. Only one CR is needed to represent one building while several wall vectors are required in VR. As a result, a significant computational reduction can be achieved in ray tracing by the proposed method. Our aim is to show CR as a solution to achieving efficiency in data structuring for effective propagation prediction modeling. We show that the computational load is reduced by the proposed method. Further reduction is shown attainable using the hierarchical structure of CR in a deterministic propagation model, undergoing ray tracing. The simulation results indicate that the proposed CR scheme reduces the computational load proportionally to the number of potential scattering objects while its hierarchical structure achieves about 50% of computational load reduction in the hierarchical octree structure.
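To see why a circle is cheap to test against a ray, consider the following minimal 2-D sketch. It assumes a building is stored as a center point and radius, as the abstract describes; the function name and tuple layout are hypothetical, and the paper's actual data structure may differ.

```python
import math

def ray_hits_circle(origin, direction, center, radius):
    """Return True if a 2-D ray intersects a circle (center, radius)."""
    ox, oy = origin
    dx, dy = direction
    cx, cy = center
    norm = math.hypot(dx, dy)
    if norm == 0:
        raise ValueError("direction must be non-zero")
    dx, dy = dx / norm, dy / norm
    fx, fy = cx - ox, cy - oy          # vector from ray origin to circle center
    t = fx * dx + fy * dy              # projection of the center onto the ray
    dist2 = fx * fx + fy * fy
    if t < 0:
        # Center lies behind the origin: hit only if the origin is inside
        return dist2 <= radius * radius
    # Squared perpendicular distance from the center to the ray
    return dist2 - t * t <= radius * radius
```

One such test per building replaces one segment-intersection test per wall vector, which is consistent with the computational reduction the abstract claims.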
Hyunhak SHIN Bonhwa KU Wooyoung HONG Hanseok KO
Most conventional research on target motion analysis (TMA) based on least squares (LS) has focused on performing asymptotically unbiased estimation with inaccurate measurements. However, such research may often yield inaccurate estimation results when only a small set of measurement data is used. In this paper, we propose an accurate TMA method even with a small set of bearing measurements. First, a subset of measurements is selected by a random sample consensus (RANSAC) algorithm. Then, LS is applied to the selected subset to estimate target motion. Finally, to increase accuracy, the target motion estimation is refined through a bias compensation algorithm. Simulated results verify the effectiveness of the proposed method.
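The select-then-fit pattern in the first two steps can be sketched on a toy problem. The example below uses a straight-line fit as a stand-in for bearing-only target motion estimation and omits the bias compensation step entirely; all names, the threshold, and the iteration count are assumptions for illustration.

```python
import random

def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx

def ransac_then_ls(points, threshold=0.5, iterations=200, seed=0):
    """Pick the largest consensus subset with RANSAC, then apply LS to it."""
    rng = random.Random(seed)
    best = []
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical candidate line; skip in this toy model
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best):
            best = inliers
    if len(best) < 2:
        raise ValueError("no consensus set found")
    return fit_line(best), best
```

The least-squares step sees only the consensus subset, so gross outliers in a small measurement set do not distort the final estimate.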
This paper proposes an intelligent image interpolation method based on the Cubic Hermite procedure for improving digital images. Image interpolation has been used to create high-resolution effects in digitized image data, providing sharpness in high frequency image data and smoothness in low frequency image data. Most interpolation techniques proposed in the past are centered on determining pixel values using the relationship between neighboring points. As one of the more prevalent interpolation techniques, the Cubic Hermite procedure attains the interpolation with a 3rd order polynomial fit using derivatives of points and adaptive smoothness parameters. Cubic Hermite features many forms of a curved shape, which effectively reduce the problems inherent in interpolations. This paper focuses on a method that intelligently determines the derivatives and adaptive smoothness parameters to effectively contain the interpolation error, achieving significantly improved images. Derivatives are determined by taking a weighted sum of the neighboring points, with a weighting function that decreases as the intensity difference between neighboring points increases. The smoothness parameter is obtained by training on an exemplar image to fit the Cubic Hermite function such that the interpolation error is minimized at each interpolating point. The simulations indicate that the proposed method achieves improved image results over conventional methods in terms of error and image quality performance.
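The 3rd order polynomial fit underlying the procedure can be sketched with the standard cubic Hermite basis. This shows only the textbook interpolant between two samples; the paper's intelligent choice of derivatives and smoothness parameters is not reproduced, and the function name is hypothetical.

```python
def cubic_hermite(p0, p1, m0, m1, t):
    """Interpolate between values p0 and p1 with end derivatives m0 and m1.

    t is the normalized position in [0, 1] between the two samples.
    """
    t2 = t * t
    t3 = t2 * t
    h00 = 2 * t3 - 3 * t2 + 1   # weight of p0
    h10 = t3 - 2 * t2 + t       # weight of m0
    h01 = -2 * t3 + 3 * t2      # weight of p1
    h11 = t3 - t2               # weight of m1
    return h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1
```

The derivatives `m0` and `m1` control the shape of the curve between the samples, which is why the way they are chosen has such a large effect on interpolation error.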
Daehun KIM Bonhwa KU David K. HAN Hanseok KO
In this paper, an algorithm is proposed for license plate recognition (LPR) in video traffic surveillance applications. In an LPR system, the primary steps are license plate detection and character segmentation. However, in practice, false alarms often occur due to images of vehicle parts that are similar in appearance to a license plate or detection rate degradation due to local illumination changes. To alleviate these difficulties, the proposed license plate segmentation employs an adaptive binarization using a superpixel-based local contrast measurement. From the binarization, we apply a set of rules to a sequence of characters in a sub-image region to determine whether it is part of a license plate. This process is effective in reducing false alarms and improving detection rates. Our experimental results demonstrate a significant improvement over conventional methods.
Hyunjin CHO Junseok LIM Bonhwa KU Myoungjun CHEONG Iksu SEO Hanseok KO Wooyoung HONG
Passive SONAR receives a mixed signal that combines continuous and discrete line-component spectrum signals. The conventional algorithms, DEMON and LOFAR, each target one type of signal but do not consider the other type, which is also present in practical environments. Thus, when features from the two types of signals are present at the same time, the analysis results may cause confusion. In this paper, we propose an integrated analysis algorithm for underwater signals using the modulation spectrogram domain. The proposed domain presents the visual difference between the different types of signal and can therefore prevent confusion that might otherwise occur. Moreover, the proposed algorithm is more efficient than multiband DEMON in terms of computational complexity, while in colored ambient noise environments it performs similarly to conventional DEMON and LOFAR. We validate the proposed algorithm through experiments with synthesized signals and actual underwater recordings.
Jaeyong JU Murray LOEW Bonhwa KU Hanseok KO
This paper presents a method for registering retinal images. Retinal image registration is crucial for the diagnoses and treatments of various eye conditions and diseases such as myopia and diabetic retinopathy. Retinal image registration is challenging because the images have non-uniform contrasts and intensity distributions, as well as having large homogeneous non-vascular regions. This paper provides a new retinal image registration method by effectively combining expectation maximization principal component analysis based mutual information (EMPCA-MI) with salient features. Experimental results show that our method is more efficient and robust than the conventional EMPCA-MI method.
Bayesian combining of confidence measures is proposed for speech recognition. Bayesian combining is achieved by the estimation of joint pdf of confidence feature vector in correct and incorrect hypothesis classes. In addition, the adaptation of a confidence score using the pdf is presented. The proposed methods reduced the classification error rate by 18% from the conventional single feature based confidence scoring method in isolated word Out-of-Vocabulary rejection test.
Yoonjae LEE Seokyeong JEONG Hanseok KO
A residual acoustic echo cancellation method that employs the masking property is proposed to enhance the speech quality of hands-free communication devices in an automobile environment. The conventional masking property is employed for speech enhancement using the masking threshold of the desired clean speech signal. In this Letter, either the near-end speech or residual noise is selected as the desired signal according to the double-talk detector. Then, the residual echo signal is masked by the desired signal (masker). Experiments confirm the effectiveness of the proposed method by deriving the echo return loss enhancement and by examining speech waveforms and spectrograms.
Dubok PARK David K. HAN Hanseok KO
This paper proposes a novel framework for enhancing underwater images using an optical imaging model and non-local means denoising. The proposed approach adjusts the color balance using biasness correction and the average luminance. Scene visibility is then enhanced based on an underwater optical imaging model. The increase in noise in the enhanced images is alleviated by non-local means (NLM) denoising. The final enhanced images are characterized by improved visibility while retaining color fidelity and reducing noise. The proposed method requires neither specialized hardware nor prior knowledge of the underwater environment.
Suwon SHON David K. HAN Jounghoon BEH Hanseok KO
This paper describes a method for estimating Direction Of Arrival (DOA) of multiple sound sources in full azimuth with three microphones. Estimating DOA with paired microphone arrays creates imaginary sound sources because of time delay of arrival (TDOA) being identical between real and imaginary sources. Imaginary sound sources can create chronic problems in multiple Sound Source Localization (SSL), because they can be localized as real sound sources. Our proposed approach is based on the observation that each microphone array creates imaginary sound sources, but the DOA of imaginary sources may be different depending on the orientation of the paired microphone array. With the fact that a real source would always be localized in the same direction regardless of the array orientation, we can suppress the imaginary sound sources by minimum filtering based on Steered Response Power – Phase Transform (SRP-PHAT) method. A set of experiments conducted in a real noisy environment showed that the proposed method was accurate in localizing multiple sound sources.
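The ambiguity described above can be illustrated with a minimal far-field sketch: a single microphone pair yields the same TDOA for a source and for its mirror image about the array axis, but the mirror direction depends on the array orientation. The function names, the far-field model, and the 343 m/s sound speed are assumptions; the SRP-PHAT minimum filtering itself is not shown.

```python
import math

def pair_tdoa(source_az_deg, array_az_deg, spacing_m, sound_speed=343.0):
    """Far-field TDOA in seconds for a microphone pair at a given orientation."""
    angle = math.radians(source_az_deg - array_az_deg)
    return spacing_m * math.cos(angle) / sound_speed

def mirror_azimuth(source_az_deg, array_az_deg):
    """Azimuth of the imaginary source that produces the same TDOA."""
    return (2 * array_az_deg - source_az_deg) % 360
```

For a real source at 60 degrees, a pair oriented at 0 degrees places the imaginary source at 300 degrees, while a pair oriented at 120 degrees places it at 180 degrees. Only the real direction is common to both pairs, which is what makes suppressing the imaginary sources possible.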
Yoonjae LEE Kihyeon KIM Jongsung YOON Hanseok KO
A simple and novel residual acoustic echo cancellation method that employs binary masking is proposed to enhance the speech quality of hands-free communication in an automobile environment. In general, the W-disjoint orthogonality assumption is used for blind source separation using multi-microphones. However, in this Letter, it is utilized to mask the residual echo component in the time-frequency domain using a single microphone. The experimental results confirm the effectiveness of the proposed method in terms of the echo return loss enhancement and speech enhancement.
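Binary masking in the time-frequency domain can be sketched as follows. The decision rule used here (keep a bin when the estimated desired-signal magnitude exceeds the estimated residual echo magnitude) is an assumption for illustration, as are the function names and the nested-list spectrogram layout; the abstract does not specify the paper's rule.

```python
def binary_mask(desired_mag, echo_mag):
    """Build a 0/1 mask per time-frequency bin (assumed dominance rule)."""
    return [
        [1 if d > e else 0 for d, e in zip(d_row, e_row)]
        for d_row, e_row in zip(desired_mag, echo_mag)
    ]

def apply_mask(spectrum, mask):
    """Zero out the masked bins of a (possibly complex) spectrogram."""
    return [
        [s * m for s, m in zip(s_row, m_row)]
        for s_row, m_row in zip(spectrum, mask)
    ]
```

Under W-disjoint orthogonality each bin is dominated by a single source, so discarding echo-dominated bins removes the residual echo with little damage to the desired speech.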
This paper proposes an algorithm that adaptively estimates time-varying noise variance used in Kalman filtering for real-time speech signal enhancement. In the speech signal contaminated by white noise, the spectral components except dominant ones in high frequency band are expected to reflect the noise energy. Our approach is first to find the dominant energy bands over speech spectrum using LPC. We then calculate the average value of the actual spectral components over the high frequency region excluding the dominant energy bands and use it as the noise variance. The resulting noise variance estimate is then applied to Kalman filtering to suppress the background noise. Experimental results indicate that the proposed approach achieves a significant improvement in terms of speech enhancement over those of the conventional Kalman filtering that uses the average noise power over silence interval only. As a refinement of our results, we employ multiple-Kalman filtering with multiple noise models and improve the intelligibility.
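The averaging step can be sketched as follows, assuming the dominant energy bands have already been identified (the paper does this with LPC, which is not reproduced here). The function name, the boolean flag list, and the `high_start` bin index are hypothetical.

```python
def estimate_noise_variance(power_spectrum, dominant, high_start):
    """Average power over high-frequency bins that are not dominant.

    power_spectrum: power per frequency bin for one frame
    dominant:       True for bins inside a dominant (speech) energy band
    high_start:     index of the first bin of the high-frequency region
    """
    selected = [
        p for k, p in enumerate(power_spectrum)
        if k >= high_start and not dominant[k]
    ]
    if not selected:
        raise ValueError("no non-dominant high-frequency bins available")
    return sum(selected) / len(selected)
```

Because the estimate is recomputed for every frame, it can follow time-varying noise, unlike an average taken over silence intervals only.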
This paper concerns recognizing 3-dimensional objects using a proposed multi-layer block model. In particular, we aim to achieve desirable recognition performance while restricting the computational load to a low level using a 3-step feature extraction procedure. An input image is first precisely partitioned into hierarchical layers of blocks in the form of base blocks and overlapping blocks. The hierarchical blocks are merged into a matrix, with which abundant local feature information can be obtained. The extracted local features are then employed by kernel-based support vector machines in a tournament for enhanced recognition performance while keeping the feature space low-dimensional. The simulation results show that the proposed feature extraction method reduces the computational load by over 80% and preserves a stable recognition rate under varying illumination and noise conditions.
This paper describes how the image sequences taken by a stationary video camera may be effectively processed to detect and track moving objects against a stationary background in real-time. Our approach is first to isolate the moving objects in image sequences via a modified adaptive background estimation method and then perform token tracking of multiple objects based on features extracted from the processed image sequences. In feature based multiple object tracking, the most prominent tracking issues are track initialization, data association, occlusions due to traffic congestion, and object maneuvering. While there is limited past work addressing these problems, most relevant tracking systems proposed in the past focus independently on either "occlusion" or "data association" only. In this paper, we propose the KL-IMMPDA (Kanade Lucas-Interacting Multiple Model Probabilistic Data Association) filtering approach for multiple-object tracking to collectively address the key issues. The proposed method essentially employs optical flow measurements for both detection and track initialization while the KL-IMMPDA filter is used to accept or reject measurements that belong to other objects. The data association performed by the proposed KL-IMMPDA results in an effective tracking scheme, which is robust to partial occlusions and image clutter of object maneuvering. The simulation results show a significant performance improvement for tracking multiple objects under occlusion and maneuvering, when compared to other conventional trackers such as the Kalman filter.
Seongkyu MUN Minkyu SHIN Suwon SHON Wooil KIM David K. HAN Hanseok KO
Recent acoustic event classification research has focused on training suitable filters to represent acoustic events. However, due to limited availability of target event databases and linearity of conventional filters, there is still room for improving performance. By exploiting the non-linear modeling of deep neural networks (DNNs) and their ability to learn beyond pre-trained environments, this letter proposes a DNN-based feature extraction scheme for the classification of acoustic events. The effectiveness and robustness to noise of the proposed method are demonstrated using a database of indoor surveillance environments.
Kyungdeuk KO Jaihyun PARK David K. HAN Hanseok KO
In-class species classification based on animal sounds is a highly challenging task even with the latest deep learning techniques applied. The difficulty of distinguishing the species is further compounded when the number of species is large within the same class. This paper presents a novel approach for fine categorization of animal species based on their sounds by using pre-trained CNNs and a new self-attention module well-suited for acoustic signals. The proposed method is shown to be effective, as it achieves an average species accuracy of 98.37% and a minimum species accuracy of 94.38%, the highest among the competing baselines, which include CNNs without self-attention and CNNs with CBAM, FAM, and CFAM but without pre-training.
Hanseok KO Ilkwang LEE Jihyo LEE David HAN
In this paper, we develop an image-based tracking algorithm for multiple vehicles that performs effective detection and tracking of moving objects under adverse environmental conditions. In particular, we employ a low-cost commercial off-the-shelf IR or CCD image sensor for generating continuous images of multiple moving vehicles. The motion in image sequences is first detected by adaptive background estimation and then tracked by Kalman filtering with the attribute information being updated by data association. Upon applying a modified Retinex procedure as preprocessing to reduce the illumination effects, we proceed with a two-step tracking algorithm. The first step achieves blob grouping and then judiciously selects the true targets for tracking using data association through information registration. In the second step, all detected blobs go through a validation for screening as well as for occlusion reasoning, and those found pertinent to the real object survive to become the 'Object' state for stable tracking. The results of representative tests confirm its effectiveness in vehicle tracking under both daylight and nighttime conditions while resolving occlusions.
Jaeyong JU Taeyup SONG Bonhwa KU Hanseok KO
Key frame based video summarization has emerged as an important task for efficient video data management. This paper proposes a novel technique for key frame extraction based on chaos theory and color information. By applying chaos theory, a large content change between frames becomes more chaos-like and results in a more complex fractal trajectory in phase space. By exploiting the fractality measured in the phase space between frames, it is possible to evaluate inter-frame content changes invariant to the effects of fades and illumination change. In addition to this measure, a color histogram-based measure is used to complement the chaos-based measure, which is sensitive to changes in camera/object motion. By comparing the last key frame with the current frame based on the proposed frame difference measure combining these two complementary measures, the key frames are robustly selected even in the presence of video fades, changes of illumination, and camera/object motion. The experimental results demonstrate its effectiveness with significant improvement over the conventional method.
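The color histogram-based half of the frame difference measure can be sketched as follows. This is a generic normalized histogram distance on a single channel, not the paper's exact measure, and the chaos-based measure is not shown; the function names, bin count, and flat pixel-list input are assumptions.

```python
def histogram(pixels, bins=8, max_value=256):
    """Normalized histogram of integer pixel values in [0, max_value)."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    return [c / len(pixels) for c in counts]

def histogram_difference(frame_a, frame_b, bins=8):
    """Half the L1 distance between two histograms, in the range [0, 1]."""
    ha = histogram(frame_a, bins)
    hb = histogram(frame_b, bins)
    return 0.5 * sum(abs(a - b) for a, b in zip(ha, hb))
```

A key frame selector along these lines would compare the current frame with the last key frame and declare a new key frame when the combined difference exceeds a threshold.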