Author Search Result

[Author] Joon-Hyuk CHANG(24hit)

1-20hit(24hit)

  • A Statistical Model-Based V/UV Decision under Background Noise Environments

    Joon-Hyuk CHANG  Nam Soo KIM  Sanjit K. MITRA  

     
    LETTER-Speech and Hearing

      Vol:
    E87-D No:12
      Page(s):
    2885-2887

    In this letter, we propose an approach to incorporate a statistical model for the voiced/unvoiced (V/UV) speech decision under background noise environments. Our approach consists of splitting the input noisy speech into two separate bands and applying a statistical model for each band. We compute and compare the likelihood ratio (LR) for each band based on the statistical model and estimated noise statistics for the V/UV decision. According to the simulation test, the proposed V/UV decision performs better than the selectable mode vocoder (SMV) V/UV decision algorithm, particularly in clean and white noise environments.

  • A Statistical Model-Based Speech Enhancement Using Acoustic Noise Classification for Robust Speech Communication

    Jae-Hun CHOI  Joon-Hyuk CHANG  

     
    LETTER-Multimedia Systems for Communications

      Vol:
    E95-B No:7
      Page(s):
    2513-2516

    In this paper, we present a speech enhancement technique based on ambient noise classification that incorporates the Gaussian mixture model (GMM). The principal parameters of the statistical model-based speech enhancement algorithm, such as the weighting parameter in the decision-directed (DD) method and the long-term smoothing parameter of the noise estimation, are set according to the classified context to ensure the best performance under each noise type. For real-time context awareness, the noise classification is performed on a frame-by-frame basis using the GMM with the soft decision framework. The speech absence probability (SAP) is used in detecting the speech absence periods and updating the likelihood of the GMM.

  • A Novel Approach to a Robust a Priori SNR Estimator in Speech Enhancement

    Yun-Sik PARK  Joon-Hyuk CHANG  

     
    LETTER-Multimedia Systems for Communications

      Vol:
    E90-B No:8
      Page(s):
    2182-2185

    This paper presents a novel approach to single-channel speech enhancement in noisy environments. Widely adopted noise reduction techniques based on spectral subtraction are generally expressed as a spectral gain depending on the signal-to-noise ratio (SNR) [1]-[4]. As an estimation method for the SNR, the well-known decision-directed (DD) estimator of Ephraim and Malah is known to efficiently reduce musical noise in noise frames, but the a priori SNR, which is a crucial parameter of the spectral gain, follows the a posteriori SNR with a delay of one frame in speech frames [5]. Therefore, the noise suppression gain computed from the delayed a priori SNR estimated by the DD algorithm matches the previous frame rather than the current one, which degrades noise reduction performance during abrupt transient parts. To overcome this artifact, we propose a computationally simple but effective speech enhancement technique based on a sigmoid-type function to adaptively determine the weighting factor of the DD algorithm. The proposed approach avoids the delay problem of the a priori SNR while maintaining the advantage of the DD algorithm. The performance of the proposed enhancement algorithm is evaluated by objective and subjective tests under various environments and yields better results than the conventional DD-based approach.
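The abstract above does not give the exact sigmoid function, so the following is only a minimal sketch of the idea. The DD update itself is the standard one; the mapping from SNR change to weighting factor (`adaptive_alpha`, and its `alpha_min`, `alpha_max`, `slope`, `center` parameters) is a hypothetical choice made for illustration, not the authors' formula.

```python
import math


def sigmoid(x):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def adaptive_alpha(gamma_now, gamma_prev,
                   alpha_min=0.5, alpha_max=0.98, slope=1.0, center=3.0):
    """Hypothetical sigmoid mapping (an assumption, not from the paper):
    a large frame-to-frame change in the a posteriori SNR lowers the
    weighting factor so the estimate reacts faster to transients."""
    change = abs(gamma_now - gamma_prev)
    return alpha_min + (alpha_max - alpha_min) * sigmoid(-slope * (change - center))


def dd_a_priori_snr(prev_clean_power, noise_power, gamma_now, alpha):
    """Standard decision-directed a priori SNR estimate:
    alpha * |A(n-1)|^2 / lambda_d + (1 - alpha) * max(gamma(n) - 1, 0)."""
    return (alpha * prev_clean_power / noise_power
            + (1.0 - alpha) * max(gamma_now - 1.0, 0.0))


if __name__ == "__main__":
    # Stationary frame: weight stays near alpha_max (smooth estimate).
    a_steady = adaptive_alpha(gamma_now=1.2, gamma_prev=1.0)
    # Abrupt onset: weight drops toward alpha_min (less one-frame delay).
    a_onset = adaptive_alpha(gamma_now=20.0, gamma_prev=1.0)
    print(a_steady, a_onset)
    print(dd_a_priori_snr(2.0, 1.0, 20.0, a_onset))
```

With a fixed weight close to 1, the estimate is dominated by the previous frame, which is the one-frame delay the abstract describes; lowering the weight at transients shifts it toward the current a posteriori SNR.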

  • Speech Reinforcement Based on Soft Decision under Far-End Noise Environments

    Jae-Hun CHOI  Woo-Sang PARK  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

      Vol:
    E92-A No:8
      Page(s):
    2116-2119

    In this letter, we propose a speech reinforcement technique based on soft decision under both the far-end and near-end noise environments. We amplify the estimated clean speech signal at the far-end based on the estimated ambient noise spectrum at the near-end, as opposed to reinforcing the noisy far-end speech signal, so that it can be heard more intelligibly in far-end noisy environments. To obtain an effective reinforcement technique, we adopt the soft decision scheme incorporating a speech absence probability (SAP) in the frequency-dependent signal-to-noise ratio (SNR) recovery method, where the clean speech spectrum is estimated and the reinforcement gain is inherently derived and modified within the unified framework. The performance of the proposed method is evaluated by subjective testing under various noisy environments and shows an improvement over previous approaches.

  • A Support Vector Machine-Based Gender Identification Using Speech Signal

    Kye-Hwan LEE  Sang-Ick KANG  Deok-Hwan KIM  Joon-Hyuk CHANG  

     
    LETTER-Fundamental Theories for Communications

      Vol:
    E91-B No:10
      Page(s):
    3326-3329

    We propose an effective voice-based gender identification method using a support vector machine (SVM). The SVM is a binary classification algorithm that classifies two groups by finding an arbitrary nonlinear boundary in a feature space and is known to yield high classification performance. In the present work, we compare the identification performance of the SVM with that of a Gaussian mixture model (GMM)-based method using the mel frequency cepstral coefficients (MFCC). A novel feature fusion scheme based on a combination of the MFCC and the fundamental frequency is proposed with the aim of improving the performance of gender identification. Experimental results demonstrate that the gender identification performance using the SVM is significantly better than that of the GMM-based scheme. Moreover, the performance is substantially improved when the proposed feature fusion technique is applied.

  • Acoustic Environment Classification Based on SMV Speech Codec Parameters for Context-Aware Mobile Phone

    Kye-Hwan LEE  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

      Vol:
    E92-D No:7
      Page(s):
    1491-1495

    In this letter, an acoustic environment classification algorithm based on the 3GPP2 selectable mode vocoder (SMV) is proposed for context-aware mobile phones. Classification of the acoustic environment is performed based on a Gaussian mixture model (GMM) using coding parameters of the SMV extracted directly from the encoding process of the acoustic input data in the mobile phone. Experimental results show that the proposed environment classification algorithm provides superior performance over a conventional method in various acoustic environments.

  • Multiband Vector Quantization Based on Inner Product for Wideband Speech Coding

    Joon-Hyuk CHANG  Sanjit K. MITRA  

     
    LETTER-Speech and Hearing

      Vol:
    E88-D No:11
      Page(s):
    2606-2608

    This paper describes a multiband vector quantization (VQ) technique based on inner product for wideband speech coding at 16 kb/s. Our approach consists of splitting the input speech into two separate bands and then applying an independent coding scheme for each band. A code excited linear prediction (CELP) coder is used in the lower band while a transform based coding strategy is applied in the higher band. The spectral components in the higher frequency band are represented by a set of modulated lapped transform (MLT) coefficients. The higher frequency band is divided into three subbands, and the MLT coefficients construct a vector for each subband. Specifically, for the VQ of these vectors, an inner product-based distance measure is proposed as a new strategy. The proposed 16 kb/s coder with the inner-product based distortion measure achieves better performance than the 48 kb/s ITU-T G.722 in subjective quality tests.

  • A Support Vector Machine-Based Voice Activity Detection Employing Effective Feature Vectors

    Q-Haing JO  Yun-Sik PARK  Kye-Hwan LEE  Joon-Hyuk CHANG  

     
    LETTER-Multimedia Systems for Communications

      Vol:
    E91-B No:6
      Page(s):
    2090-2093

    In this letter, we propose effective feature vectors to improve the performance of voice activity detection (VAD) employing a support vector machine (SVM), which is known to incorporate an optimized nonlinear decision over two different classes. To extract the effective feature vectors, we present a novel scheme that combines the a posteriori SNR, a priori SNR, and predicted SNR, widely adopted in conventional statistical model-based VAD.

  • Improved Speech-Presence Uncertainty Estimation Based on Spectral Gradient for Global Soft Decision-Based Speech Enhancement

    Jong-Woong KIM  Joon-Hyuk CHANG  Sang Won NAM  Dong Kook KIM  Jong Won SHIN  

     
    LETTER-Speech and Hearing

      Vol:
    E96-A No:10
      Page(s):
    2025-2028

    In this paper, we propose a speech-presence uncertainty estimation to improve the global soft decision-based speech enhancement technique by using the spectral gradient scheme. The conventional soft decision-based speech enhancement technique uses a fixed ratio (Q) of the a priori speech-presence and speech-absence probabilities to derive the speech-absence probability (SAP). In contrast, we adaptively change Q according to the spectral gradient between the current and past frames as well as the status of the voice activity in the previous two frames. As a result, distinct values of Q are assigned to each frequency in each frame in order to improve the performance of the SAP by robustly tracking the a priori information of the speech presence over time.
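To make the role of Q concrete, here is a small sketch of how a speech-absence probability follows from Q and a likelihood ratio by Bayes' rule. It assumes the complex Gaussian model commonly used in statistical model-based enhancement; the function names and the symbols `gamma` (a posteriori SNR) and `xi` (a priori SNR) are illustrative, and the paper's adaptive rule for choosing Q is not reproduced here.

```python
import math


def likelihood_ratio(gamma, xi):
    """Likelihood ratio of speech presence vs. absence for one frequency
    bin under a complex Gaussian model (an assumed model here):
    Lambda = exp(gamma * xi / (1 + xi)) / (1 + xi)."""
    return math.exp(gamma * xi / (1.0 + xi)) / (1.0 + xi)


def speech_absence_probability(gamma, xi, q):
    """SAP from Bayes' rule, where q = P(speech present) / P(speech absent):
    P(H0 | Y) = 1 / (1 + q * Lambda)."""
    return 1.0 / (1.0 + q * likelihood_ratio(gamma, xi))


if __name__ == "__main__":
    # Same observation, different prior ratios: a larger q lowers the SAP.
    for q in (0.25, 1.0, 4.0):
        print(q, speech_absence_probability(gamma=2.0, xi=1.0, q=q))
```

A fixed Q applies the same prior to every bin and frame; letting Q vary per frequency and frame, as the abstract proposes, changes how strongly the same likelihood ratio pulls the SAP toward speech presence.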

  • Speech Enhancement Based on Adaptive Noise Power Estimation Using Spectral Difference

    Jae-Hun CHOI  Joon-Hyuk CHANG  Dong Kook KIM  Suhyun KIM  

     
    LETTER-Speech and Hearing

      Vol:
    E94-A No:10
      Page(s):
    2031-2034

    In this paper, we propose a spectral difference approach for noise power estimation in speech enhancement. The noise power estimate is given by recursively averaging past spectral power values using a smoothing parameter based on the current observation. The smoothing parameter in time and frequency is adjusted by the spectral difference between consecutive frames that can efficiently characterize noise variation. Specifically, we propose an effective technique based on a sigmoid-type function in order to adaptively determine the smoothing parameter based on the spectral difference. Compared to a conventional method, the proposed noise estimate is computationally efficient and able to effectively follow noise changes under various noise conditions.
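The recursive averaging described above can be sketched as follows. The recursion is the usual first-order smoother; the log-spectral difference measure, the sigmoid constants (`a_min`, `a_max`, `slope`, `center`), and the direction of the mapping (a large difference is treated as likely speech, so the estimate is held) are all assumptions for illustration, since the abstract does not specify them.

```python
import math


def smoothing_param(spec_diff, a_min=0.7, a_max=0.99, slope=2.0, center=1.0):
    """Hypothetical sigmoid-type mapping from spectral difference to the
    smoothing parameter; larger differences give slower noise updates."""
    return a_min + (a_max - a_min) / (1.0 + math.exp(-slope * (spec_diff - center)))


def update_noise_power(noise_prev, power_prev, power_now, floor=1e-12):
    """One frame of recursive averaging per frequency bin:
    lambda(n) = a * lambda(n-1) + (1 - a) * |Y(n)|^2,
    with a chosen per bin from the frame-to-frame spectral difference."""
    updated = []
    for lam, p_prev, p_now in zip(noise_prev, power_prev, power_now):
        # Assumed difference measure: absolute log-power change between frames.
        diff = abs(math.log(max(p_now, floor)) - math.log(max(p_prev, floor)))
        a = smoothing_param(diff)
        updated.append(a * lam + (1.0 - a) * p_now)
    return updated


if __name__ == "__main__":
    noise = [1.0, 1.0]
    prev_frame = [1.0, 1.0]
    this_frame = [1.0, 100.0]  # second bin jumps, e.g. a speech onset
    print(update_noise_power(noise, prev_frame, this_frame))
```

Because the update needs only one difference, one sigmoid, and one multiply-add per bin, it stays cheap, which is consistent with the computational efficiency claimed in the abstract.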

  • Improved Global Soft Decision Using Smoothed Global Likelihood Ratio for Speech Enhancement

    Joon-Hyuk CHANG  Dong Seok JEONG  Nam Soo KIM  Sangki KANG  

     
    LETTER-Multimedia Systems for Communications

      Vol:
    E90-B No:8
      Page(s):
    2186-2189

    In this letter, we propose an improved global soft decision for noisy speech enhancement. From an investigation of statistical model-based speech enhancement, it is found that the global soft decision has a fundamental drawback at the tail regions of speech signals. For that reason, we propose a new solution based on a smoothed likelihood ratio for the global soft decision. The performance of the proposed method is evaluated by subjective tests under various environments and shows better results compared with our previous work.

  • Frame Splitting Scheme for Error-Robust Audio Streaming over Packet-Switching Networks

    Jong Kyu KIM  Jung Su KIM  Hwan Sik YUN  Joon-Hyuk CHANG  Nam Soo KIM  

     
    LETTER-Multimedia Systems for Communications

      Vol:
    E91-B No:2
      Page(s):
    677-680

    This letter presents a novel frame splitting scheme for error-robust audio streaming over packet-switching networks. In our approach to perceptual audio coding, an audio frame is split into several subframes based on the network configuration such that each packet can be decoded independently at the receiver. A subjective comparison category rating (CCR) test shows that our approach enhances the quality of the decoded audio signal in lossy packet-switching network environments.

  • Distorted Speech Rejection for Automatic Speech Recognition in Wireless Communication

    Joon-Hyuk CHANG  Nam Soo KIM  

     
    LETTER-Speech and Hearing

      Vol:
    E87-D No:7
      Page(s):
    1978-1981

    This letter introduces a pre-rejection technique for wireless channel distorted speech with application to automatic speech recognition (ASR). Based on an analysis of distorted speech signals over a wireless communication channel, we propose a method to reject the channel-distorted speech with a small computational load. A number of simulation results show that the pre-rejection algorithm enhances the robustness of speech recognition.

  • Discriminative Weight Training for Support Vector Machine-Based Speech/Music Classification in 3GPP2 SMV Codec

    Sang-Kyun KIM  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

      Vol:
    E93-A No:1
      Page(s):
    316-319

    In this study, discriminative weight training is applied to a support vector machine (SVM)-based speech/music classification for a 3GPP2 selectable mode vocoder (SMV). In the proposed approach, the speech/music decision rule is derived by the SVM by incorporating optimally weighted features derived from the SMV based on a minimum classification error (MCE) method. This method differs from that of the previous work in that different weights are assigned to each feature of the SMV, which is a novel process. According to the experimental results, the proposed approach is effective for speech/music classification using the SVM.

  • Improvement of SVM-Based Speech/Music Classification Using Adaptive Kernel Technique

    Chungsoo LIM  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

      Vol:
    E95-D No:3
      Page(s):
    888-891

    In this paper, we propose a way to improve the classification performance of support vector machines (SVMs), especially for speech and music frames within a selectable mode vocoder (SMV) framework. A myriad of techniques have been proposed for SVMs, and most of them are employed during the training phase of SVMs. Instead, the proposed algorithm is applied during the test phase and works with existing schemes. The proposed algorithm modifies a kernel parameter in the decision function of SVMs to alter SVM decisions for better classification accuracy based on the previous outputs of SVMs. Since speech and music frames exhibit strong inter-frame correlation, the outputs of SVMs can guide the kernel parameter modification. Our experimental results show that the proposed algorithm has the potential for adaptively tuning classifications of support vector machines for better performance.

  • Speech Enhancement: New Approaches to Soft Decision

    Joon-Hyuk CHANG  Nam Soo KIM  

     
    PAPER-Speech and Hearing

      Vol:
    E84-D No:9
      Page(s):
    1231-1240

    In this paper, we propose new approaches to speech enhancement based on soft decision. In order to enhance the statistical reliability in estimating speech activity, we introduce the concept of a global speech absence probability (GSAP). First, we compute the conventional speech absence probability (SAP) and then modify it according to the newly proposed GSAP. The modification is made in such a way that the SAP takes the same value as the GSAP in the case of speech absence, while it is maintained at its original value when speech is present. Moreover, to improve the performance of the SAPs at voice tails (transition periods from speech to silence), we revise the SAPs using a hang-over scheme based on the hidden Markov model (HMM). In addition, we suggest a robust noise update algorithm in which the noise power is estimated not only in the periods of speech absence but also during speech activity based on soft decision. Also, to improve the SAP determination and noise update routines, we present a new signal-to-noise ratio (SNR) concept, which is called the predicted SNR in this paper. Moreover, we demonstrate that the discrete cosine transform (DCT) enhances the accuracy of the SAP estimation. A number of tests show that the proposed method, which is called the speech enhancement based on soft decision (SESD) algorithm, yields better performance than the conventional approaches.

  • Improved Global Soft Decision Incorporating Second-Order Conditional MAP in Speech Enhancement

    Jong-Mo KUM  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

      Vol:
    E93-D No:6
      Page(s):
    1652-1655

    In this paper, we propose a novel method based on the second-order conditional maximum a posteriori (CMAP) to improve the performance of the global soft decision in speech enhancement. The conventional global soft decision scheme is found through investigation to have a disadvantage in that the global speech absence probability (GSAP) in that scheme is adjusted by a fixed parameter, which could be a restrictive assumption in the consecutive occurrences of speech frames. To address this problem, we devise a method to incorporate the second-order CMAP in determining the GSAP, which is clearly different from the previous approach in that not only current observation but also the speech activity decisions of the previous two frames are exploited. Performances of the proposed method are evaluated by a number of tests in various environments and show better results than previous work.

  • Speech Enhancement Based on Data-Driven Residual Gain Estimation

    Yu Gwang JIN  Nam Soo KIM  Joon-Hyuk CHANG  

     
    LETTER-Speech and Hearing

      Vol:
    E94-D No:12
      Page(s):
    2537-2540

    In this letter, we propose a novel speech enhancement algorithm based on data-driven residual gain estimation. The entire system consists of two stages. At the first stage, a conventional speech enhancement algorithm enhances the input signal while estimating several signal-to-noise ratio (SNR)-related parameters. The residual gain, which is estimated by a data-driven method, is applied to further enhance the signal at the second stage. A number of experimental results show that the proposed speech enhancement algorithm outperforms the conventional speech enhancement technique based on soft decision and the data-driven approach using SNR grid look-up table.

  • Efficient Speech Reinforcement Based on Low-Bit-Rate Speech Coding Parameters

    Jae-Hun CHOI  Joon-Hyuk CHANG  Seong-Ro LEE  

     
    LETTER-Speech and Hearing

      Vol:
    E93-A No:9
      Page(s):
    1684-1687

    In this paper, a novel approach to speech reinforcement in a low-bit-rate speech coder under ambient noise environments is proposed. The excitation vector of ambient noise is efficiently obtained at the near-end and then combined with the excitation signal of the far-end for a suitable reinforcement gain within the G.729 CS-ACELP Annex B framework. The present approach thus clearly differs from previous approaches in that it does not require an additional arithmetic step such as the discrete Fourier transform (DFT). Experimental results indicate that the proposed method performs better than, or at least comparably to, conventional approaches with a lower computational burden.

  • Online Sparse Volterra System Identification Using Projections onto Weighted l1 Balls

    Tae-Ho JUNG  Jung-Hee KIM  Joon-Hyuk CHANG  Sang Won NAM  

     
    PAPER

      Vol:
    E96-A No:10
      Page(s):
    1980-1983

    In this paper, online sparse Volterra system identification is proposed. For that purpose, the conventional adaptive projection-based algorithm with weighted l1 balls (APWL1) is revisited for nonlinear system identification, whereby the linear-in-parameters nature of Volterra systems is utilized. Sparsity-aware recursive least squares (RLS)-based algorithms require higher computational complexity and, owing to their long memory, show faster convergence and lower steady-state error in time-invariant cases. Compared with them, the proposed approach yields better tracking capability in time-varying cases due to short-term data dependence in updating the weight. Also, when N is the number of sparse Volterra kernels and q is the number of input vectors involved in updating the weight, the proposed algorithm requires O(qN) multiplication complexity and O(N log_2 N) sorting-operation complexity. Furthermore, sparsity-aware least mean-squares and affine projection-based algorithms are also tested.

