Jiafeng MAO Qing YU Kiyoharu AIZAWA
Well-annotated datasets are crucial to the training of object detectors. However, producing finely annotated datasets for object detection is extremely labor-intensive, so crowdsourcing is often used to create them, and the resulting datasets tend to contain incorrect annotations such as inaccurately localized bounding boxes. In this study, we highlight the problem of object detection with noisy bounding box annotations and show that these noisy annotations harm the performance of deep neural networks. To solve this problem, we further propose a framework that allows the network to correct the noisy datasets by alternating refinement. The experimental results demonstrate that our proposed framework can significantly alleviate the influence of noise on model performance.
Wei HAN Xiongwei ZHANG Gang MIN Xingyu ZHOU Meng SUN
In this letter, we explore joint optimization of a perceptual gain function and deep neural networks (DNNs) for a single-channel speech enhancement task. A DNN architecture is proposed that incorporates the masking properties of the human auditory system to make the residual noise inaudible. This new DNN architecture directly trains a perceptual gain function, which is used to estimate the magnitude spectrum of clean speech from noisy speech features. Experimental results demonstrate that the proposed speech enhancement approach achieves significant improvements over the baselines when tested with TIMIT sentences corrupted by various types of noise, regardless of whether the noise conditions are included in the training set.
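The gain-based estimation step mentioned above can be illustrated with a minimal sketch. It is not the perceptual gain function of the letter, which is learned by a DNN and accounts for auditory masking; a simple Wiener-style gain is swapped in as a stand-in. The names `wiener_style_gain` and `apply_gain`, the `floor` parameter, and all numeric values are assumptions made for illustration.

```python
def wiener_style_gain(noisy_power, noise_power, floor=0.05):
    """Per-bin gain (Y - N) / Y, clipped below at `floor`.

    Stand-in for a learned perceptual gain; `floor` is an assumed
    lower bound that limits over-suppression.
    """
    gains = []
    for y, n in zip(noisy_power, noise_power):
        g = (y - n) / y if y > 0.0 else 0.0
        gains.append(max(g, floor))
    return gains


def apply_gain(noisy_magnitude, gains):
    """Estimate the clean magnitude spectrum as gain times noisy magnitude."""
    return [g * m for g, m in zip(gains, noisy_magnitude)]


noisy_power = [4.0, 1.0]   # |Y|^2 per frequency bin (made-up values)
noise_power = [1.0, 1.0]   # estimated noise power per bin (made-up values)
gains = wiener_style_gain(noisy_power, noise_power)
clean_mag = apply_gain([2.0, 1.0], gains)
```

In this toy case the first bin is dominated by speech and is only mildly attenuated, while the second bin is all noise and falls to the floor.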
Akimitsu DOI Takao HINAMOTO Wu-Sheng LU
For two-dimensional IIR digital filters described by the Fornasini-Marchesini second model, the problem of jointly optimizing high-order error feedback and realization to minimize the effects of roundoff noise at the filter output subject to l2-scaling constraints is investigated. The problem at hand is converted into an unconstrained optimization problem by using linear-algebraic techniques. The unconstrained optimization problem is then solved iteratively by applying an efficient quasi-Newton algorithm with closed-form formulas for key gradient evaluation. Finally, a numerical example is presented to illustrate the validity and effectiveness of the proposed technique.
Maximizing network lifetime and optimizing aggregate system utility are important but usually conflicting goals in wireless multi-hop networks. To address this trade-off, we present a matrix game-theoretic cross-layer optimization formulation that jointly maximizes these diverse objectives in such networks with network coding. To this end, we introduce a cross-layer formulation of general network utility maximization (NUM) that accommodates routing, scheduling, and stream control from different layers in the coded networks. Specifically, for the scheduling problem and the objective function involved, we develop a matrix game in which the strategy sets of the players correspond to hyperlinks and transmission modes, and design multiple payoffs specific to lifetime and system utility, respectively. In particular, thanks to the inherent merit that a matrix game can be solved with mathematical programming, our cross-layer programming formulation benefits from both game-based and NUM-based approaches at the same time by combining the programming model for the matrix game with that for the other layers in a consistent framework. Finally, our numerical experiments quantitatively exemplify the possible performance trade-offs with respect to the two variants developed on the multiple objectives in question, while qualitatively exhibiting the differences between the framework and other related works.
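As a rough illustration of what solving a matrix game means, the sketch below handles the smallest possible case: a 2x2 zero-sum game with a single payoff, solved in closed form (the result a linear program would return for this size). It is far simpler than the multi-payoff game over hyperlinks and transmission modes described above; the function name `solve_2x2_zero_sum` and the payoff values are hypothetical.

```python
from fractions import Fraction


def solve_2x2_zero_sum(payoff):
    """Return (p, q, value) for a 2x2 zero-sum game.

    payoff[i][j] is the row player's payoff; p and q are the
    probabilities that the row and column players pick index 0.
    """
    (a, b), (c, d) = [[Fraction(x) for x in row] for row in payoff]
    row_mins = [min(a, b), min(c, d)]
    col_maxs = [max(a, c), max(b, d)]
    maximin, minimax = max(row_mins), min(col_maxs)
    if maximin == minimax:
        # Saddle point: both players use a pure strategy.
        p = Fraction(1 if row_mins[0] == maximin else 0)
        q = Fraction(1 if col_maxs[0] == minimax else 0)
        return p, q, maximin
    # No saddle point: each player mixes so the opponent is indifferent.
    denom = a - b - c + d
    return (d - c) / denom, (d - b) / denom, (a * d - b * c) / denom


# Made-up payoffs with no saddle point.
p, q, value = solve_2x2_zero_sum([[3, -1], [-2, 4]])
```

Exact fractions are used so that the indifference conditions can be checked without floating-point tolerance.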
Youhua FU Wei-Ping ZHU Chen LIU Feng LU Hua-An ZHAO
This paper presents a joint linear processing scheme for two-hop and half-duplex distributed amplify-and-forward (AF) relaying networks with one source, one destination and multiple relays, each having multiple antennas. By using the minimum mean-square error (MMSE) criterion and the Wiener filter principle, the joint relay and destination design with perfect channel state information (CSI) is first formulated as an optimization problem with respect to the relay precoding matrix under the constraint of a total relay transmit power. The constrained optimization with an objective to design the relay block-diagonal matrix is then simplified to an equivalent problem with scalar optimization variables. Next, it is revealed that the scalar-version optimization is convex when the total relay power or the second-hop SNR (signal-to-noise ratio) is above a certain threshold. The underlying optimization problem, which is non-convex in general, is solved by complementary geometric programming (CGP). The proposed joint relay and destination design with perfect CSI is also extended to practical systems where only the channel mean and covariance matrix are available, leading to a robust processing scheme. Finally, Monte Carlo simulations are undertaken to demonstrate the superior MSE (mean-square error) and SER (symbol error rate) performance of the proposed scheme over the existing relaying method in the case of relatively large second-hop SNR.
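The MMSE criterion and Wiener filter principle referred to above can be sketched in their simplest form: a real-valued scalar link y = h*x + n with uncorrelated signal and noise. This toy case is an assumption chosen for illustration and is not the matrix-valued relay and destination design of the paper; the function names and numeric values are hypothetical.

```python
def wiener_scalar(h, signal_var, noise_var):
    """MMSE weight w and resulting MSE for the model y = h*x + n."""
    denom = h * h * signal_var + noise_var
    w = h * signal_var / denom
    mse = signal_var * noise_var / denom
    return w, mse


def mse_of(w, h, signal_var, noise_var):
    """E[(x - w*y)^2] for an arbitrary weight w, with x and n uncorrelated."""
    return (1.0 - w * h) ** 2 * signal_var + w * w * noise_var


# Made-up channel gain and variances.
w, mse = wiener_scalar(h=2.0, signal_var=1.0, noise_var=4.0)
```

Evaluating `mse_of` at weights slightly above or below `w` gives a larger error, which is the sense in which the Wiener weight is optimal.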
Dong YANG Paul DIXON Sadaoki FURUI
This paper proposes a new hybrid method for machine transliteration. Our method is based on combining a newly proposed two-step conditional random field (CRF) method and the well-known joint source channel model (JSCM). The contributions of this paper are as follows: (1) A two-step CRF model for machine transliteration is proposed. The first CRF segments a character string of an input word into chunks and the second one converts each chunk into a character in the target language. (2) A joint optimization method of the two-step CRF model and a fast decoding algorithm are also proposed. Our experiments show that the joint optimization of the two-step CRF model works as well as or even better than the JSCM, and the fast decoding algorithm significantly decreases the decoding time. (3) A rapid development method based on a weighted finite state transducer (WFST) framework for the JSCM is proposed. (4) The combination of the proposed two-step CRF model and JSCM outperforms the state-of-the-art result in terms of top-1 accuracy.
Koichiro BAN Masaaki KATAYAMA Takaya YAMAZATO Akira OGAWA
We study the joint optimization problem of a transmitter with multiple transmit antennas and a receiver with multiple receive antennas in a narrow-band communication system. We discuss the problem of designing a pre-filter at the transmitter, a post-filter at the receiver, and a bit allocation pattern to multiple symbols in the sense of minimizing the average bit error rate. With the optimized filters and the bit allocation, we could realize high efficiency and high data rate in band-limited channels.
Yoshiaki ASAKAWA Preeti RAO Hidetoshi SEKINE
This paper describes modifications to a previously proposed 8-kb/s 4-ms-delay CELP speech coding algorithm with a view to improving the speech quality while maintaining low delay and only moderately increasing complexity. The modifications are intended to improve the effectiveness of interframe pitch lag prediction and the sub-optimality level of the excitation coding to the backward adapted synthesis filter by using delayed decision and joint optimization techniques. Results of subjective listening tests using Japanese speech indicate that the coded speech quality is significantly superior to that of the 8-kb/s VSELP coder which has a 20-ms delay. A method that reduces the computational complexity of closed-loop 3-tap pitch prediction with no perceptible degradation in speech quality is proposed, based on representing the pitch-tap vector as the product of a scalar pitch gain and a normalized shape codevector.
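The gain-shape representation of the pitch-tap vector mentioned above can be sketched as follows. The abstract does not say how the shape codevector is normalized, so unit Euclidean norm is assumed here; the function names and the tap values are hypothetical, and no codebook search is shown.

```python
import math


def gain_shape_split(taps):
    """Split a tap vector into (gain, shape) with a unit-norm shape.

    Unit Euclidean norm is an assumed normalization convention.
    """
    gain = math.sqrt(sum(t * t for t in taps))
    if gain == 0.0:
        return 0.0, [0.0] * len(taps)
    return gain, [t / gain for t in taps]


def gain_shape_merge(gain, shape):
    """Rebuild the tap vector as scalar gain times shape vector."""
    return [gain * s for s in shape]


taps = [0.3, 0.6, 0.2]                 # made-up 3-tap pitch predictor
gain, shape = gain_shape_split(taps)
rebuilt = gain_shape_merge(gain, shape)
```

Separating the two parts means a search only has to compare a small set of shape vectors and then choose one scalar, rather than searching over all three taps jointly.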