Kento MATSUMOTO Sunao HARA Masanobu ABE
In this paper, we propose a new algorithm to generate Speech-like Emotional Sound (SES). Emotional expressions may be the most important factor in human communication, and speech is one of the most useful means of expressing emotions. Although speech generally conveys both emotional and linguistic information, we have undertaken the challenge of generating sounds that convey emotional information alone. We call the generated sounds “speech-like,” because the sounds do not contain any linguistic information. SES can provide another way to generate emotional response in human-computer interaction systems. To generate “speech-like” sound, we propose employing WaveNet as a sound generator conditioned only by emotional IDs. This concept is quite different from the WaveNet Vocoder, which synthesizes speech using spectrum information as an auxiliary feature. The biggest advantage of our approach is that it reduces the amount of emotional speech data necessary for training by focusing on non-linguistic information. The proposed algorithm consists of two steps. In the first step, to generate a variety of spectrum patterns that resemble human speech as closely as possible, WaveNet is trained with auxiliary mel-spectrum parameters and Emotion ID using a large amount of neutral speech. In the second step, to generate emotional expressions, WaveNet is retrained with auxiliary Emotion ID only using a small amount of emotional speech. Experimental results reveal the following: (1) the two-step training is necessary to generate the SES with high quality, and (2) it is important that the training use a large neutral speech database and spectrum information in the first step to improve the emotional expression and naturalness of SES.
Hiroshi FUJIWARA Kanaho HANJI Hiroaki YAMAMOTO
In the online removable knapsack problem, a sequence of items, each labeled with its value and its size, is given one by one. At each arrival of an item, a player has to decide whether to put it into a knapsack or to discard it. The player is also allowed to discard some of the items that are already in the knapsack. The objective is to maximize the total value of the knapsack. Iwama and Taketomi gave an optimal algorithm for the case where the value of each item is equal to its size. In this paper we consider a case with an additional constraint that the capacity of the knapsack is a positive integer N and that the sizes of items are all integral. For each positive integer N, we design an algorithm and prove its optimality. It is revealed that the competitive ratio is not monotonic with respect to N.
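The problem setting above (value equal to size, integer capacity N, integer item sizes) can be simulated in a few lines. The sketch below is only an illustration of the setting: the rule "accept the item, then discard the smallest held items until the knapsack fits" is a naive, hypothetical strategy, not the algorithm of Iwama and Taketomi and not the algorithm of this paper. The function names are assumptions made for this example. The offline optimum is computed by a subset-sum dynamic program, so that the ratio between the two values can be inspected on small inputs.

```python
from typing import List


def online_discard_smallest(sizes: List[int], capacity: int) -> int:
    """Naive online rule (hypothetical, for illustration only).

    Each arriving item is accepted; while the knapsack overflows,
    the smallest held item is discarded. Discarded items never return.
    """
    held: List[int] = []
    for size in sizes:
        if size > capacity:
            continue  # an item larger than the knapsack can never be kept
        held.append(size)
        held.sort()
        while sum(held) > capacity:
            held.pop(0)  # discard the smallest held item
    return sum(held)


def offline_optimum(sizes: List[int], capacity: int) -> int:
    """Best total size achievable when all items are known in advance."""
    reachable = [True] + [False] * capacity
    for size in sizes:
        # iterate downwards so that each item is used at most once
        for total in range(capacity, size - 1, -1):
            if reachable[total - size]:
                reachable[total] = True
    return max(t for t in range(capacity + 1) if reachable[t])


if __name__ == "__main__":
    items, n = [5, 5, 6], 10
    online = online_discard_smallest(items, n)
    best = offline_optimum(items, n)
    print(online, best, best / online)
```

On the input above the naive rule ends with a single item of size 6 while the offline optimum is 10, which shows why the discard decision needs more care than this rule provides.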
Nobuyuki SUGIO Yasutaka IGARASHI Sadayuki HONGO
Integral cryptanalysis is one of the most powerful attacks on symmetric-key block ciphers. Attackers first search for integral characteristics of a target cipher and then use them to perform a key recovery attack. Todo proposed a novel technique named the bit-based division property to find integral characteristics. Xiang et al. extended the Mixed Integer Linear Programming (MILP) method to search for integral characteristics of lightweight block ciphers based on the bit-based division property. In this paper, we apply these techniques to the symmetric-key block cipher KASUMI, which was developed by modifying MISTY1. As a result, we found new 4.5-round characteristics of KASUMI for the first time. We show that 7-round KASUMI is attackable with $2^{63}$ data and $2^{120}$ encryptions.
Bima PRIHASTO Tzu-Chiang TAI Pao-Chi CHANG Jia-Ching WANG
The recurrent neural network (RNN) has been used in audio and speech processing, such as language translation and speech recognition. Although RNN-based architectures can be applied to speech synthesis, the long computing time is still the primary concern. This research proposes a fast gated recurrent neural network, a fast RNN-based architecture, for speech synthesis based on the minimal gated unit (MGU). Our architecture removes the unit state history from some equations in the MGU. Our MGU-based architecture is about twice as fast as other MGU-based architectures, with equally good sound quality.
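For orientation, one time step of the standard MGU can be sketched as follows. This is the baseline unit only: the abstract does not state which equations drop the unit state history, so the fast variant is not reproduced here. The function and parameter names (`mgu_step`, `W_f`, `W_h`, and so on) are assumptions made for this example, and plain Python lists stand in for tensors.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W: Matrix, v: Vector) -> Vector:
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def mgu_step(x: Vector, h_prev: Vector,
             W_f: Matrix, b_f: Vector,
             W_h: Matrix, b_h: Vector) -> Vector:
    """One step of the standard minimal gated unit.

    f       = sigmoid(W_f [h_prev, x] + b_f)       (single forget gate)
    h_cand  = tanh(W_h [f * h_prev, x] + b_h)      (candidate state)
    h       = (1 - f) * h_prev + f * h_cand        (new hidden state)
    Each weight matrix has len(h_prev) rows and len(h_prev) + len(x) columns.
    """
    f = [sigmoid(a + b) for a, b in zip(matvec(W_f, h_prev + x), b_f)]
    gated = [fi * hi for fi, hi in zip(f, h_prev)]
    h_cand = [math.tanh(a + b) for a, b in zip(matvec(W_h, gated + x), b_h)]
    return [(1.0 - fi) * hp + fi * hc for fi, hp, hc in zip(f, h_prev, h_cand)]


if __name__ == "__main__":
    zeros = [[0.0] * 3 for _ in range(2)]
    print(mgu_step([0.3], [1.0, -2.0], zeros, [0.0, 0.0], zeros, [0.0, 0.0]))
```

Both the gate and the candidate state read the previous hidden state, which is the dependency a faster variant would need to reduce.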
Tetsutaro YAMADA Masato GOCHO Kei AKAMA Ryoma YATAKA Hiroshi KAMEDA
A new approach for multi-target tracking in an occlusion environment is presented. In pedestrian tracking using a video camera, pedestrians must be tracked accurately and continuously in the images. However, in a crowded environment, the conventional tracking algorithm has the problem that tracks are interrupted when pedestrians are hidden behind a foreground object. In this study, we propose an occlusion-robust tracking method that introduces a degeneration hypothesis, which relaxes the one-measurement-to-one-track constraint of the track hypothesis. When predicted trajectories approach one another, the proposed method allows one measurement to be associated with multiple trajectories based on the endpoints of the bounding box, so that track continuity is improved by using the measurement in the foreground. A numerical evaluation using MOT (Multiple Object Tracking) image data sets is performed to demonstrate the effectiveness of the proposed algorithm.
Naoto MATSUO Kazuki YOSHIDA Koji SUMITOMO Kazushige YAMANA Tetsuo TABEI
This paper reports, experimentally and theoretically, on the ambipolar conduction of λ-deoxyribonucleic acid (DNA) field effect transistors (FETs) with 450, 400, and 250 base pairs. It was found that the drain current of the p-type DNA/Si FET increased as the ratio of guanine-cytosine (GC) pairs increased, that the drain current of the n-type DNA/Si FET decreased as the ratio of adenine-thymine (AT) pairs decreased, and that the ratio of GC pairs to AT pairs was controlled by the total number of base pairs. In addition, it was found that the hole conduction mechanism of the 400 bp DNA/Si FET was polaron hopping and that its activation energy was 0.13 eV. By considering the electron affinities of adenine, thymine, guanine, and cytosine, the ambipolar characteristics of the DNA/Si FET can be understood. Holes are injected into the guanine base for a negative gate voltage, and electrons are injected into adenine, thymine, and cytosine for a positive gate voltage.
Kazuto FUKUCHI Chia-Mu YU Jun SAKUMA
We investigate a problem of finding the minimum, in which each user has a real value, and we want to estimate the minimum of these values under the local differential privacy constraint. We reveal that this problem is fundamentally difficult, and we cannot construct a consistent mechanism in the worst case. Instead of considering the worst case, we aim to construct a private mechanism whose error rate is adaptive to the easiness of estimation of the minimum. As a measure of easiness, we introduce a parameter $\alpha$ that characterizes the fatness of the minimum-side tail of the user data distribution. As a result, we reveal that the mechanism can achieve $O((\ln^6 N/\epsilon^2 N)^{1/2\alpha})$ error without knowledge of $\alpha$ and the error rate is near-optimal in the sense that any mechanism incurs $\Omega((1/\epsilon^2 N)^{1/2\alpha})$ error. Furthermore, we demonstrate that our mechanism outperforms a naive mechanism by empirical evaluations on synthetic datasets. Also, we conducted experiments on the MovieLens dataset and a purchase history dataset and demonstrate that our algorithm achieves $\tilde{O}((1/N)^{1/2\alpha})$ error adaptively to $\alpha$.
Kyungmin KIM Jiung SONG Jong Wook KWAK
We propose a novel synthetic-benchmark generation model using partial time-series regression, called the Partial-Regression-Integrated Generic Model (PRIGM). PRIGM abstracts the unique characteristics of the input sensor data into generic time-series data, confirming the generation similarity and evaluating the correctness of the synthetic benchmarks. The experimental results obtained by the proposed model with its formula verify that PRIGM preserves the time-series characteristics of empirical data in complex time-series data, with an average difference within 10.4% in terms of descriptive statistics accuracy.
At present, the application of different types of memristors in electronics is being studied in depth. Because memristors are nonlinear, a circuit containing memristors cannot be treated by classical circuit analysis. In this paper, using the Volterra series, a memristor is modeled as a nonlinear dynamic system composed of a linear dynamic system and a nonlinear static system. The nonlinear transfer function of the memristor is derived. In the complex frequency domain, the n-order complex frequency response of the memristor is established by the multiple Laplace transform, and the response of an MLC parallel circuit is taken as an example for verification. Theoretical analysis shows that the complex frequency domain analysis method transforms the problem of solving a nonlinear circuit in the time domain into n complex frequency domain analyses of a linear circuit, which provides an idea for nonlinear dynamic system analysis.
Yanjiang LIU Xianzhao XIA Jingxin ZHONG Pengfei GUO Chunsheng ZHU Zibin DAI
Side-channel analysis is one of the most investigated hardware Trojan detection approaches. However, nearly all side-channel analysis approaches require golden chips for reference, which are hard to obtain in practice. Besides, the majority of existing Trojan detection algorithms focus on data similarity and ignore Trojan misclassification during detection. In this paper, we propose a cost-sensitive, golden-chip-free hardware Trojan detection framework, which aims to minimize the probability of Trojan misclassification during detection. The post-layout simulation data of voltage variations at different process corners is utilized as a golden reference. Further, a classification algorithm based on the combination of principal component analysis and naïve Bayes is exploited to identify the existence of a hardware Trojan with minimum misclassification risk. Experimental results on an ASIC demonstrate that the proposed approach improves the detection accuracy compared with three other detection algorithms and distinguishes a Trojan occupying only 0.27% of the area even under ±15% process variations.
Takumi NISHIME Hiroshi HASHIGUCHI Naobumi MICHISHITA Hisashi MORISHITA
Platform-mounted small antennas increase dielectric loss and conductive loss and decrease the radiation efficiency. This paper proposes a novel antenna design method to improve radiation efficiency for platform-mounted small antennas by characteristic mode analysis. The proposed method uses a mapping of the modal weighting coefficient (MWC) and an infinitesimal dipole, and evaluates a metal casing of 100 mm × 55 mm × 23 mm as a platform excited by an inverted-F antenna. The simulation and measurement results show that the radiation efficiency improves from 2.5% for the single antenna to 5% for the whole system.
Da LI Yuanyuan WANG Rikuya YAMAMOTO Yukiko KAWAI Kazutoshi SUMIYA
Recently, machine learning approaches and user movement history analysis on mobile devices have attracted much attention. Generally, text data must be fed into a word embedding tool to acquire word vectors as a preprocessing step for machine learning approaches. However, it is difficult for mobile devices to afford the huge cost of high-dimensional vector calculation. Thus, a low-cost approach to user behavior and user movement history analysis should be considered. To address this issue, firstly, we convert the zip code and street house number into vectors instead of using textual address information, to reduce the cost of spatial vector calculation. Secondly, we propose a low-cost, high-performance method for calculating semantic and physical (real) distance that applies zip-code-based vectors. Finally, to verify the validity of our proposed method, we utilize US zip code data to calculate both semantic and physical distances and compare the results with those of the previous method. The experimental results show that our proposed method can significantly improve the performance of distance calculation and effectively keep the cost at a low level.
Sejin JUNG Eui-Sub KIM Junbeom YOO
Traditional safety analysis techniques have shown difficulties in incorporating dynamically changing structures of CPSs (Cyber-Physical Systems). STPA (System-Theoretic Process Analysis), one of the widely used techniques, needs to unfold and arrange all hidden structures before beginning a full-fledged analysis. This paper proposes an intermediate model, the “Information Unfolding Model (IUM),” and a process, the “Information Unfolding Process (IUP),” to unfold dynamic structures that are hidden in CPSs and thereby help analysts construct control structures in STPA thoroughly.
Shinpei HAYASHI Keisuke ASANO Motoshi SAEKI
Goal refinement is a crucial step in goal-oriented requirements analysis to create a goal model of high quality. Poor goal refinement leads to missing requirements and eliciting incorrect requirements as well as less comprehensiveness of produced goal models. This paper proposes a technique to automate detecting bad smells of goal refinement, symptoms of poor goal refinement. At first, to clarify bad smells, we asked subjects to discover poor goal refinement concretely. Based on the classification of the specified poor refinement, we defined four types of bad smells of goal refinement: Low Semantic Relation, Many Siblings, Few Siblings, and Coarse Grained Leaf, and developed two types of measures to detect them: measures on the graph structure of a goal model and semantic similarity of goal descriptions. We have implemented a supporting tool to detect bad smells and assessed its usefulness by an experiment.
Fei ZHANG Peining ZHEN Dishan JING Xiaotang TANG Hai-Bao CHEN Jie YAN
Intrusion is one of the major security issues of the Internet given the rapid growth in smart and Internet of Things (IoT) devices, and it has become important to detect attacks and raise alarms in IoT systems. In this paper, a method based on the support vector machine (SVM) and principal component analysis (PCA) is used to detect attacks in smart IoT systems. An SVM with a nonlinear scheme is used for intrusion classification, and PCA is adopted for feature selection on the training and testing datasets. Experiments on the NSL-KDD dataset show that the test accuracy of the proposed method reaches 82.2% with 16 features selected by PCA for binary classification, which is almost the same as the result obtained with all 41 features; for multi-class classification, the test accuracy reaches 78.3% with 29 features selected by PCA, compared with 79.6% without feature selection. The Denial of Service (DoS) attack detection accuracy of the proposed method is 8.8% higher than that of an existing method based on an artificial neural network.
Masayoshi YAMAMOTO Shinya SHIRAI Senanayake THILAK Jun IMAOKA Ryosuke ISHIDO Yuta OKAWAUCHI Ken NAKAHARA
Silicon carbide (SiC) power semiconductor devices are of great interest for automotive power electronics applications because the next generation of fast charging systems requires high-voltage batteries. For high-voltage battery EVs (Electric Vehicles) over 800 V, SiC power semiconductor devices are suitable for 3-phase inverters, battery chargers, and isolated DC-DC converters due to their high voltage rating and high efficiency. However, SiC MOSFETs have two characteristics that interfere with high-speed switching and high-efficiency operation in automotive power electronics systems. One is the low voltage rating of the gate-source terminal, and the other is the large internal gate resistance of the SiC MOSFET. The purpose of this work was to evaluate a proposed hybrid gate drive circuit that can ignore the internal gate resistance and maintain the gate-source terminal stability of the SiC MOSFET. It was found that the proposed hybrid gate drive circuit can achieve faster and lower-loss switching than conventional gate drive circuits by using current source gate drive characteristics. In addition, the proposed gate drive circuit can use voltage source gate drive characteristics to protect the gate-source terminals despite their low voltage rating.
Yuya HOSODA Arata KAWAMURA Youji IIGUNI
The narrow bandwidth limitation of 300-3400 Hz on the public switched telephone network results in speech quality deterioration. In this paper, we propose an artificial bandwidth extension approach that reconstructs the missing lower band of 50-300 Hz using sinusoidal synthesis based on the first formant location. Sinusoidal synthesis generates sinusoidal waves with a harmonic structure. The proposed method detects the fundamental frequency using an autocorrelation method based on the YIN algorithm, where threshold processing avoids false fundamental frequency detection on unvoiced sounds. The amplitude of the sinusoidal waves is calculated in the time domain from the weighted energy of 300-600 Hz. In this case, since the first formant location corresponds to the first peak of the spectral envelope, we reconstruct the harmonic structure to avoid attenuation and overemphasis by increasing the weight when the first formant location is lower, and vice versa. Consequently, the subjective and objective evaluations show that the proposed method reduces the speech quality difference between the original speech signal and the bandwidth-extended speech signal.
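The harmonic structure used by sinusoidal synthesis can be sketched as follows. Given a fundamental frequency, the code keeps the harmonics that fall inside the missing 50-300 Hz band and sums the corresponding sine waves. This is a minimal illustration with assumptions: the fundamental frequency is taken as given (no YIN detection is performed), every harmonic uses the same fixed amplitude instead of the formant-weighted energy described above, and the function names are hypothetical.

```python
import math
from typing import List


def harmonic_frequencies(f0: float, low: float = 50.0,
                         high: float = 300.0) -> List[float]:
    """Harmonics k * f0 that lie in the band [low, high)."""
    max_k = int(high // f0)
    return [k * f0 for k in range(1, max_k + 1) if low <= k * f0 < high]


def synthesize_low_band(f0: float, amplitude: float, sample_rate: int,
                        num_samples: int) -> List[float]:
    """Sum of equal-amplitude sinusoids at the in-band harmonics of f0.

    The fixed amplitude is an assumption for illustration; it is not
    the weighting scheme of the proposed method.
    """
    freqs = harmonic_frequencies(f0)
    return [
        sum(amplitude * math.sin(2.0 * math.pi * f * n / sample_rate)
            for f in freqs)
        for n in range(num_samples)
    ]


if __name__ == "__main__":
    print(harmonic_frequencies(100.0))
    frame = synthesize_low_band(100.0, 0.1, 16000, 320)
    print(len(frame), max(abs(v) for v in frame))
```

For a fundamental of 100 Hz only the first two harmonics fall below 300 Hz, so the synthesized frame contains components at 100 Hz and 200 Hz.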
Xiao HONG Yuehong GAO Hongwen YANG
Computer networks are increasingly subjected to proliferating mobile demands, which makes guaranteeing the quality of network service a great challenge. For real-time systems, QoS performance bound analysis is often difficult because of the complex network topology and background traffic in modern networks. Network calculus, nevertheless, converts a complex non-linear network system into an analyzable linear system to accomplish more accurate delay bound analysis. The existing network environment contains complex network resource allocation schemes, and delay bound analysis is generally pessimistic; hence, it is essential to modify the analysis model to improve the bound accuracy. In this paper, the main research approach is to obtain measurement results from an actual network by building a measurement environment, and the corresponding theoretical results by network calculus. The measurement data and theoretical results are compared in order to clarify the bandwidth scheduling scheme. The measurement results and theoretical analysis results are verified and corrected in order to propose an accurate per-flow end-to-end delay bound analytic model for a large-scale scheduling network. On this basis, the instructional significance of the analysis results for engineering construction is discussed.
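As background on how network calculus produces a delay bound, the textbook case can be sketched as follows. A flow constrained by a token-bucket arrival curve (burst b, rate r) crosses servers that each offer a rate-latency service curve (rate R, latency T). Concatenating servers gives a rate-latency curve with the minimum rate and the summed latency, and the delay bound is T + b/R whenever r ≤ R. This is the standard result only; it does not model the background traffic or the scheduling scheme analyzed in the paper, and the class and function names are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RateLatency:
    """Rate-latency service curve beta(t) = rate * max(t - latency, 0)."""
    rate: float
    latency: float


def concatenate(servers: Iterable[RateLatency]) -> RateLatency:
    """End-to-end service curve of servers in tandem (min-plus convolution)."""
    servers = list(servers)
    return RateLatency(rate=min(s.rate for s in servers),
                       latency=sum(s.latency for s in servers))


def delay_bound(burst: float, arrival_rate: float,
                service: RateLatency) -> float:
    """Delay bound for a token-bucket flow alpha(t) = burst + arrival_rate * t."""
    if arrival_rate > service.rate:
        raise ValueError("arrival rate exceeds service rate: no finite bound")
    return service.latency + burst / service.rate


if __name__ == "__main__":
    # Hypothetical units: rates in Mbit/s, burst in Mbit, latency in seconds.
    path = concatenate([RateLatency(10.0, 0.002), RateLatency(5.0, 0.003)])
    print(path, delay_bound(burst=2.0, arrival_rate=1.0, service=path))
```

Computing the bound on the concatenated curve charges the burst only once, which is usually tighter than adding per-hop bounds.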
Hiroya YAMAMOTO Daichi KITAHARA Hiroki KURODA Akira HIRABAYASHI
This paper addresses single image super-resolution (SR) based on convolutional neural networks (CNNs). It is known that recovery of high-frequency components in output SR images of CNNs learned by the least square errors or least absolute errors is insufficient. To generate realistic high-frequency components, SR methods using generative adversarial networks (GANs), composed of one generator and one discriminator, are developed. However, when the generator tries to induce the discriminator's misjudgment, not only realistic high-frequency components but also some artifacts are generated, and objective indices such as PSNR decrease. To reduce the artifacts in the GAN-based SR methods, we consider the set of all SR images whose square errors between downscaling results and the input image are within a certain range, and propose to apply the metric projection onto this consistent set in the output layers of the generators. The proposed technique guarantees the consistency between output SR images and input images, and the generators with the proposed projection can generate high-frequency components with few artifacts while keeping low-frequency ones as appropriate for the known noise level. Numerical experiments show that the proposed technique reduces artifacts included in the original SR images of a GAN-based SR method while generating realistic high-frequency components with better PSNR values in both noise-free and noisy situations. Since the proposed technique can be integrated into various generators if the downscaling process is known, we can give the consistency to existing methods with the input images without degrading other SR performance.
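The idea of projecting an output onto a consistent set can be illustrated in one dimension. The sketch assumes that downscaling is block averaging with factor s and that the consistent set is {x : ||Dx − y||₂ ≤ ε}; under these assumptions the metric projection has a closed form, because the rows of the averaging operator D are orthogonal. The paper's actual downscaling operator, error range, and layer implementation are not given in the abstract, so the functions below are hypothetical illustrations rather than the proposed layer.

```python
import math
from typing import List


def downscale(x: List[float], s: int) -> List[float]:
    """Block averaging with factor s (len(x) must be a multiple of s)."""
    if len(x) % s != 0:
        raise ValueError("signal length must be a multiple of the scale factor")
    return [sum(x[i:i + s]) / s for i in range(0, len(x), s)]


def project_onto_consistent_set(x: List[float], y: List[float],
                                s: int, eps: float) -> List[float]:
    """Closest point to x whose downscaled version is within eps of y.

    The residual r = downscale(x) - y is shrunk to norm eps, and the
    correction for block i is spread evenly over the s samples of that block.
    """
    r = [d - yi for d, yi in zip(downscale(x, s), y)]
    norm = math.sqrt(sum(v * v for v in r))
    if norm <= eps:
        return list(x)  # already consistent with the input
    shrink = 1.0 - eps / norm
    return [xj - shrink * r[j // s] for j, xj in enumerate(x)]


if __name__ == "__main__":
    sr = [4.0, 4.0, 0.0, 0.0]   # hypothetical generator output
    lr = [1.0, 1.0]             # observed low-resolution input
    print(project_onto_consistent_set(sr, lr, s=2, eps=0.0))
```

Because the correction is constant within each block, the projection changes only block means and leaves the variation inside each block, that is, the generated high-frequency detail, untouched.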
Kosuke TODA Naomi KUZE Toshimitsu USHIO
To maintain blockchain-based services while ensuring their security, an important issue is how to set the mining reward so that the number of miners participating in mining increases. We propose a dynamical model of decision-making for miners using an evolutionary game approach and analyze the stability of the equilibrium points of the proposed model. The proposed model is described by a first-order differential equation. It is therefore simple, yet its theoretical analysis gives insight into the characteristics of the decision-making. Through the analysis of the equilibrium points, we show the transcritical bifurcations and hysteresis phenomena of the equilibrium points. We also design a controller that determines the mining reward based on the number of participating miners to stabilize the state where all miners participate in mining. Numerical simulation shows that there is a trade-off in the choice of the design parameters.
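A first-order evolutionary game model of this kind can be explored numerically with a toy replicator equation. The payoff below is an assumption invented for this sketch and is not the model of the paper: a miner who participates earns an equal share of the reward, reward / (num_miners · x), and pays a fixed cost, while a non-participant earns nothing. The replicator equation dx/dt = x(1 − x)(payoff difference) then simplifies to dx/dt = (1 − x)(reward / num_miners − cost · x), where x is the fraction of participating miners. All names and parameter values are hypothetical.

```python
def participation_derivative(x: float, reward: float,
                             num_miners: int, cost: float) -> float:
    """Toy replicator dynamics for the fraction x of participating miners.

    Assumed payoffs (not taken from the paper):
      participate: reward / (num_miners * x) - cost
      abstain:     0
    """
    return (1.0 - x) * (reward / num_miners - cost * x)


def simulate(x0: float, reward: float, num_miners: int, cost: float,
             dt: float = 0.01, steps: int = 10000) -> float:
    """Forward-Euler integration, with x clamped to the interval [0, 1]."""
    x = x0
    for _ in range(steps):
        x += dt * participation_derivative(x, reward, num_miners, cost)
        x = min(1.0, max(0.0, x))
    return x


if __name__ == "__main__":
    # Low reward: participation settles at an interior equilibrium.
    print(simulate(0.1, reward=50.0, num_miners=100, cost=1.0))
    # High reward: full participation becomes the stable equilibrium.
    print(simulate(0.1, reward=200.0, num_miners=100, cost=1.0))
```

In this toy model the interior equilibrium x = reward / (num_miners · cost) meets the full-participation equilibrium x = 1 when the reward equals num_miners · cost, and the two exchange stability there. The toy model does not reproduce the hysteresis reported for the proposed model.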