In industry, automatic speech recognition has become a competitive feature for embedded products with limited hardware resources. In this work, we propose a tiny end-to-end speech recognition model that is lightweight and easily deployable on edge platforms. First, instead of sophisticated network structures, such as recurrent neural networks, transformers, etc., the model we propose mainly uses convolutional neural networks as its backbone. This ensures that our model is supported by most software development kits for embedded devices. Second, we adopt the basic unit of MobileNet-v3, which performs well in computer vision tasks, and integrate the features of the hidden layer at different scales, thus compressing the number of parameters of the model to less than 1 M and achieving an accuracy greater than that of some traditional models. Third, in order to further reduce the CPU computation, we directly extract acoustic representations from 1-dimensional speech waveforms and use a self-supervised learning approach to encourage the convergence of the model. Finally, to cope with relatively weak hardware resources, we use a prefix beam search decoder that dynamically extends the search path with an optimized pruning strategy, together with an additional initialism language model that captures between-word probabilities in advance and thus avoids premature pruning of correct words. In our experiments, according to a number of evaluation categories, our end-to-end model outperformed several tiny speech recognition models used for embedded devices in related work.
For dichromats to receive the information represented in color images, it is important to study contrast improvement methods and quantitative evaluation indices of color conversion results. There is an index that evaluates the degree of contrast improvement, and in this index, the contrast for dichromacy caused by the lightness component is given importance. In addition, random sampling was introduced in the computation of this index. Although the validity of the index has been shown through comparison with a subjective evaluation, the following two points should be examined. First, whether the contrast for normal trichromacy caused by the lightness component should also be given importance. Second, the influence of random sampling should be examined in detail. In this paper, a new index is proposed and the above-mentioned points are examined. For the first point, the following is revealed through experiment. Consideration of the contrast for normal trichromacy caused by a lightness component that is the same as that for dichromacy may or may not result in a good outcome. The evaluation performance of the proposed index is equivalent to that of the previous index overall. It can be said that the proposed index is superior to the previous one in terms of the unity of evaluating contrast. For the second point, the computation time and the evaluation of significant digits are shown. In this paper, a sampling number such that the number of significant digits can be considered as three is used. In this case, the variation caused by random sampling is negligible compared with the range of the proposed index, whereas the computation time is about one-seventh of that when sampling is not adopted.
Sakyo HASHIMOTO Keigo TAKEUCHI
This letter simplifies and analyzes existing state evolution recursions for conjugate gradient. The proposed simplification reduces the complexity of solving the recursions from cubic order to square order in the total number of iterations. The simplified recursions are still catastrophically sensitive to numerical errors, so arbitrary-precision arithmetic is used for accurate evaluation of the recursions.
Intrinsic Josephson junctions (IJJs) in the high-Tc cuprate superconductors have several fascinating properties that are superior to those of the usual Josephson junctions obtained from conventional low-Tc superconductors: (1) very thin superconducting layers, (2) a strong interaction between junctions, since neighboring junctions are closely connected on an atomic scale, and (3) a clean interface between the superconducting and insulating layers, realized in a single crystal with little disorder. These unique properties of IJJs can enlarge the applicable areas of superconducting qubits, not only by increasing the qubit-operation temperature but also through novel applications of qubits, including macroscopic quantum states with internal degrees of freedom. I present a comprehensive review of the phase dynamics in current-biased IJJs and discuss the challenges of superconducting qubits utilizing IJJs.
Ken NAKAMURA Yuya OMORI Daisuke KOBAYASHI Koyo NITTA Kimikazu SANO Masayuki SATO Hiroe IWASAKI Hiroaki KOBAYASHI
This paper proposes an efficient reference image sharing method for the image-division parallel video encoding architecture. This method efficiently reduces the amount of data transfer by using pre-transfer with area prediction and on-demand transfer with a transfer management table. Experimental results show that the data transfer can be reduced to 19.8-35.3% of the conventional method on average without major degradation of coding performance. This makes it possible to reduce the required bandwidth of the inter-chip transfer interface by saving the amount of data transfer.
Toshiyuki MIYAMOTO Marika IZAWA
Event structures are a well-known modeling formalism for concurrent systems with causality and conflict relations. The flow event structure (FES) is a variant of event structures, which is a generalization of the prime event structure. In an FES, two events may be in conflict even though they are not syntactically in conflict; this is called a semantic conflict. The existence of semantic conflict in an FES motivates reducing conflict relations (i.e., conflict reduction) to obtain a simpler structure. In this paper, we study conflict reduction in acyclic FESs. A necessary and sufficient condition for conflict reduction is given; algorithms to compute semantic conflict, local configurations, and conflict reduction are proposed. In computational experiments, the proposed method showed a large reduction in computation time compared with the naive method.
Fengde JIA Jihong TAN Xiaochen LU Junhui QIAN
Short-range ambiguous clutter can seriously affect the performance of airborne radar target detection when detecting long-range targets. In this letter, a multiple-input-multiple-output (MIMO) array structure elevation filter (EF) is designed to suppress short-range clutter (SRC). The sidelobe level value in the short-range clutter region is taken as the objective function to construct the optimization problem and the optimal EF weight vector can be obtained by using the convex optimization tool. The simulation results show that the MIMO system can achieve better range ambiguous clutter suppression than the traditional phased array (PA) system.
Ryusuke IGARASHI Ryo NAKAGAWA Dan OKOCHI Yukio OGAWA Mianxiong DONG Kaoru OTA
Vehicles on the road are expected to connect continuously to the Internet at sufficiently high speeds, e.g., several Mbps or higher, to support multimedia applications. However, even when passing through a well-facilitated city area, Internet access can be unreliable and even disconnected if the travel speed is high. We therefore propose a network path selection technique to meet network throughput requirements. The proposed technique is based on the attractor selection model and enables vehicles to switch the path from a route connecting directly to a cellular network to a relay type through neighboring vehicles for Internet access. We also develop a mechanism that prevents frequent path switching when the performance of all available paths does not meet the requirements. We conduct field evaluations by platooning two vehicles in a real-world driving environment and confirm that the proposed technique maintains the required throughput of up to 7 Mbps on average. We also evaluate the proposed technique through extensive computer simulations of up to 6 vehicles in a platoon. The results show that increasing platoon length yields a greater improvement in throughput, and that the mechanism we developed decreases the rate of path switching by up to 25%.
Jing LIANG Ke LI Kunjie YU Caitong YUE Yaxin LI Hui SONG
The selection of mutation strategy greatly affects the performance of the differential evolution (DE) algorithm. For different types of optimization problems, different mutation strategies should be selected. How to choose a suitable mutation strategy for different problems is a challenging task. To deal with this challenge, this paper proposes a novel DE algorithm based on local fitness landscape, called FLIDE. In the proposed method, fitness landscape information is obtained to guide the selection of mutation operators. In this way, different problems can be solved with proper evolutionary mechanisms. Moreover, a population adjustment method is used to balance the search ability and population diversity. On the one hand, the diversity of the population in the early stage is enhanced with a relatively large population. On the other hand, the computational cost is reduced in the later stage with a relatively small population. The evolutionary information is utilized as much as possible to guide the search direction. The proposed method is compared with five popular algorithms on 30 test functions with different characteristics. Experimental results show that the proposed FLIDE is more effective on problems with high dimensions.
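To make "different mutation strategies" concrete, the following minimal sketch implements two standard DE mutation operators, DE/rand/1 and DE/best/1. It is not the FLIDE selection rule, which the abstract does not specify; the function names, the scale factor F = 0.5, and the sphere objective are assumptions chosen for illustration only.

```python
import random

def mutate_rand_1(pop, i, F, rng):
    """DE/rand/1: v = x_r1 + F * (x_r2 - x_r3), indices distinct and != i."""
    idx = [j for j in range(len(pop)) if j != i]
    r1, r2, r3 = rng.sample(idx, 3)
    return [a + F * (b - c) for a, b, c in zip(pop[r1], pop[r2], pop[r3])]

def mutate_best_1(pop, fitness, i, F, rng):
    """DE/best/1 (minimization): v = x_best + F * (x_r1 - x_r2)."""
    best = min(range(len(pop)), key=lambda j: fitness[j])
    idx = [j for j in range(len(pop)) if j != i]
    r1, r2 = rng.sample(idx, 2)
    return [a + F * (b - c) for a, b, c in zip(pop[best], pop[r1], pop[r2])]

# Hypothetical setup: 10 individuals in 3 dimensions, sphere objective.
rng = random.Random(0)
pop = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(10)]
fitness = [sum(v * v for v in x) for x in pop]

v_rand = mutate_rand_1(pop, 0, 0.5, rng)           # exploratory mutant
v_best = mutate_best_1(pop, fitness, 0, 0.5, rng)  # exploitative mutant
```

DE/rand/1 tends to favor exploration and DE/best/1 exploitation, which is one reason a landscape-aware choice between operators can matter.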
Shaorong HU Yuqi ZHANG Yuefei JIN Ziqi DOU
Bus bunching often occurs in public transit systems, resulting in a series of problems such as poor punctuality, long waiting times, and low service quality. In this paper, we explore the influence of the discrete distribution of traffic operation state on the dynamic evolution of bus bunching. Firstly, we use a self-organizing map (SOM) to find the threshold of bus bunching and analyze the factors that affect bus bunching based on GPS data of the No. 600 bus line in Xi'an. Then, taking the bus headway as the research index, we construct the bus bunching mechanism model. Finally, a simulation platform is built in MATLAB to examine the trend of headway when various influencing factors show different distribution states along the bus line. In terms of influencing factors, inter-vehicle speed, queuing time at intersections, and loading time at stations are shown to have a significant impact on headway between buses. In terms of the impact of the distribution of crowded road sections on headway, long-distance and concentrated crowded road sections lead to large intervals or bus bunching. When the traffic states along the bus line are randomly distributed among crowded, normal, and free, the headway may fluctuate in a large range, which may result in bus bunching, or fluctuate in a small range and remain relatively stable. The headway change curve is determined by the distribution length of each traffic state along the bus line. The research results can help to formulate improvement measures according to traffic operation state for equalizing bus headway and alleviating bus bunching.
In this paper, we propose a new method for non-dominant limb training. In this method, a learner training the non-dominant limb aims at a target motion generated by reversing his/her own dominant-limb motion. In addition, we designed and developed an interface for the new method that allows the feedback type to be selected. One is an interface using AR and sound, and the other is an interface using AR and vibration. We found that vibration feedback was effective for non-dominant hand training of pitching motion, while sound feedback was not as effective as vibration.
Longjiao ZHAO Yu WANG Jien KATO Yoshiharu ISHIKAWA
Convolutional Neural Networks (CNNs) have recently demonstrated outstanding performance in image retrieval tasks. Local convolutional features extracted by CNNs, in particular, show exceptional capability in discrimination. Recent research in this field has concentrated on pooling methods that incorporate local features into global features and assess the global similarity of two images. However, the pooling methods sacrifice the image's local region information and spatial relationships, which are precisely known as the keys to the robustness against occlusion and viewpoint changes. In this paper, instead of pooling methods, we propose an alternative method based on local similarity, determined by directly using local convolutional features. Specifically, we first define three forms of local similarity tensors (LSTs), which take into account information about local regions as well as spatial relationships between them. We then construct a similarity CNN model (SCNN) based on LSTs to assess the similarity between the query and gallery images. The ideal configuration of our method is sought through thorough experiments from three perspectives: local region size, local region content, and spatial relationships between local regions. The experimental results on a modified open dataset (where query images are limited to occluded ones) confirm that the proposed method outperforms the pooling methods because of robustness enhancement. Furthermore, testing on three public retrieval datasets shows that combining LSTs with conventional pooling methods achieves the best results.
Min Ho KWAK Youngwoo KIM Kangin LEE Jae Young CHOI
This letter proposes a novel lightweight deep learning object detector named LW-YOLOv4-tiny, which incorporates the convolution block feature addition module (CBFAM). The novelty of LW-YOLOv4-tiny is the use of channel-wise convolution and element-wise addition in the CBFAM instead of utilizing the concatenation of different feature maps. The model size and computation requirement are reduced by up to 16.9 Mbytes, 5.4 billion FLOPs (BFLOPS), and 11.3 FPS, which is 31.9%, 22.8%, and 30% smaller and faster than the most recent version of YOLOv4-tiny. From the MSCOCO2017 and PASCAL VOC2012 benchmarks, LW-YOLOv4-tiny achieved 40.2% and 69.3% mAP, respectively.
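The following arithmetic sketch illustrates why merging feature maps by element-wise addition rather than concatenation can shrink a model: concatenating two C-channel maps doubles the input channels of the next convolution, whereas addition keeps them at C. The channel count (128) and kernel size (3) are hypothetical and are not taken from the CBFAM design.

```python
def conv_params(c_in, c_out, k, bias=False):
    """Weight count of a standard k x k convolution layer."""
    return k * k * c_in * c_out + (c_out if bias else 0)

# Hypothetical sizes: two feature maps with 128 channels each,
# followed by a 3x3 convolution producing 128 channels.
c = 128
channels_after_concat = c + c  # concatenation stacks channels
channels_after_add = c         # element-wise addition keeps the channel count

params_concat = conv_params(channels_after_concat, 128, 3)
params_add = conv_params(channels_after_add, 128, 3)
print(params_concat, params_add)  # 294912 147456
```

Under these assumed sizes, the convolution that follows the merge needs half as many weights when addition is used.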
Noriko YUASA Masahiro YAMAGUCHI Kosuke SHIMA Takanobu OTSUKA
At manufacturing sites, mass customization is expanding along with the increasing variety of customer needs. This situation leads to complications in production planning for the factory manager, and production plans are likely to change suddenly at the manufacturing site. Because such sudden fluctuations in production often occur, it is particularly difficult to optimize the parts supply operations in these production processes. As a solution to such problems, Industry 4.0 has expanded to promote the use of digital technologies at manufacturing sites; however, these solutions can be expensive and time-consuming to introduce. Therefore, not all factory managers are favorable toward introducing digital technology. In this study, we propose a method to support parts supply operations that decreases work stagnation and fluctuation without relying on the experience of workers who supply parts in the various production processes. Furthermore, we constructed a system that is inexpensive and easy to introduce using both LPWA and BLE communications. The purpose of the system is to level out work in in-process logistics. In an experiment, the proposed method was introduced to a manufacturing site, and we compared how the workload of the site's workers changed. The experimental results show that the proposed method is effective for workload leveling in parts supply operations.
Reo ERIGUCHI Noboru KUNIHIRO Koji NUIDA
Ramp secret sharing is a variant of secret sharing which can achieve better information ratio than perfect schemes by allowing some partial information on a secret to leak out. Strongly secure ramp schemes can control the amount of leaked information on the components of a secret. In this paper, we reduce the construction of strongly secure ramp secret sharing for general access structures to a linear algebraic problem. As a result, we show that previous results on strongly secure network coding imply two linear transformation methods to make a given linear ramp scheme strongly secure. They are explicit or provide a deterministic algorithm while the previous methods which work for any linear ramp scheme are non-constructive. In addition, we present a novel application of strongly secure ramp schemes to symmetric PIR in a multi-user setting. Our solution is advantageous over those based on a non-strongly secure scheme in that it reduces the amount of communication between users and servers and also the amount of correlated randomness that servers generate in the setup.
Goki YASUDA Tota SUKO Manabu KOBAYASHI Toshiyasu MATSUSHIMA
In a practical classification problem, there are cases where incorrect labels are included in training data due to label noise. We introduce a classification method in the presence of label noise that idealizes a classification method based on the expectation-maximization (EM) algorithm, and evaluate its performance theoretically. Its performance is asymptotically evaluated by assessing the risk function defined as the Kullback-Leibler divergence between predictive distribution and true distribution. The result of this performance evaluation enables a theoretical evaluation of the most successful performance that the EM-based classification method may achieve.
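As a small illustration of the quantity underlying the risk function, the sketch below computes the Kullback-Leibler divergence between two discrete distributions. The direction D(p || q), with p taken as the true distribution and q as the predictive one, and the example probabilities are assumptions; the abstract does not state them.

```python
import math

def kl_divergence(p, q):
    """D(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions.

    Assumes q_i > 0 wherever p_i > 0; terms with p_i == 0 contribute 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

true_dist = [0.5, 0.5]        # hypothetical true class distribution
predictive_dist = [0.9, 0.1]  # hypothetical predictive distribution

risk = kl_divergence(true_dist, predictive_dist)
```

The divergence is zero when the predictive distribution matches the true one and grows as they diverge.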
Fanxin ZENG Xiping HE Zhenyu ZHANG Li YAN
Type-II Z-complementary pairs (ZCPs) play an important role in suppressing asynchronous interference in a wideband wireless communication system where the minimum interfering-signal delay is large. Based on binary Golay complementary pairs (BGCPs) and an interleaving technique, a new construction for producing Z-optimal Type-II even-length quadriphase ZCPs (EL-QZCPs) is presented, and the resultant pairs have new lengths of the form 2 × 2^α 10^β 26^γ (α, β, γ non-negative integers), which are not included in existing known Type-II EL-QZCPs.
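To get a feel for which lengths have the form 2 × 2^α 10^β 26^γ, the following sketch enumerates them up to a bound. It only lists numbers of that form; it does not construct the sequences, and the function name and the bound of 64 are illustrative assumptions.

```python
def lengths_of_form(max_len):
    """All N <= max_len with N = 2 * 2**a * 10**b * 26**g, a, b, g >= 0."""
    found = set()
    pow2 = 1
    while 2 * pow2 <= max_len:
        pow10 = 1
        while 2 * pow2 * pow10 <= max_len:
            pow26 = 1
            while 2 * pow2 * pow10 * pow26 <= max_len:
                found.add(2 * pow2 * pow10 * pow26)
                pow26 *= 26
            pow10 *= 10
        pow2 *= 2
    return sorted(found)

print(lengths_of_form(64))  # [2, 4, 8, 16, 20, 32, 40, 52, 64]
```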
Histogram equalization (HE) is one of the simplest and most effective methods for contrast enhancement. It can automatically define the gray-level mapping function based on the distribution of gray levels in the image. However, since HE does not use the spatial features of the input image, it fails to produce satisfactory results for a broad range of low-contrast images. The differential gray-level histogram (DH), which contains edge information of the input image, was defined, and differential gray-level histogram equalization (DHE) has been proposed. DHE shows better enhancement results than HE for many kinds of images. In this paper, we propose a generalized histogram equalization (GHE) that includes HE and DHE. In GHE, the histogram is created using a power of the differential gray-level, which includes the spatial features of the image. In HE, the mean brightness of the enhanced image cannot be controlled. In contrast, GHE can control the mean brightness of the enhanced image by changing the power; thus, the mean brightness of the input image can be perfectly preserved while maintaining good contrast enhancement.
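The idea of building the histogram from a power of the differential gray-level can be sketched as a weighted histogram equalization. This is one possible reading, not the authors' definition: the gradient magnitude from forward differences, the weighting d ** power, and the function name are all assumptions. With power = 0 every pixel has weight 1, which reduces to ordinary HE.

```python
def weighted_equalize(img, power=0.0, levels=256):
    """Equalize a gray image (list of rows of ints) with a weighted histogram.

    Each pixel votes for its gray level with weight d ** power, where d is an
    assumed differential gray-level (forward-difference gradient magnitude).
    """
    h, w = len(img), len(img[0])
    hist = [0.0] * levels
    for y in range(h):
        for x in range(w):
            dx = img[y][min(x + 1, w - 1)] - img[y][x]
            dy = img[min(y + 1, h - 1)][x] - img[y][x]
            d = (dx * dx + dy * dy) ** 0.5
            hist[img[y][x]] += d ** power  # 0.0 ** 0.0 == 1.0 in Python
    total = sum(hist)
    if total == 0:  # flat image with power > 0: nothing to equalize
        return [row[:] for row in img]
    cdf, acc = [], 0.0
    for v in hist:
        acc += v
        cdf.append(acc / total)
    # Map each gray level through the cumulative distribution.
    return [[round((levels - 1) * cdf[p]) for p in row] for row in img]

low_contrast = [[50, 50, 52], [50, 52, 54], [52, 54, 54]]
he_result = weighted_equalize(low_contrast, power=0.0)   # plain HE
dhe_like = weighted_equalize(low_contrast, power=1.0)    # edge-weighted variant
```

In this sketch, varying the power changes how strongly edge pixels shape the mapping, which is the knob the abstract describes for controlling mean brightness.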
Ann Jelyn TIEMPO Yong-Jin JEONG
Using third-party intellectual properties (3PIP) has become the norm in the IC design and development process to meet time-to-market demands while minimizing cost. However, this flow introduces threats such as hardware trojans, which may compromise the security and trustworthiness of the underlying hardware by disclosing confidential information, impeding normal execution, or even permanently damaging the system. Over the years, different detection methods have been explored, from simply identifying whether a circuit is infected with a hardware trojan using conventional methods to applying machine learning to identify which nets are most likely hardware trojans. However, the performance is not satisfactory in terms of maximizing the detection rate and minimizing the false positive rate. In this paper, a new hardware trojan detection approach is proposed in which the gate-level netlist is first segmented into regions before analyzing which nets might be hardware trojans. The segmentation process depends on the nets' connectivity, more specifically on each fanout point. Further analysis then computes the structural similarity of each segmented region and differentiates hardware trojan nets from normal nets. Experimental results show 100% detection of the hardware trojan nets inserted in each benchmark circuit and an overall average false positive rate of 1.38%, which results in a higher accuracy, with an average of 99.31%.
Baohang ZHANG Haichuan YANG Tao ZHENG Rong-Long WANG Shangce GAO
The equilibrium optimizer (EO) is a novel physics-based meta-heuristic optimization algorithm that is inspired by estimating dynamics and equilibrium states in controlled volume mass balance models. As a stochastic optimization algorithm, EO inevitably produces duplicated solutions, which wastes valuable evaluation opportunities. In addition, an excessive number of duplicated solutions can increase the risk of the algorithm getting trapped in local optima. In this paper, an improved EO algorithm with a bis-population-based non-revisiting (BNR) mechanism is proposed, namely BEO. It aims to eliminate duplicate solutions generated by the population during iterations, thus avoiding wasted evaluation opportunities. Furthermore, when a revisited solution is detected, the BNR mechanism activates its unique archive population learning mechanism to assist the algorithm in generating a high-quality solution using the excellent genes in the historical information, which not only improves the algorithm's population diversity but also helps the algorithm escape from local optima. Experimental findings with the IEEE CEC2017 benchmark demonstrate that the proposed BEO algorithm outperforms seven other representative meta-heuristic optimization techniques, including the original EO algorithm.
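The non-revisiting idea can be sketched with an archive of already evaluated solutions that is consulted before each new evaluation. This is a generic illustration, not the BNR mechanism itself: the class and function names are hypothetical, duplicates are detected by rounding coordinates, and a Gaussian perturbation stands in for the archive population learning step described above.

```python
import random

class VisitedArchive:
    """Stores visited solutions, keyed by coordinates rounded to `decimals`."""

    def __init__(self, decimals=6):
        self.decimals = decimals
        self.seen = set()

    def _key(self, x):
        return tuple(round(v, self.decimals) for v in x)

    def is_revisit(self, x):
        return self._key(x) in self.seen

    def add(self, x):
        self.seen.add(self._key(x))

def make_unvisited(x, archive, rng, step=0.1, max_tries=100):
    """Return x if it is new; otherwise perturb it until it is unvisited.

    The Gaussian perturbation is a stand-in assumption, not the BEO rule.
    """
    cand = list(x)
    tries = 0
    while archive.is_revisit(cand) and tries < max_tries:
        cand = [v + rng.gauss(0.0, step) for v in x]
        tries += 1
    archive.add(cand)
    return cand

archive = VisitedArchive()
rng = random.Random(0)
first = make_unvisited([1.0, 2.0], archive, rng)   # new, kept unchanged
second = make_unvisited([1.0, 2.0], archive, rng)  # duplicate, replaced
```

Only solutions that pass this check would be sent to the objective function, so no evaluation is spent on a point already in the archive.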